How does XGBoost implement quantile regression?

Dear Community,

I want to use XGBoost for quantile prediction, i.e., forecasting not only a single value but also a confidence interval. I noticed that this can be done easily in LightGBM by setting the loss function to the quantile loss. Has anyone done this with XGBoost before? My guess is to specify the gradient/Hessian in a custom objective function, but I'm not sure of the right formulas to use here. Can someone help with this? Thanks!
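Something like the following rough sketch is what I have in mind, though I'm not sure it is right. The pinball loss is piecewise linear, so its true second derivative is zero almost everywhere; here a constant Hessian is substituted as a workaround (the `quantile_objective` helper and the toy data are just for illustration):

```python
import numpy as np
import xgboost as xgb

def quantile_objective(alpha):
    """Custom objective for the pinball (quantile) loss at level alpha."""
    def objective(preds, dtrain):
        labels = dtrain.get_label()
        errors = labels - preds
        # Gradient of the pinball loss w.r.t. the prediction:
        # -alpha where the model under-predicts, (1 - alpha) where it over-predicts.
        grad = np.where(errors > 0, -alpha, 1.0 - alpha)
        # The true Hessian is zero almost everywhere; use a constant
        # so the Newton step stays well defined.
        hess = np.ones_like(preds)
        return grad, hess
    return objective

# Toy data just to make the example runnable.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = X[:, 0] + rng.normal(size=500)

dtrain = xgb.DMatrix(X, label=y)
# Train a model that predicts the 90th percentile.
booster = xgb.train({"max_depth": 3, "eta": 0.1}, dtrain,
                    num_boost_round=100, obj=quantile_objective(0.9))
```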

Jackie

Take a look at https://towardsdatascience.com/regression-prediction-intervals-with-xgboost-428e0a018b

Hi @jackie930
Just wondering if you have found a solution for implementing quantile regression with XGBoost. It seems that the solution provided by @hcho3 is not quite reliable/stable (a concern shared by many users).
I also wonder why XGBoost does not have an approach similar to the one in CatBoost. I have tried both LightGBM and CatBoost: the former, although fast, performed badly in terms of prediction accuracy, while CatBoost seems promising in terms of stability.
It would be great if this could be implemented in XGBoost soon.

@maply007 Are you aware of references or papers that discuss how to perform quantile regression with stability? How do you define stability? Admittedly, I did not try the linked solution myself; I found it via a quick Google search. I’d like to read more about quantile regression myself and consider implementing it in XGBoost in the future.

P.S. I’ve recently helped implement survival (censored) regression where the label is of interval form: https://xgboost.readthedocs.io/en/latest/tutorials/aft_survival_analysis.html. I wonder if we can piggy-back on this work to implement quantile regression. Here is a detailed derivation of survival regression in XGBoost: https://www.overleaf.com/read/nwmzmzqktpjb
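For anyone curious, here is a minimal sketch of how the interval-form labels are supplied for AFT survival regression, following the tutorial linked above (the data here is made up purely for illustration):

```python
import numpy as np
import xgboost as xgb

# Toy data: each label is an interval [lower, upper]; right-censored
# observations get an upper bound of +inf.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y_lower = rng.uniform(1.0, 5.0, size=100)
y_upper = np.where(rng.random(100) < 0.3, np.inf,
                   y_lower + rng.uniform(0.0, 2.0, size=100))

dtrain = xgb.DMatrix(X)
dtrain.set_float_info("label_lower_bound", y_lower)
dtrain.set_float_info("label_upper_bound", y_upper)

params = {
    "objective": "survival:aft",
    "eval_metric": "aft-nloglik",
    "aft_loss_distribution": "normal",
    "aft_loss_distribution_scale": 1.0,
    "learning_rate": 0.05,
    "max_depth": 3,
}
booster = xgb.train(params, dtrain, num_boost_round=100)
```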

Hi @hcho3
I just looked at the article you mentioned, and it seems to me that the whole approach it suggests is based on an incorrect formula for the split gain (it uses the sum of the gradients instead of the square of the sum). I ran his notebook, and it seems to me that with the correct formula, his approximation of the gradient and the Hessian does worse than the original one.
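For reference, the split gain in XGBoost (from the original paper) involves the square of the sum of the gradients in each node, not the bare sum:

$$
\text{Gain} = \frac{1}{2}\left[\frac{\left(\sum_{i \in I_L} g_i\right)^2}{\sum_{i \in I_L} h_i + \lambda} + \frac{\left(\sum_{i \in I_R} g_i\right)^2}{\sum_{i \in I_R} h_i + \lambda} - \frac{\left(\sum_{i \in I} g_i\right)^2}{\sum_{i \in I} h_i + \lambda}\right] - \gamma
$$

where $g_i$ and $h_i$ are the first and second derivatives of the loss for instance $i$, $I_L$ and $I_R$ are the instance sets of the left and right children, $I = I_L \cup I_R$, and $\lambda$, $\gamma$ are the regularization parameters.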