I train XGBoost using about 40 CPU cores. It would be interesting to see whether I can speed up training by using a small subsample (=0.1) with sampling_method=gradient_based. Why is sampling_method=gradient_based only supported for gpu_hist and not hist?
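For reference, a minimal sketch of the configuration being asked about; the parameter names (tree_method, sampling_method, subsample) are real XGBoost parameters, while the objective and the commented-out training call are illustrative assumptions:

```python
# Gradient-based sampling is currently only accepted together with
# tree_method="gpu_hist"; combining it with "hist" raises an error.
params = {
    "tree_method": "gpu_hist",            # required for gradient_based
    "sampling_method": "gradient_based",  # sample rows by gradient magnitude
    "subsample": 0.1,                     # the small subsample rate mentioned above
    "objective": "binary:logistic",       # assumed objective, for illustration only
}

# Training would then look like (assuming a DMatrix `dtrain` exists):
# import xgboost as xgb
# booster = xgb.train(params, dtrain, num_boost_round=100)
```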
This is because gpu_hist and hist are independent implementations, and gradient-based sampling was first added to gpu_hist.
Are there plans to support it for hist in the future? I'm guessing it would theoretically afford training speedups similar to LightGBM's "goss" sampling.
I don’t think there is a concrete plan for this yet.