Why is sampling_method=gradient_based only supported for gpu_hist?


#1

I train XGBoost on about 40 CPU cores. It would be interesting to see whether I can speed up training by using a small subsample (subsample=0.1) together with sampling_method=gradient_based. Why is sampling_method=gradient_based only supported for gpu_hist and not for hist?
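To make the setup concrete, here is a rough sketch of the two parameter sets being compared. The variable names are just for illustration, and nthread=40 is an assumption based on the core count mentioned above.

```python
# Parameter dicts of the kind passed to xgboost.train; values follow the question above.
# GPU configuration: gradient-based sampling with a small subsample.
gpu_params = {
    "tree_method": "gpu_hist",
    "sampling_method": "gradient_based",
    "subsample": 0.1,
}

# CPU configuration: per this thread, hist does not accept gradient_based,
# so the default uniform sampling is the only option here.
cpu_params = {
    "tree_method": "hist",
    "sampling_method": "uniform",
    "subsample": 0.1,
    "nthread": 40,  # assumption: one thread per available core
}
```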


#2

This is because gpu_hist and hist are independent implementations, and gradient-based sampling was first added to gpu_hist.
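For anyone wondering what the option does: the XGBoost parameter docs describe gradient_based as picking each training row with probability proportional to the regularized absolute gradient, sqrt(g^2 + lambda * h^2). Below is a minimal pure-Python sketch of that idea, not the actual gpu_hist code. The function name, the independent per-row draws, and the 1/p reweighting are simplifying assumptions of mine.

```python
import math
import random


def gradient_based_sample(grads, hess, subsample=0.1, reg_lambda=1.0, seed=0):
    """Return {row_index: weight} for the rows kept by a simplified sampler.

    Hypothetical helper for illustration only; not part of the XGBoost API.
    """
    rng = random.Random(seed)
    # Score each row by the regularized gradient magnitude sqrt(g^2 + lambda * h^2).
    scores = [math.sqrt(g * g + reg_lambda * h * h) for g, h in zip(grads, hess)]
    total = sum(scores)
    if total == 0.0:
        return {}  # every row has zero gradient, so nothing is worth sampling
    target = subsample * len(scores)  # expected number of rows to keep
    kept = {}
    for i, score in enumerate(scores):
        # Keep probability is proportional to the score, clipped to at most 1.
        p = min(1.0, target * score / total)
        if p > 0.0 and rng.random() < p:
            # Weight by 1/p so the sampled gradient sums stay unbiased.
            kept[i] = 1.0 / p
    return kept
```

Rows with large gradients are almost always kept, while rows with small gradients are mostly dropped and upweighted when they do survive. That is why a small subsample can still work reasonably well with this method.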


#3

Are there plans to support it for hist in the future? I’m guessing that, in theory, it would give training speedups similar to those of “goss” in LightGBM?


#4

I don’t think there is a concrete plan for this yet.