How does xgboost handle GPU training in a gridsearch?

I am a bit confused about xgboost's GPU support, since I mostly find old topics, so a few questions have accumulated:

  1. What is the n_gpu parameter? I cannot find it in the docs.

  2. Is it possible to enable multi-GPU support in scikit-learn without Dask? (I have a lot of different algorithms in a gridsearch and cannot do a "special" Dask grid search just for xgboost.)

  3. How does xgboost behave in a grid search where n_jobs > 1? What setting do you recommend for n_jobs?

My understanding is that setting n_jobs to, say, 4 will create 4 concurrent training processes on my GPU when gpu_hist is enabled. However, I found that doing so massively slows down training, even though all my processes together only take up about 50% of the GPU RAM. I do not know what the optimal setting is supposed to be. At n_jobs = 1 the GPU is at 40% processing load, so it feels like some potential is wasted here. The funny thing is that with catboost on GPU only n_jobs = 1 is possible in a grid search, which instantly fills the memory of the whole GPU, but it also only uses 10-30% of the processing power. I use a Pascal Titan X for computing.

The n_gpu parameter is no longer used. Currently, you need to use Dask to train on multiple GPUs with XGBoost. See https://github.com/dmlc/xgboost/blob/master/demo/dask/sklearn_gpu_training.py for an example.
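
For reference, here is a minimal sketch of what the Dask route looks like, loosely following the linked demo. It assumes dask, dask_cuda, and a CUDA-enabled XGBoost build are installed; the random arrays are just placeholder data:

```python
from dask_cuda import LocalCUDACluster
from dask.distributed import Client
from dask import array as da
import xgboost as xgb


def train(client):
    # Placeholder data as Dask arrays; replace with your own data loading.
    X = da.random.random((100_000, 20), chunks=(10_000, 20))
    y = da.random.randint(0, 2, size=100_000, chunks=10_000)

    clf = xgb.dask.DaskXGBClassifier(tree_method="gpu_hist")
    clf.client = client  # attach the Dask client so training runs on the cluster
    clf.fit(X, y)
    return clf


if __name__ == "__main__":
    # LocalCUDACluster starts one worker per visible GPU,
    # so each worker trains on its own device.
    with LocalCUDACluster() as cluster:
        with Client(cluster) as client:
            train(client)
```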

When you set n_jobs > 1 with gpu_hist in a grid search, you start multiple training jobs that all run on the same GPU. You won't get much performance benefit, since those jobs compete for the same device. I would recommend setting n_jobs=1 for now.
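
For the single-GPU scikit-learn case, a minimal sketch of that setup might look like the following, assuming a CUDA-enabled XGBoost build; the parameter grid is only illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

X, y = make_classification(n_samples=10_000, n_features=20, random_state=0)

# Every candidate fit uses the single GPU via gpu_hist.
clf = XGBClassifier(tree_method="gpu_hist")

param_grid = {
    "max_depth": [4, 6, 8],
    "learning_rate": [0.05, 0.1],
}

# n_jobs=1 so the grid search does not launch several concurrent
# training jobs that would compete for the same GPU.
search = GridSearchCV(clf, param_grid, cv=3, n_jobs=1)
search.fit(X, y)
print(search.best_params_)
```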