Parallel training of XGBoost classifiers


I am currently training N separate classifiers in series using a for loop. I was wondering if parallelizing the N separate trainings would reduce runtime. If so, how should I go about parallelizing?

I’ve tried parallelizing with joblib’s `Parallel()` / `delayed()`, but it seems to have no effect on runtime.


If each training session already uses all of your system threads, there’s no benefit to parallelizing further; oversubscribing the cores might even slow things down. For GPU training you can experiment to see whether the pipeline is fully utilized; that has to be analyzed case by case.

That is super helpful. Thank you so much!