Parallel training of XGBoost classifiers

Hello,

I am currently training N separate classifiers in series using a for loop. I was wondering whether parallelizing the N trainings would reduce runtime, and if so, how I should go about it.

I’ve tried parallelizing using joblib’s `Parallel()` / `delayed()`, but there seems to be no effect on runtime.
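
Roughly, my attempt looks like this (a minimal sketch; the dataset, N, and model parameters here are placeholders):

```python
import numpy as np
from joblib import Parallel, delayed
from xgboost import XGBClassifier

# Placeholder data: N binary classification tasks sharing the same features.
rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 20))
label_sets = [rng.integers(0, 2, size=10_000) for _ in range(8)]  # N = 8

def fit_one(y):
    clf = XGBClassifier(n_estimators=100)
    clf.fit(X, y)
    return clf

# Train the N classifiers in parallel instead of in a for loop.
models = Parallel(n_jobs=8)(delayed(fit_one)(y) for y in label_sets)
```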

Thanks!

If each training session is already using all of your system threads, there’s no need to parallelize further; adding another layer of parallelism might even slow things down. For GPU training you can experiment to see whether the pipeline is fully utilized, but that has to be analyzed case by case.
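
For example, if you do want process-level parallelism on CPU, you could pin each model to a single thread so the joblib workers don’t oversubscribe the cores. This is a sketch with placeholder data; whether it beats serial multi-threaded training depends on your workload:

```python
import numpy as np
from joblib import Parallel, delayed
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 20))
label_sets = [rng.integers(0, 2, size=10_000) for _ in range(8)]  # placeholder tasks

def fit_one(y):
    # n_jobs=1 keeps each model single-threaded, so the worker processes
    # below don't all compete for the same cores.
    clf = XGBClassifier(n_estimators=100, n_jobs=1)
    clf.fit(X, y)
    return clf

# One worker per model; the total thread count stays near the core count.
models = Parallel(n_jobs=8)(delayed(fit_one)(y) for y in label_sets)
```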

That is super helpful. Thank you so much!