Hi. I am planning my next home PC build and trying to figure out what hardware I need to optimize XGBoost training speed. My main use case currently is tuning parameters with extensive cross-validation on relatively small data subsets. Using a single GPU, training and prediction take only a few seconds per iteration of the cross-validation loop. I understand that XGBoost can now be distributed across several GPUs using Dask. My question is how much speed-up I can expect for my use case on a multi-GPU setup. For example, if the algorithm currently takes 2 seconds per validation fold, can I expect it to go down to 1 second with 2 GPUs, 0.5 seconds with 4 GPUs, etc.? Or would it scale more or less linearly on big dataframes only? Thank you.
That is the case: the scaling is roughly linear only on large dataframes. XGBoost currently uses multiple GPUs via data parallelism, partitioning the training data into subsets and distributing one subset to each GPU. On small datasets the per-GPU work is tiny, so scheduling and inter-GPU communication overhead dominates, and you should not expect per-fold time to halve with each doubling of GPUs.
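For reference, this is roughly what the Dask-based multi-GPU setup looks like. This is a minimal sketch, not a benchmark: it assumes the `dask-cuda` package, a machine with at least two NVIDIA GPUs, and the `xgboost.dask` interface; the dataset shapes and parameter values are illustrative only.

```python
# Sketch: multi-GPU XGBoost training with Dask (assumes dask-cuda + >=2 GPUs).
import numpy as np
import dask.array as da
import xgboost as xgb
from dask_cuda import LocalCUDACluster
from dask.distributed import Client

if __name__ == "__main__":
    # One Dask worker per visible GPU; data partitions are spread across them.
    cluster = LocalCUDACluster()
    client = Client(cluster)

    # Illustrative synthetic data, chunked so each GPU gets a partition.
    rng = np.random.default_rng(0)
    X = da.from_array(rng.standard_normal((100_000, 20)), chunks=(25_000, 20))
    y = da.from_array(rng.integers(0, 2, 100_000), chunks=(25_000,))

    dtrain = xgb.dask.DaskDMatrix(client, X, y)

    # GPU histogram tree method; each worker trains on its local shard and
    # gradients are synchronized across GPUs every boosting round.
    output = xgb.dask.train(
        client,
        {"tree_method": "gpu_hist", "objective": "binary:logistic"},
        dtrain,
        num_boost_round=50,
    )
    booster = output["booster"]
```

Because every boosting round requires a gradient/histogram synchronization across workers, the fixed per-round overhead is amortized only when each GPU's data shard is large; on a small cross-validation fold that overhead can easily exceed the compute saved, which is why near-linear scaling shows up on big dataframes rather than small ones.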