XGBoost on an 8x NVIDIA A100 GPU instance on AWS and GCP

auv · October 27, 2022, 8:07pm

Hello everyone,

First time posting here.

We have a working XGBoost regression model whose hyperparameters we are optimizing in a paid Google Colab notebook with GPU support. It works fine but takes a long time. We are considering moving to the fastest publicly available multi-GPU cloud instances and have decided to try the 8x A100 options on both AWS (p4d.24xlarge or p4de.24xlarge) and GCP. We currently use the Python interface.

Has anyone successfully used 8x A100 GPUs with XGBoost? If so, are there any caveats, concerns, or tips? Links to tutorials or papers would also be welcome. Many thanks in advance!

Warm regards,