Hi. I am trying to use Google Colab GPUs to speed up some of my work with XGBoost. By default, Colab machines come with Python 3.7 and XGBoost 0.90. Running `!pip install --upgrade xgboost` installs version 1.4. However, when I run `booster.predict(dmat, pred_contribs=True)` in a loop (I am computing SHAP values for different models on the same dataset), it runs for the first 10 iterations or so and then crashes the Colab kernel.
I also run `booster.set_param({'predictor': 'gpu_predictor'})` before calling `booster.predict(..., pred_contribs=True)`.
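A minimal sketch of the loop described above, not the exact code from my notebook. The helper names `shap_for_models` and `demo`, the synthetic data, and the training parameters are placeholders chosen for illustration. The `predictor` argument is there so the same loop can be rerun with `'cpu_predictor'`, which might help tell whether the crash is specific to the GPU code path.

```python
def shap_for_models(boosters, dmat, predictor="gpu_predictor"):
    """Return one SHAP contribution array per booster, in input order."""
    results = []
    for i, booster in enumerate(boosters):
        # The same two calls as in the post. `predictor` is switchable so the
        # loop can be repeated with "cpu_predictor" for comparison.
        booster.set_param({"predictor": predictor})
        contribs = booster.predict(dmat, pred_contribs=True)
        # Flush progress so the last completed iteration is visible
        # even if the kernel dies on the next one.
        print(f"model {i}: shape {getattr(contribs, 'shape', None)}", flush=True)
        results.append(contribs)
    return results


def demo(n_models=20):
    """Hypothetical reproduction on synthetic data (assumed sizes and params).

    Requires numpy and an xgboost build with GPU support.
    """
    import numpy as np
    import xgboost as xgb

    print("xgboost", xgb.__version__)
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 10))
    y = (X[:, 0] + rng.normal(size=1000) > 0).astype(int)
    dmat = xgb.DMatrix(X, label=y)

    # Assumed training setup: small GPU-trained binary classifiers.
    params = {"tree_method": "gpu_hist", "objective": "binary:logistic", "max_depth": 4}
    boosters = [
        xgb.train({**params, "seed": seed}, dmat, num_boost_round=20)
        for seed in range(n_models)
    ]
    return shap_for_models(boosters, dmat)
```

Calling `demo()` in a Colab cell with a GPU runtime runs the whole loop. Calling `shap_for_models(boosters, dmat, predictor="cpu_predictor")` on the same models gives a CPU baseline to compare against.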
I don’t think the crash is related to resource consumption as RAM and GPU memory loads look low. Also, my dataset and model are pretty small and xgboost predicts the SHAP values blazingly fast until it crashes.
I suspect this is some compatibility issue on the Colab end, but was wondering if there may be things I can try to resolve it, as I don’t have a GPU at home. Any advice would be greatly appreciated. Thanks!
UPDATE:
Here is the crash log from the Colab machine:
| Timestamp | Level | Message |
|---|---|---|
| Jul 8, 2021, 12:20:35 AM | WARNING | WARNING:root:kernel 1556ff02-ab27-4cf4-9743-414d48877806 restarted |
| Jul 8, 2021, 12:20:35 AM | INFO | KernelRestarter: restarting kernel (1/5), keep random ports |
| Jul 8, 2021, 12:20:32 AM | WARNING | what(): device free failed: an illegal memory access was encountered |
| Jul 8, 2021, 12:20:32 AM | WARNING | terminate called after throwing an instance of 'thrust::system::system_error' |
This is happening on a Tesla P100 card running Ubuntu 18.04 LTS.