I’m attempting to run XGBoost 1.6.1 through its Python scikit-learn interface on an NVIDIA Tesla K80 GPU with CUDA Toolkit 11.3. The following example code throws a memory allocation error, even though the GPU has plenty of free memory. The underlying error reported in the traceback is cudaErrorNoKernelImageForDevice: no kernel image is available for execution on the device.
from xgboost import XGBClassifier
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
X, y = load_wine(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0
)
classifier = XGBClassifier(tree_method='gpu_hist')
model = classifier.fit(X_train, y_train)
And the traceback (paths shortened for privacy reasons):
Traceback (most recent call last):
  File "~/.config/JetBrains/PyCharmCE2022.1/scratches/scratch.py", line 11, in <module>
    model = classifier.fit(X_train, y_train)
  File "~/lib/python3.8/site-packages/xgboost/core.py", line 532, in inner_f
    return f(**kwargs)
  File "~/lib/python3.8/site-packages/xgboost/sklearn.py", line 1400, in fit
    self._Booster = train(
  File "~/lib/python3.8/site-packages/xgboost/core.py", line 532, in inner_f
    return f(**kwargs)
  File "~/lib/python3.8/site-packages/xgboost/training.py", line 181, in train
    bst.update(dtrain, i, obj)
  File "~/lib/python3.8/site-packages/xgboost/core.py", line 1733, in update
    _check_call(_LIB.XGBoosterUpdateOneIter(self.handle,
  File "~/lib/python3.8/site-packages/xgboost/core.py", line 203, in _check_call
    raise XGBoostError(py_str(_LIB.XGBGetLastError()))
xgboost.core.XGBoostError: [07:31:30] ../src/c_api/../data/../common/device_helpers.cuh:428: Memory allocation error on worker 0: [07:31:30] ../src/c_api/../data/../common/common.h:46: ../src/common/device_helpers.cuh: 447: cudaErrorNoKernelImageForDevice: no kernel image is available for execution on the device
Stack trace:
[bt] (0) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(+0x38f939) [0x7f363eb54939]
[bt] (1) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(+0x3938d3) [0x7f363eb588d3]
[bt] (2) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(+0x3d39ae) [0x7f363eb989ae]
[bt] (3) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(+0x3e7914) [0x7f363ebac914]
[bt] (4) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(+0x3e9790) [0x7f363ebae790]
[bt] (5) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(+0x57d309) [0x7f363ed42309]
[bt] (6) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(+0x20fca8) [0x7f363e9d4ca8]
[bt] (7) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(XGBoosterUpdateOneIter+0x68) [0x7f363e86e688]
[bt] (8) /lib/x86_64-linux-gnu/libffi.so.6(ffi_call_unix64+0x4c) [0x7f36797b08ee]
- Free memory: 11841830912
- Requested memory: 496
Stack trace:
[bt] (0) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(+0x38f939) [0x7f363eb54939]
[bt] (1) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(+0x393d4b) [0x7f363eb58d4b]
[bt] (2) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(+0x3d3ae9) [0x7f363eb98ae9]
[bt] (3) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(+0x3e7914) [0x7f363ebac914]
[bt] (4) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(+0x3e9790) [0x7f363ebae790]
[bt] (5) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(+0x57d309) [0x7f363ed42309]
[bt] (6) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(+0x20fca8) [0x7f363e9d4ca8]
[bt] (7) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(XGBoosterUpdateOneIter+0x68) [0x7f363e86e688]
[bt] (8) /lib/x86_64-linux-gnu/libffi.so.6(ffi_call_unix64+0x4c) [0x7f36797b08ee]
Process finished with exit code 1
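As you can see, the free memory (11841830912 bytes, roughly 11 GiB) far exceeds the requested memory (496 bytes), so this does not look like a genuine out-of-memory condition. Running the script under compute-sanitizer produces a very large amount of output, but I think the relevant part is extracted here: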
> cat sanitizer.out | grep Program
========= Program hit invalid device function (error 98) on CUDA API call to cudaFuncGetAttributes.
========= Program hit invalid device function (error 98) on CUDA API call to cudaGetLastError.
========= Program hit no kernel image is available for execution on the device (error 209) on CUDA API call to cudaLaunchKernel.
========= Program hit no kernel image is available for execution on the device (error 209) on CUDA API call to cudaPeekAtLastError.
========= Program hit no kernel image is available for execution on the device (error 209) on CUDA API call to cudaPeekAtLastError.
========= Program hit no kernel image is available for execution on the device (error 209) on CUDA API call to cudaGetLastError.
========= Program hit no kernel image is available for execution on the device (error 209) on CUDA API call to cudaLaunchKernel.
========= Program hit no kernel image is available for execution on the device (error 209) on CUDA API call to cudaGetLastError.
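To condense the sanitizer output further than the grep above, here is a minimal sketch that tallies the distinct errors by code and message. The function names `tally_sanitizer_errors` and `tally_file` are my own, and the regular expression assumes every relevant line has the same "Program hit ... (error N) on CUDA API call to ..." shape as the lines shown; other compute-sanitizer versions may format this differently.

```python
import re
from collections import Counter

# Line shape assumed from the grep output above:
# "========= Program hit <message> (error <code>) on CUDA API call to <api>."
PATTERN = re.compile(
    r"Program hit (.+?) \(error (\d+)\) on CUDA API call to (\w+)"
)


def tally_sanitizer_errors(lines):
    """Count occurrences of each (error code, message) pair."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            message, code, _api = match.groups()
            counts[(int(code), message)] += 1
    return counts


def tally_file(path="sanitizer.out"):
    """Tally errors in a saved compute-sanitizer log (file name as used above)."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return tally_sanitizer_errors(handle)
```

Calling `tally_file()` and iterating over `.most_common()` prints one row per distinct error. For the eight lines quoted above, this reduces to two occurrences of error 98 (invalid device function) and six of error 209 (no kernel image is available for execution on the device).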