No kernel image for Tesla K80 on xgboost 1.6.1

dave.kielpinski · June 29, 2022, 5:10am

I’m attempting to run XGBoost 1.6.1 through the Python scikit-learn interface on an NVIDIA Tesla K80 GPU using CUDA Toolkit 11.3. The following example code throws a memory allocation error, even though the GPU has plenty of memory available. The traceback indicates that there is no available kernel image.

from xgboost import XGBClassifier
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split

X, y = load_wine(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X,
                                                    y,
                                                    test_size=0.30,
                                                    random_state=0)
classifier = XGBClassifier(tree_method='gpu_hist')
model = classifier.fit(X_train, y_train)

And the traceback (relative paths given for privacy reasons):

Traceback (most recent call last):
  File "~/.config/JetBrains/PyCharmCE2022.1/scratches/scratch.py", line 11, in <module>
    model = classifier.fit(X_train, y_train)
  File "~/lib/python3.8/site-packages/xgboost/core.py", line 532, in inner_f
    return f(**kwargs)
  File "~/lib/python3.8/site-packages/xgboost/sklearn.py", line 1400, in fit
    self._Booster = train(
  File "~/lib/python3.8/site-packages/xgboost/core.py", line 532, in inner_f
    return f(**kwargs)
  File "~/lib/python3.8/site-packages/xgboost/training.py", line 181, in train
    bst.update(dtrain, i, obj)
  File "~/lib/python3.8/site-packages/xgboost/core.py", line 1733, in update
    _check_call(_LIB.XGBoosterUpdateOneIter(self.handle,
  File "~/lib/python3.8/site-packages/xgboost/core.py", line 203, in _check_call
    raise XGBoostError(py_str(_LIB.XGBGetLastError()))
xgboost.core.XGBoostError: [07:31:30] ../src/c_api/../data/../common/device_helpers.cuh:428: Memory allocation error on worker 0: [07:31:30] ../src/c_api/../data/../common/common.h:46: ../src/common/device_helpers.cuh: 447: cudaErrorNoKernelImageForDevice: no kernel image is available for execution on the device
Stack trace:
  [bt] (0) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(+0x38f939) [0x7f363eb54939]
  [bt] (1) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(+0x3938d3) [0x7f363eb588d3]
  [bt] (2) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(+0x3d39ae) [0x7f363eb989ae]
  [bt] (3) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(+0x3e7914) [0x7f363ebac914]
  [bt] (4) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(+0x3e9790) [0x7f363ebae790]
  [bt] (5) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(+0x57d309) [0x7f363ed42309]
  [bt] (6) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(+0x20fca8) [0x7f363e9d4ca8]
  [bt] (7) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(XGBoosterUpdateOneIter+0x68) [0x7f363e86e688]
  [bt] (8) /lib/x86_64-linux-gnu/libffi.so.6(ffi_call_unix64+0x4c) [0x7f36797b08ee]


- Free memory: 11841830912
- Requested memory: 496

Stack trace:
  [bt] (0) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(+0x38f939) [0x7f363eb54939]
  [bt] (1) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(+0x393d4b) [0x7f363eb58d4b]
  [bt] (2) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(+0x3d3ae9) [0x7f363eb98ae9]
  [bt] (3) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(+0x3e7914) [0x7f363ebac914]
  [bt] (4) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(+0x3e9790) [0x7f363ebae790]
  [bt] (5) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(+0x57d309) [0x7f363ed42309]
  [bt] (6) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(+0x20fca8) [0x7f363e9d4ca8]
  [bt] (7) ~/lib/python3.8/site-packages/xgboost/lib/libxgboost.so(XGBoosterUpdateOneIter+0x68) [0x7f363e86e688]
  [bt] (8) /lib/x86_64-linux-gnu/libffi.so.6(ffi_call_unix64+0x4c) [0x7f36797b08ee]



Process finished with exit code 1

As you can see, the free memory far exceeds the requested memory. Running under compute-sanitizer produces a large amount of output, but I think the relevant part is this:

> cat sanitizer.out | grep Program
========= Program hit invalid device function (error 98) on CUDA API call to cudaFuncGetAttributes.
========= Program hit invalid device function (error 98) on CUDA API call to cudaGetLastError.
========= Program hit no kernel image is available for execution on the device (error 209) on CUDA API call to cudaLaunchKernel.
========= Program hit no kernel image is available for execution on the device (error 209) on CUDA API call to cudaPeekAtLastError.
========= Program hit no kernel image is available for execution on the device (error 209) on CUDA API call to cudaPeekAtLastError.
========= Program hit no kernel image is available for execution on the device (error 209) on CUDA API call to cudaGetLastError.
========= Program hit no kernel image is available for execution on the device (error 209) on CUDA API call to cudaLaunchKernel.
========= Program hit no kernel image is available for execution on the device (error 209) on CUDA API call to cudaGetLastError.

hcho3 · June 29, 2022, 1:48pm

The Tesla K80 is a very old card. XGBoost requires Compute Capability 5.2 or higher. Please use a more recent card instead.

dave.kielpinski · June 29, 2022, 2:30pm

The docs for 1.6.1 say that Compute Capability >= 3.5 is required: https://xgboost.readthedocs.io/en/stable/gpu/index.html. The K80 supports Compute Capability 3.7.
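To make the mismatch concrete, here is a minimal sketch that compares the K80's compute capability against the two thresholds quoted in this thread: 3.5 from the docs and 5.2 from the previous reply. The helper names are hypothetical, and the thresholds are hard-coded from the posts above rather than read from XGBoost itself.

```python
from typing import Tuple

ComputeCapability = Tuple[int, int]

# Values quoted in this thread (assumptions, not queried from XGBoost or the driver).
DOCUMENTED_MINIMUM: ComputeCapability = (3, 5)  # from the 1.6.1 GPU docs
QUOTED_MINIMUM: ComputeCapability = (5, 2)      # from the reply above
TESLA_K80: ComputeCapability = (3, 7)


def parse_compute_capability(text: str) -> ComputeCapability:
    """Turn a string such as '3.7' into a (major, minor) tuple."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def meets_minimum(device: ComputeCapability, minimum: ComputeCapability) -> bool:
    """Tuples compare element by element, so (3, 7) >= (3, 5) but (3, 7) < (5, 2)."""
    return device >= minimum


if __name__ == "__main__":
    print("Meets documented minimum:", meets_minimum(TESLA_K80, DOCUMENTED_MINIMUM))
    print("Meets quoted minimum:", meets_minimum(TESLA_K80, QUOTED_MINIMUM))
```

Comparing (major, minor) tuples avoids the pitfalls of comparing version strings or floats directly.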

hcho3 · June 30, 2022, 1:26am

You will need to build and install XGBoost from source if you intend to use an old card. The usual installation method (pip) only supports Compute Capability 5.2 and up. Let me update the install doc to indicate this.
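As a rough illustration of what a source build targeting an older card might involve, the sketch below assembles a CMake configure command for a given compute capability. It assumes XGBoost's `USE_CUDA` and `GPU_COMPUTE_VER` CMake options as described in the build-from-source guide, and that the command is run from a `build` directory inside a recursive clone of the repository. The function name is hypothetical. Check the build guide for your XGBoost version, and confirm that your CUDA toolkit still supports the target architecture.

```python
from typing import List, Tuple


def cmake_configure_args(compute_capability: Tuple[int, int]) -> List[str]:
    """Build a CMake configure command for one GPU architecture.

    Assumes the USE_CUDA and GPU_COMPUTE_VER options from XGBoost's
    build-from-source documentation; verify them for your version.
    """
    major, minor = compute_capability
    return [
        "cmake",
        "..",
        "-DUSE_CUDA=ON",
        f"-DGPU_COMPUTE_VER={major}{minor}",  # e.g. (3, 7) -> 37 for the K80
    ]


if __name__ == "__main__":
    # Print the command rather than running it, so it can be reviewed first.
    print(" ".join(cmake_configure_args((3, 7))))
```

After configuring and compiling, the Python package would still need to be installed from the same source tree so that it picks up the newly built library.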

czumar · August 19, 2022, 5:43am

Hi @hcho3, is it possible to update the website? https://xgboost.readthedocs.io/en/stable/gpu/index.html still indicates that Compute Capability >= 3.5 is sufficient.