"No visible GPU is found, setting `gpu_id` to -1"

Hi,

I’m having trouble getting XGBoost to work with CUDA on my GPU (Quadro RTX 8000) on Debian Testing (Bullseye). When running XGBClassifier with the parameter tree_method="gpu_hist", the following error pops up:
.../xgboost/src/gbm/gbtree.cc:506: Check failed: common::AllVisibleGPUs() >= 1 (0 vs. 1) : No visible GPU is found for XGBoost.

I built XGBoost from source following the steps in the documentation. Before that, I set the environment variables for CUDA:

ln -s /usr/lib/nvidia-cuda-toolkit/ /usr/local/cuda
export PATH=/usr/lib/nvidia-cuda-toolkit/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/lib/nvidia-cuda-toolkit/libdevice${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

git clone --recursive https://github.com/dmlc/xgboost
cd xgboost
mkdir build
cd build
cmake .. -DUSE_CUDA=ON

This is the CUDA-related output of cmake, which shows the correct CUDA environment:
-- Configured CUDA host compiler: /usr/bin/c++
-- The CUDA compiler identification is NVIDIA 11.1.105
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- CMAKE_CUDA_ARCHITECTURES: 80-real;80-virtual;75-real;75-virtual;70-real;70-virtual;61-real;61-virtual;60-real;60-virtual;52-real;52-virtual;50-real;50-virtual;35-real;35-virtual;

After that I run make -j and then, from the “python-package” folder, install the package with pip install --user -e .

Finally, running the benchmark python3 tests/benchmark/benchmark_tree.py results in the same error as stated above.

I already tried the official package from pip and some others from Anaconda (including Rapids), but I still get the same error.

I am grateful for any help!
Thanks in advance

This is the output of nvcc --version:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Oct_12_20:09:46_PDT_2020
Cuda compilation tools, release 11.1, V11.1.105
Build cuda_11.1.TC455_06.29190527_0

GPU driver Version: 450.80.02
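
For anyone comparing versions on their own machine, the installed driver version can also be read from a script. This is only a minimal sketch, assuming nvidia-smi is on the PATH; the helper name query_gpus is made up for illustration, and an empty result just means nvidia-smi is missing or could not talk to the driver.

```python
import subprocess


def query_gpus():
    """Return a list of (gpu_name, driver_version) tuples reported by nvidia-smi.

    Returns an empty list if nvidia-smi is not installed or exits with an error
    (hypothetical helper, not part of XGBoost or any NVIDIA package).
    """
    cmd = [
        "nvidia-smi",
        "--query-gpu=name,driver_version",
        "--format=csv,noheader",
    ]
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, check=False)
    except (FileNotFoundError, PermissionError):
        return []  # nvidia-smi is not available on this machine
    if proc.returncode != 0:
        return []  # nvidia-smi ran but could not query the driver

    gpus = []
    for line in proc.stdout.splitlines():
        # Each line has the form "<name>, <driver_version>"
        parts = line.rsplit(",", 1)
        if len(parts) == 2:
            gpus.append((parts[0].strip(), parts[1].strip()))
    return gpus


for name, driver in query_gpus():
    print(f"{name}: driver {driver}")
```

The driver version printed here can then be checked against the minimum driver listed in NVIDIA's release notes for the installed CUDA toolkit.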

Are you able to use other programs that use GPUs, e.g. PyTorch?
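
If no other GPU framework is installed, one way to test this is to ask the CUDA driver library directly. The following is a rough sketch, assuming a Linux system where the driver library is named libcuda.so.1; the function name visible_cuda_devices is invented for this example. It calls the driver API functions cuInit and cuDeviceGetCount through ctypes, so it does not depend on XGBoost, PyTorch, or the CUDA toolkit.

```python
import ctypes


def visible_cuda_devices():
    """Return the number of devices the CUDA driver reports.

    Returns None if the driver library cannot be loaded or initialised
    (hypothetical helper for illustration only).
    """
    try:
        # Assumption: the driver library uses its usual Linux name.
        libcuda = ctypes.CDLL("libcuda.so.1")
    except OSError:
        return None  # driver library is not installed or not loadable

    # cuInit must be called before any other driver API function.
    # A return value of 0 means CUDA_SUCCESS.
    if libcuda.cuInit(0) != 0:
        return None

    count = ctypes.c_int(0)
    if libcuda.cuDeviceGetCount(ctypes.byref(count)) != 0:
        return None
    return count.value


n = visible_cuda_devices()
if n is None:
    print("CUDA driver could not be loaded or initialised")
else:
    print(f"CUDA driver reports {n} device(s)")
```

If this reports that the driver cannot be initialised, the problem most likely sits below XGBoost, in the driver installation itself, which would match several frameworks failing in the same way.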

Hi, thanks for the reply.
I’m getting similar CUDA errors with PyTorch and TensorFlow.

The problem might be with the NVIDIA and CUDA drivers from the Debian repository. I’m going to test my system with the drivers installed directly from NVIDIA and will report back!

Cheers

The problem was indeed the GPU driver from the Debian repository. Installing CUDA toolkit 11.2 from the NVIDIA website fixed the problem for me.
Cheers