GPU training slower than CPU with xgboost 1.4.0-SNAPSHOT

I installed pre-built xgboost for GPU from
I ran the demo to test it, and the result shows GPU training is slower than CPU training:

output of
GPU Training Time: 104.22758913040161 seconds

CPU Training Time: 80.01015758514404 seconds

My environment info:
Windows 10 Home
conda 4.9.2
python 3.8.5
xgboost 1.4.0-SNAPSHOT
scikit-learn 0.23.2

GeForce GTX 1660 Ti
CUDA Version: 11.1

Can you check if the GPU is being used? Also, did you try the 1.3.0 version of XGBoost?

Thanks for your reply.
I am sure my GPU is working; I checked it with both nvidia-smi and the Windows Task Manager.

Following your suggestion, I uninstalled 1.4 and installed 1.3, but it doesn't look any better:

GPU Training Time: 101.46787643432617 seconds

CPU Training Time: 76.1423556804657 seconds

XGBoost version I installed:

Dear all,
I have the same problem with an NVIDIA GTX 1070 on Ubuntu 20.04.

Using my CPU (AMD Ryzen 9 3900XT) it takes 52 seconds, while using the GPU it takes 75 seconds. I ran my code in Python as follows (the start of the call was cut off):
…, y_train,
eval_set=[(X_train, y_train), (X_test, y_test)],

I am using the latest stable version of XGBoost: 1.3.1

Do you have any idea about this issue?