GPU training slower than CPU with xgboost 1.4.0-SNAPSHOT

I installed the pre-built GPU-enabled xgboost from https://xgboost.readthedocs.io/en/latest/build.html
and ran the demo cover_type.py to test it. The result is that GPU training is slower than CPU training:

Output of cover_type.py:
GPU Training Time: 104.22758913040161 seconds

CPU Training Time: 80.01015758514404 seconds
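
For reference, cover_type.py essentially times the same training run twice, once with tree_method='gpu_hist' and once with 'hist'. A simplified sketch of that comparison (not the exact demo script; labels are shifted to start at 0 here) looks like this:

import time
import xgboost as xgb
from sklearn.datasets import fetch_covtype
from sklearn.model_selection import train_test_split

# Forest Covertype dataset, as used by the demo
cov = fetch_covtype()
X_train, X_test, y_train, y_test = train_test_split(
    cov.data, cov.target - 1, test_size=0.25, random_state=42)

dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)

params = {"objective": "multi:softmax", "num_class": 7}
num_round = 3000  # the demo uses many rounds so the timing difference is visible

for tree_method in ("gpu_hist", "hist"):
    params["tree_method"] = tree_method
    start = time.time()
    xgb.train(params, dtrain, num_round,
              evals=[(dtest, "test")], verbose_eval=False)
    print(f"{tree_method} training time: {time.time() - start:.2f} seconds")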

My environment info:
Windows 10 Home
conda 4.9.2
python 3.8.5
xgboost 1.4.0-SNAPSHOT
scikit-learn 0.23.2

GeForce GTX 1660 Ti
CUDA Version: 11.1

Can you check if the GPU is being used? Also, did you try the 1.3.0 version of XGBoost?
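
For example, one quick way to confirm that XGBoost can actually use the GPU is to train a tiny model with tree_method='gpu_hist'; if no usable CUDA device is visible, the call should fail with an error. This is just an illustrative check, not part of the demo:

import numpy as np
import xgboost as xgb

# Tiny synthetic problem, just enough to exercise the GPU code path
X = np.random.rand(1000, 20)
y = np.random.randint(0, 2, size=1000)
dtrain = xgb.DMatrix(X, label=y)

# 'gpu_hist' requires a visible CUDA device; an exception here means
# XGBoost cannot see the GPU at all
xgb.train({"tree_method": "gpu_hist", "objective": "binary:logistic"},
          dtrain, num_boost_round=10)
print("gpu_hist ran, so XGBoost can see the GPU")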

Thanks for your reply.
I am sure my GPU is working; I checked it with both nvidia-smi and the Windows Task Manager.

Following your suggestion, I uninstalled 1.4 and installed 1.3, but it doesn't look any better:

GPU Training Time: 101.46787643432617 seconds

CPU Training Time: 76.1423556804657 seconds

The xgboost version I installed:
xgboost-1.3.0_SNAPSHOT+47b86180f6db59f240c6ae86f5cba0d002826574-py3-none-win_amd64.whl

Dear all,
I have the same problem with an NVIDIA GTX 1070 on Ubuntu 20.04.

Using my CPU (AMD Ryzen 9 3900XT) training takes 52 seconds, while using the GPU it takes 75 seconds. I ran my code in Python as follows:

random_search_gpu.fit(X_train, y_train,
                      early_stopping_rounds=10,
                      eval_set=[(X_train, y_train), (X_test, y_test)],
                      eval_metric='mlogloss')

I am using the latest stable version of XGBoost, 1.3.1.
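
For context, the GPU search is set up roughly like the sketch below; the dataset and the parameter grid here are placeholders rather than my actual values, but the fit() call is the same. RandomizedSearchCV forwards the extra keyword arguments to XGBClassifier.fit for each candidate:

from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from xgboost import XGBClassifier

# Synthetic multi-class data stands in for the real dataset here
X, y = make_classification(n_samples=5000, n_features=20,
                           n_informative=10, n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Hypothetical search space; the actual grid is not shown in the post
param_distributions = {
    "max_depth": [4, 6, 8],
    "learning_rate": [0.05, 0.1, 0.3],
    "subsample": [0.6, 0.8, 1.0],
}

# GPU histogram algorithm for the estimator being tuned
estimator = XGBClassifier(tree_method="gpu_hist", n_estimators=500)

random_search_gpu = RandomizedSearchCV(estimator, param_distributions,
                                       n_iter=10, cv=3)

# Extra keyword arguments are passed through to XGBClassifier.fit
random_search_gpu.fit(X_train, y_train,
                      early_stopping_rounds=10,
                      eval_set=[(X_train, y_train), (X_test, y_test)],
                      eval_metric='mlogloss')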

Do you have any idea about this issue?

Thanks.