Speed of random forests in XGBoost on multiclass classification

I am using random forests in XGBoost to train on a dataset with 12,000 unique labels and 20,000 examples, following the guidance at https://xgboost.readthedocs.io/en/latest/tutorials/rf.html. I set the following parameters and leave the others at their defaults:
params = {'learning_rate': 1, 'objective': 'multi:softprob', 'num_class': 12000,
          'eval_metric': 'merror', 'max_depth': 16, 'subsample': 0.8,
          'num_parallel_tree': 100, 'colsample_bynode': 0.8, 'nthread': 1}

However, training is extremely slow compared to a scikit-learn random forest. With max_depth=64, n_estimators=100, and everything else at defaults, scikit-learn can build the forest on a single CPU in about ten minutes, while XGBoost does not finish even after several hours.

I guess the slowness comes from the large number of labels. Why is XGBoost training so slow when there are this many classes?
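If that guess is right, a back-of-the-envelope count makes the gap plausible: with multi:softprob, XGBoost grows one tree per class per boosting round, and num_parallel_tree multiplies that again, whereas a scikit-learn forest handles all classes inside each of its n_estimators trees. Plugging in the numbers from my setup:

```python
# Trees XGBoost must grow in a single "random forest" round
# with multi:softprob: one tree per class, times num_parallel_tree.
num_class = 12_000
num_parallel_tree = 100
trees_per_round = num_class * num_parallel_tree   # 1,200,000 trees

# A scikit-learn RandomForestClassifier with n_estimators=100 grows
# 100 multi-output trees in total, regardless of the class count.
sklearn_trees = 100
ratio = trees_per_round // sklearn_trees          # 12,000x more trees
print(trees_per_round, ratio)
```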