Is prediction time independent of the number of trees?

I was recently measuring the prediction time of XGBoost as part of a research project and noticed strange behavior: no matter how many trees I use, the prediction times are roughly the same. I would expect the time to grow proportionally with the ensemble size. For instance, here are some representative numbers:


    n_estimators    prediction time (ms)
       1            0.491
      10            0.588
     100            0.760
    1000            0.799
Note: all times were measured with gettimeofday() on UNIX, on a dataset of 50k samples with 512 features.
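
For concreteness, a measurement of this kind can be reproduced along the following lines (a minimal sketch rather than my exact code: time.perf_counter() stands in for gettimeofday(), and the synthetic data and training parameters are illustrative):

    import time
    import numpy as np
    import xgboost as xgb

    # Synthetic stand-in for the 50k-sample, 512-feature dataset
    rng = np.random.default_rng(0)
    X = rng.standard_normal((50_000, 512)).astype(np.float32)
    y = rng.standard_normal(50_000)
    dtrain = xgb.DMatrix(X, label=y)

    for n_trees in (1, 10, 100, 1000):
        booster = xgb.train({"max_depth": 6}, dtrain, num_boost_round=n_trees)
        start = time.perf_counter()
        booster.predict(dtrain)
        elapsed_ms = (time.perf_counter() - start) * 1_000
        print(f"{n_trees:>4} trees: {elapsed_ms:.3f} ms")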


So, the questions are:

  • Does XGBoost internally use parallel processing during prediction as well? If so, how can I force it to use a single core and a single thread (i.e., no parallelism at all) from the Python interface?
  • Are there any “clever” speed-ups used by XGBoost during prediction? For example, is a tree’s prediction computed as a matrix-vector product instead of by recursive tree traversal? If this calls for a long answer, could you refer me to papers where it is explained?

Answers:

  • Yes, by default XGBoost uses all available CPU cores for prediction as well. You can either set the nthread=1 hyperparameter or set the environment variable OMP_NUM_THREADS=1 (see the first sketch after this list).
  • No, XGBoost computes predictions by ordinary tree traversal, not by a matrix-vector product (a conceptual sketch follows below).
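
A minimal sketch of forcing single-threaded execution from the Python interface (the data and training parameters below are placeholders; nthread and OMP_NUM_THREADS are the actual settings):

    import os
    os.environ["OMP_NUM_THREADS"] = "1"  # must be set before xgboost is imported

    import numpy as np
    import xgboost as xgb

    X = np.random.rand(1_000, 16).astype(np.float32)
    y = np.random.rand(1_000)
    dtrain = xgb.DMatrix(X, label=y)

    # nthread=1 restricts both training and prediction to a single thread
    booster = xgb.train({"max_depth": 4, "nthread": 1}, dtrain, num_boost_round=100)
    preds = booster.predict(dtrain)

With a single thread, prediction time should scale much more visibly with n_estimators, which suggests the flat numbers above come from parallelism plus fixed per-call overhead.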
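
And, conceptually, per-tree prediction by traversal looks like the following (a simplified illustration, not XGBoost's actual internal data structures):

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Node:
        # Simplified node: either a leaf holding a value, or an internal
        # node comparing one feature against a threshold.
        is_leaf: bool
        value: float = 0.0
        feature: int = 0
        threshold: float = 0.0
        left: Optional["Node"] = None
        right: Optional["Node"] = None

    def predict_tree(node: Node, x) -> float:
        # Descend from the root until a leaf is reached.
        while not node.is_leaf:
            node = node.left if x[node.feature] < node.threshold else node.right
        return node.value

    def predict_ensemble(trees, x) -> float:
        # The ensemble output is the sum of leaf values over all trees,
        # which is why cost is expected to grow linearly with tree count.
        return sum(predict_tree(t, x) for t in trees)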