I was recently measuring the prediction time of XGBoost as part of a research project and noticed a strange behavior: no matter how many trees I use, I get roughly the same prediction times. I would expect the time to grow proportionally to the ensemble size. For instance, below are some representative numbers:
Note: all times are measured using gettimeofday() on UNIX, on a dataset of 50k examples with 512 features.
So, the questions are:
- Does XGBoost internally use parallel processing during prediction as well? If so, how can I force it to use a single core and a single thread (i.e., no parallelism at all) from the Python interface?
- Are there any “clever” speed-ups that XGBoost uses during prediction? For example, computing a tree’s prediction via a matrix-vector product instead of a recursive tree traversal. If the answer is long, could you refer me to papers where it is explained?