Does `Booster.predict` use multiple threads by default?

Hi all.

I’m trying to figure out whether the default behavior of `Booster.predict` in the Python API is to use all available threads. I came across a number of posts on the default threading behavior specifically in the context of model training, but couldn’t find the same information for model prediction.

I did find this GitHub comment from 2016, but found the following statement counter-intuitive, given that predict was known to be thread unsafe as documented in this issue among a few others.

> the model would by default use all the available threads (as if nthread=0), provided that OMP_NUM_THREADS environment variable wasn’t set.

I tried digging into the C++ source code and even attempted to look into OpenMP but haven’t had much luck, as I have limited experience in either.

Thanks in advance!

Yes, the predict() function will use all threads by default. You can verify this by watching which CPU cores are utilized during prediction. To control the number of threads used, pass the `nthread` parameter to the Booster object:

```python
bst.set_param({'nthread': 2})
```
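To make the quoted default concrete, here is a toy sketch of the resolution order implied by the 2016 comment: an explicit `nthread` wins, then `OMP_NUM_THREADS` if set, otherwise all cores. This is an illustration of the described behavior, not XGBoost's actual internals; the function name is hypothetical.

```python
import os

def resolve_nthread(nthread=0):
    # Hypothetical helper mimicking the behavior described above,
    # where nthread=0 means "use the default".
    if nthread > 0:
        return nthread                       # explicit setting wins
    env = os.environ.get("OMP_NUM_THREADS")
    if env is not None:
        return int(env)                      # fall back to the OpenMP env var
    return os.cpu_count()                    # otherwise use all available cores

print(resolve_nthread(2))  # explicit nthread -> 2
```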

Thanks for your reply, Philip. I’m wondering why the default is to use all threads when predict was known to have been thread unsafe prior to v1.1.0 (link to relevant entry in the change log). Could you provide more context on the rationale? Thanks.

The lack of thread safety was due to the use of global state, which made it unsafe to call predict() on the same Booster object from multiple threads concurrently. It has nothing to do with whether a single predict() call uses multiple threads internally.
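To illustrate the distinction, here is a toy Python class (not XGBoost code; all names are hypothetical) where a shared scratch buffer plays the role of the global state: concurrent calls can clobber each other's buffer, while serializing the calls with a lock restores correctness.

```python
import threading

class Predictor:
    def __init__(self):
        self._buf = []                 # shared scratch buffer: the "global state"
        self._lock = threading.Lock()

    def predict_unsafe(self, xs):
        # Two threads interleaving here can overwrite each other's buffer.
        self._buf = []
        for x in xs:
            self._buf.append(x * 2)
        return list(self._buf)

    def predict_safe(self, xs):
        # Holding a lock serializes access to the shared buffer,
        # which is roughly what a caller had to do before the fix.
        with self._lock:
            return self.predict_unsafe(xs)
```

Each individual call may still fan work out over many threads internally; the unsafety is only about two calls sharing the same mutable state at the same time.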

FYI, the thread safety issue has been fixed in #6648 and will be part of the upcoming 1.4.0 release.