Hist 10x slower than exact

XGBoost version: 0.90
System: linux
Cores: 40
How I’m running train: in Jupyter Notebook

I’m training with nthreads=40 on a dataset of 12M rows and 48 features. In “exact” mode, boosting takes about 0.2 minutes per tree. With the same hyperparameters, changing only “tree_method” to “hist”, boosting takes about 2 minutes per tree (10x slower).
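For reference, here is a minimal sketch of how the two runs can be compared. It is not my actual training script: the synthetic data, the `reg:squarederror` objective, `max_depth=6`, and the helper names `make_params`, `seconds_per_round`, and `compare_tree_methods` are all placeholders I made up for illustration. The only thing it is meant to show is that the two parameter sets differ in `tree_method` alone, and that the per-tree time can be measured for several thread counts.

```python
import time


def make_params(tree_method, nthread=40):
    """Build a params dict; only tree_method differs between the two runs."""
    return {
        "tree_method": tree_method,        # "exact" or "hist"
        "nthread": nthread,                # native-API spelling of the thread-count parameter
        "objective": "reg:squarederror",   # placeholder, not my real objective
        "max_depth": 6,                    # placeholder, not my real setting
    }


def seconds_per_round(train_fn, num_rounds):
    """Time train_fn(num_rounds) and return the average seconds per boosting round."""
    start = time.perf_counter()
    train_fn(num_rounds)
    return (time.perf_counter() - start) / num_rounds


def compare_tree_methods(n_rows=100_000, n_features=48, num_rounds=10):
    """Time 'exact' and 'hist' on synthetic data for a few thread counts."""
    import numpy as np
    import xgboost as xgb

    # Synthetic stand-in for the real 12M x 48 dataset.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((n_rows, n_features))
    y = rng.standard_normal(n_rows)
    dtrain = xgb.DMatrix(X, label=y)

    for method in ("exact", "hist"):
        for nthread in (40, 20, 8):
            params = make_params(method, nthread)
            t = seconds_per_round(
                lambda n: xgb.train(params, dtrain, num_boost_round=n),
                num_rounds,
            )
            print(f"{method:>5} nthread={nthread:>2}: {t:.3f} s/round")
```

Calling `compare_tree_methods()` in a notebook cell prints one timing line per combination. The measured time for “hist” also includes its one-off binning step, so with few rounds the per-round figure will be somewhat inflated.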

When I inspect the CPU usage, both “exact” and “hist” use all 40 cores. The CPU usage of “exact” oscillates between 20% and 100%, while the CPU usage of “hist” stays saturated at around 100%.

What could be causing this? Could it be that I’m not allocating enough memory to Jupyter Notebook?

Is there any additional information I should provide?