GPU Exact Greedy algorithm

Hi Community,

I wanted to know why support for the GPU exact greedy algorithm was removed in version 1.0.0. The relevant pull request: https://github.com/dmlc/xgboost/pull/4527. Is it because it can’t support larger datasets?

Another question: I am testing the exact greedy algorithm in 0.82 (on a single NVIDIA Tesla K80 GPU). Is there some dependence between the number of features and the dataset size? For example, with a dataset of 28 features I am able to run exact greedy GPU model building up to a dataset of about 4.5 GB, but with another dataset of 113 features I can only go up to about 600 MB. Is it the case that, since each thread block on the GPU works on a separate feature, there is this trade-off between the number of features and the dataset size?

Yes, the exact algorithm had issues with memory usage. Using quantile bins as a replacement for floating-point values lets us apply smart compression techniques to fit more data into the limited GPU memory. Integers are easier to compress, and with a low number of quantile bins we can use narrow integer types.
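If it helps, here is a minimal sketch of switching to the histogram-based GPU method from the Python package. `tree_method='gpu_hist'` and `max_bin` are the parameter names used around the 0.9/1.0 releases; the data here is synthetic and just for illustration:

```python
import numpy as np
import xgboost as xgb

# Synthetic data purely for illustration.
X = np.random.rand(10_000, 28).astype(np.float32)
y = np.random.randint(2, size=10_000)
dtrain = xgb.DMatrix(X, label=y)

# 'gpu_hist' replaces raw floating-point feature values with quantile bin
# indices; 'max_bin' caps the number of bins per feature, so each value can
# be stored as a small integer instead of a 4-byte float.
params = {
    "tree_method": "gpu_hist",
    "max_bin": 256,  # 256 bins fit in a single byte per value
    "objective": "binary:logistic",
}
booster = xgb.train(params, dtrain, num_boost_round=50)
```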

As for memory consumption, it is proportional to both the number of features (columns) and the number of data points (rows). So if you have more features, you’ll fit fewer rows.
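To make that proportionality concrete, here is a rough back-of-the-envelope sketch, not the actual allocation logic; the bytes-per-entry figure is an assumption chosen only to illustrate why more columns means fewer rows for the same GPU memory budget:

```python
def rough_gpu_memory_gb(n_rows, n_cols, bytes_per_entry=8):
    """Back-of-the-envelope estimate: one entry per (row, feature) pair.
    bytes_per_entry is an assumed figure (e.g. a 4-byte value plus a
    4-byte row index), not the real internal layout."""
    return n_rows * n_cols * bytes_per_entry / 1024**3

# With more columns, the same memory budget is exhausted at fewer rows:
print(rough_gpu_memory_gb(20_000_000, 28))   # ~4.2 GB
print(rough_gpu_memory_gb(5_000_000, 113))   # ~4.2 GB
```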