I have a few binary features which are very important for my learning task, along with more than 20 continuous features. When I look at the feature importances, the binary features do not make it into the top-k. On debugging, I found that xgboost splits on the continuous features far more often than on the binary ones.
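For what it's worth, I noticed the ranking depends heavily on `importance_type`: `'weight'` counts how many splits use a feature (which structurally favours continuous features with many candidate thresholds), while `'gain'` / `'total_gain'` measure loss reduction. A toy illustration with made-up per-split gains (not real model output):

```python
# Hypothetical per-split gains recorded for two features:
# the continuous feature is split on often with small gains,
# the binary feature is split on once with a large gain.
splits = {
    "continuous_f": [0.3, 0.2, 0.25, 0.15, 0.1],
    "binary_f": [1.4],
}

# importance_type='weight': number of splits that use the feature
weight = {f: len(g) for f, g in splits.items()}
# importance_type='total_gain': summed loss reduction over all splits
total_gain = {f: sum(g) for f, g in splits.items()}
# importance_type='gain': average loss reduction per split
gain = {f: sum(g) / len(g) for f, g in splits.items()}

print(max(weight, key=weight.get))          # → continuous_f (wins by split count)
print(max(total_gain, key=total_gain.get))  # → binary_f (wins by total gain)
print(max(gain, key=gain.get))              # → binary_f (wins by average gain)
```

So part of my question is whether split frequency is even the right signal here, or whether gain-based importance already tells the true story.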
In this video, the presenter claims that the treeExtra repo (from Amazon) handles this by normalising the split gain by the split's entropy when evaluating candidate splits.
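My understanding of the normalisation the presenter describes is something like the classic C4.5 "gain ratio": divide the information gain by the entropy of the split itself (the intrinsic value), which penalises splits that shatter the data into many small pieces. A minimal sketch of that idea, my own code and not from treeExtra or xgboost:

```python
import math

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of labels."""
    n = len(labels)
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def gain_and_ratio(labels, groups):
    """Information gain of a partition of `labels` into `groups`,
    and the C4.5-style gain ratio (gain / entropy of the partition)."""
    n = len(labels)
    parent = entropy(labels)
    children = sum(len(g) / n * entropy(g) for g in groups)
    gain = parent - children
    # Intrinsic value: entropy of the group-size distribution itself.
    split_info = entropy([i for i, g in enumerate(groups) for _ in g])
    return gain, (gain / split_info if split_info > 0 else float("inf"))

# A clean binary split and a 4-way split achieve the same gain,
# but the 4-way split's ratio is halved by its higher split entropy.
print(gain_and_ratio([0, 0, 1, 1], [[0, 0], [1, 1]]))      # → (1.0, 1.0)
print(gain_and_ratio([0, 0, 1, 1], [[0], [0], [1], [1]]))  # → (1.0, 0.5)
```

If that is indeed what treeExtra does, the effect would be to stop high-cardinality features from winning splits purely by having many thresholds to try.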
I would like to know whether this is already part of xgboost. If not, where in the xgboost repo should I make these changes to help the binary features rank higher? Thank you.