Feature importance based on the optimal number of trees?


Let’s say we fit a model with early stopping on a validation set and find that best_ntree_limit is 1,000, with early_stopping_rounds set to 500. The model object therefore has 1,500 trees encoded.

We would like to get feature importances back from this model, but only for the first 1,000 trees (the optimal model) and not for the overfit model with all 1,500 trees. Is that possible in either the Python or the R API without having to calculate them ourselves?



XGBoost only saves the last model, not the best one (best ntree). So we have to train it again with exactly 1,000 trees.


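In R, you can do it through xgb.model.dt.tree by limiting it to the first trees (the n_first_tree argument in older releases; newer releases take a trees argument instead), and then do the aggregation yourself.
You can check the source code of xgb.importance, xgb.dump, and xgb.model.dt.tree.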
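
For the "do the aggregation yourself" step, here is a minimal Python sketch of the idea, not the library's own implementation. It assumes the trees have already been dumped to a table with one row per node and the columns Tree, Feature, Gain and Cover, with leaf rows marked as "Leaf" (in Python, Booster.trees_to_dataframe() should give a table of roughly that shape; R's xgb.model.dt.tree calls the gain column Quality). The function name importance_first_n_trees and the toy rows are made up for illustration.

```python
from collections import defaultdict


def importance_first_n_trees(rows, n_trees):
    """Aggregate per-feature gain, cover and split frequency over
    trees 0 .. n_trees - 1 and normalise each measure to sum to 1.

    `rows` is an iterable of dicts with keys Tree, Feature, Gain, Cover
    (a hypothetical layout mirroring a per-node tree dump).
    """
    gain = defaultdict(float)
    cover = defaultdict(float)
    freq = defaultdict(int)

    for row in rows:
        # Skip trees past the cutoff and leaf rows, which have no split feature.
        if row["Tree"] >= n_trees or row["Feature"] == "Leaf":
            continue
        feature = row["Feature"]
        gain[feature] += row["Gain"]
        cover[feature] += row["Cover"]
        freq[feature] += 1

    if not freq:
        return {}

    # Fall back to 1.0 so an all-zero total does not divide by zero.
    total_gain = sum(gain.values()) or 1.0
    total_cover = sum(cover.values()) or 1.0
    total_freq = sum(freq.values())

    return {
        feature: {
            "Gain": gain[feature] / total_gain,
            "Cover": cover[feature] / total_cover,
            "Frequency": freq[feature] / total_freq,
        }
        for feature in freq
    }


# Toy dump with three single-split trees; tree index is zero-based.
rows = [
    {"Tree": 0, "Feature": "age", "Gain": 10.0, "Cover": 100.0},
    {"Tree": 0, "Feature": "Leaf", "Gain": 0.1, "Cover": 60.0},
    {"Tree": 0, "Feature": "Leaf", "Gain": -0.1, "Cover": 40.0},
    {"Tree": 1, "Feature": "income", "Gain": 30.0, "Cover": 100.0},
    {"Tree": 1, "Feature": "Leaf", "Gain": 0.2, "Cover": 50.0},
    {"Tree": 1, "Feature": "Leaf", "Gain": -0.2, "Cover": 50.0},
    {"Tree": 2, "Feature": "age", "Gain": 60.0, "Cover": 50.0},
]

# Importance restricted to the first two trees versus all three.
importance = importance_first_n_trees(rows, n_trees=2)
importance_all = importance_first_n_trees(rows, n_trees=3)
```

With the real model you would pass the node table and n_trees=1000, so the 500 trees added after the best iteration never enter the sums.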