Can you provide some guidance on how to tune the gamma parameter? In the API docs, its range is [0, inf). Does it depend on the training sample size?
Gamma parameter tuning
A good place to start is to plot the distribution of loss changes over all splits in all trees. Use get_dump() with with_stats=True: https://xgboost.readthedocs.io/en/latest/python/python_api.html#xgboost.Booster.get_dump
Set gamma=0 for this step so that all splits are allowed. Look at the typical size of the loss changes and adjust gamma accordingly.
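For what it's worth, here is a rough sketch of how the loss changes could be pulled out of the dump. It assumes the default text dump format, where each split line carries a gain=... field when with_stats=True is set. The helper name collect_gains and the sample dump strings are made up for illustration; with a real model you would pass in bst.get_dump(with_stats=True), where bst is a Booster trained with gamma=0.

```python
import re
import statistics

# Matches the "gain=<number>" field found on split lines of a text dump
# produced with with_stats=True. Leaf lines have no gain field.
GAIN_RE = re.compile(r"gain=([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)")


def collect_gains(tree_dumps):
    """Return the loss change (gain) of every split in a list of text tree dumps."""
    gains = []
    for tree in tree_dumps:
        gains.extend(float(m) for m in GAIN_RE.findall(tree))
    return gains


# Hand-written sample in the text dump layout (an assumption for this sketch).
# In practice: tree_dumps = bst.get_dump(with_stats=True)
sample_dumps = [
    "0:[f0<0.5] yes=1,no=2,missing=1,gain=120.5,cover=100\n"
    "\t1:[f1<1.5] yes=3,no=4,missing=3,gain=8.25,cover=60\n"
    "\t\t3:leaf=0.1,cover=30\n"
    "\t\t4:leaf=-0.2,cover=30\n"
    "\t2:leaf=0.3,cover=40\n",
    "0:[f2<2.0] yes=1,no=2,missing=1,gain=0.75,cover=100\n"
    "\t1:leaf=0.05,cover=50\n"
    "\t2:leaf=-0.05,cover=50\n",
]

gains = collect_gains(sample_dumps)
print(sorted(gains))             # all split gains, smallest first
print(statistics.median(gains))  # one summary of the "typical" gain
```

From there you can histogram the gains (for example with matplotlib) and pick gamma relative to where most of the small, noisy splits sit.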
In the R package, is there a function like get_dump?
You should use xgb.dump.