Hi. I am cross-validating the performance of XGBoost across several datasets (same features, different time periods), using `StratifiedKFold` and `early_stopping_rounds` to tune `n_estimators` separately for each dataset. Interestingly, I noticed that I get a better average AUC across datasets when I set `early_stopping_rounds` to a low number (e.g., 5-10). When I increase `early_stopping_rounds` to, say, 50, the spread in AUC between datasets shrinks considerably, but the average AUC is invariably lower. I noticed something similar when increasing the `max_bin` parameter while training with `tree_method='gpu_hist'`. Is it possible that I am overfitting when increasing these parameters (`early_stopping_rounds` and `max_bin`)? Or is what I am seeing likely purely stochastic? Thank you!