How to get the best iteration with early stopping


#1

Hi all,
I’m using XGBoostRegressor with two parameters set: num_early_stopping_rounds and maximize_evaluation_metrics. Is there any way to get the iteration number at which early stopping happens?
As shown in the source code, XGBoost.scala:

if (earlyStoppingRounds > 0) {
  boolean onTrack = judgeIfTrainingOnTrack(params, earlyStoppingRounds, metrics, iter);
  if (!onTrack) {
    String reversedDirection = getReversedDirection(params);
    Rabit.trackerPrint(String.format(
        "early stopping after %d %s rounds", earlyStoppingRounds, reversedDirection));
    break;
  }
}

The iter here could be the best number of rounds for training the model, but I have no idea how to extract this variable.
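To illustrate the difference between the round at which training stops and the best round, here is a minimal Python sketch of a generic patience-based early stopping rule. It is not the XGBoost4J implementation of judgeIfTrainingOnTrack; the function name, the patience argument, and the sample RMSE values are all made up for illustration.

```python
def simulate_early_stopping(scores, patience, maximize=False):
    """Return (stop_round, best_round) for per-round validation scores.

    Generic patience rule (an assumption, not XGBoost4J's exact logic):
    stop once `patience` rounds have passed without improving on the
    best score seen so far. Rounds are 0-based.
    """
    best_round = 0
    best_score = scores[0]
    for rnd in range(1, len(scores)):
        score = scores[rnd]
        improved = score > best_score if maximize else score < best_score
        if improved:
            best_round, best_score = rnd, score
        elif rnd - best_round >= patience:
            # Training stops here, `patience` rounds after the best one.
            return rnd, best_round
    # Ran out of rounds without triggering the stop condition.
    return len(scores) - 1, best_round


# Hypothetical validation RMSE per boosting round (lower is better).
rmse = [0.90, 0.70, 0.55, 0.50, 0.52, 0.51, 0.53, 0.56]
stop_round, best_round = simulate_early_stopping(rmse, patience=3)
print(stop_round, best_round)
```

Under a rule like this, the loop counter at the break is later than the best round, so the two need to be tracked separately.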

Thank you so much.


#2

This is the way I do it.

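from xgboost import XGBRegressor

MAX_ITERATION = 2000  # set this large enough; it does no harm because training stops early anyway
model = XGBRegressor(
    learning_rate=0.036,
    n_estimators=MAX_ITERATION,
    max_depth=4,
)

# early stopping needs a validation set; X_val and y_val are placeholder names for it
# (newer xgboost releases take early_stopping_rounds in the constructor instead of fit)
model.fit(X_train, y_train, eval_set=[(X_val, y_val)], early_stopping_rounds=50)
best_iter = model.best_iteration  # this should give you the number you need, assuming I understand your question correctly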


#4

Thank you for your reply. I understand how early stopping works; I just want to extract the best iteration and then use it as a parameter to train a new model. model.best_iteration is part of the Python API, which might be usable from PySpark, but I’m using Scala.
Anyway, thank you so much.


#5

I see. I didn’t realize you are using Scala. Yeah, I am using Python.

Can you see the model’s attributes after model.fit()? I haven’t used xgboost for Scala, but I imagine there is something like attr(model) that lets you check whether something similar to Python’s best_iteration can be extracted.


#6

What I’m doing now is calling extractParamMap after training a model, but this API returns all parameters except best_iteration. I went through the source code to see whether this value is recorded, but it seems it is not…
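If the best iteration itself is not stored, one possible workaround is to recover it from the per-round validation metric, assuming you can get that history as a plain list. How to obtain the history from the Scala API is not covered in this thread. The helper name and the sample values below are hypothetical, and the sketch is in Python only to show the idea.

```python
def best_round_from_history(history, maximize=False):
    """Return (best_round, best_score) from per-round validation scores.

    Rounds are 0-based; on ties the earliest round wins.
    """
    if not history:
        raise ValueError("history is empty")
    pick = max if maximize else min
    best_score = pick(history)
    # list.index returns the first occurrence, i.e. the earliest round.
    return history.index(best_score), best_score


# Hypothetical validation RMSE per round (lower is better).
validation_rmse = [0.90, 0.70, 0.55, 0.50, 0.52, 0.50]
best_round, best_score = best_round_from_history(validation_rmse)

# Rounds are counted from 0, so retraining needs best_round + 1 rounds.
num_round_for_retrain = best_round + 1
print(best_round, best_score, num_round_for_retrain)
```

For a metric where higher is better (matching maximize_evaluation_metrics set to true), pass maximize=True.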