Multiple Evaluation Metrics in xgboost (Python, native)

mgloria · January 22, 2019, 5:01pm

I am starting to work with xgboost, and I have read in the Python Package Introduction to xgboost that it is possible to specify multiple eval metrics like this:

param['eval_metric'] = ['auc', 'ams@0']

However I do not understand why this is useful, since later on when it comes to the ‘Early Stopping’ section it says:

Note that if you specify more than one evaluation metric, the last one in param['eval_metric'] is used for early stopping.

I understand the idea of xgboost trying to optimize for one objective metric, but I struggle to see how it can optimize (simultaneously?) for two different ones.

Also related to the topic: when tuning the parameters, I have seen that there are two places to specify the evaluation metric. Do they have to be the same? How would I pass a custom function to both of them?

params = {
    # Parameters that we are going to tune.
    'max_depth': 6,
    # Other parameters
    'objective': 'binary:logistic',
    'eval_metric': 'auc',
    'silent': 1
}

cv_results = xgb.cv(
    params,
    dtrain,
    num_boost_round=n_rounds,
    seed=seed,
    nfold=n_folds,
    metrics={'auc'},
    early_stopping_rounds=10
)

See here for an example of how to pass a custom metric to xgboost.train.

thvasilo · January 23, 2019, 11:55am

The training algorithm optimizes only a single objective: the loss function set by the objective parameter.

The eval_metric parameter determines the metrics that will be used to evaluate the model at each iteration, not to guide optimization.

They are only reported; AFAIK the only way they influence training is through early stopping, which watches the last metric listed.
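
As a rough illustration of "reported but not optimized", here is a small library-free sketch of early-stopping bookkeeping. It is a simplified model, not xgboost's actual implementation: the function name early_stopping_round, the patience argument, and the scores are all made up for the example. Every metric is recorded, but only the last one in the list decides where to stop.

```python
def early_stopping_round(history, metric_names, patience, maximize):
    """Return the best round according to the LAST metric in metric_names.

    history maps each metric name to its per-round validation scores.
    All metrics are kept for reporting; only the last one is consulted.
    (Hypothetical helper for illustration, not part of xgboost.)
    """
    watched = history[metric_names[-1]]
    best_round, best_score = 0, watched[0]
    for i, score in enumerate(watched[1:], start=1):
        improved = score > best_score if maximize else score < best_score
        if improved:
            best_round, best_score = i, score
        elif i - best_round >= patience:
            break  # no improvement for `patience` rounds, so stop here
    return best_round


# Made-up validation scores, for illustration only.
history = {
    "auc":     [0.70, 0.74, 0.77, 0.79, 0.80, 0.81],
    "logloss": [0.60, 0.55, 0.53, 0.54, 0.56, 0.58],
}

# logloss is listed last, so it alone decides the stopping point.
best_by_logloss = early_stopping_round(
    history, ["auc", "logloss"], patience=2, maximize=False
)

# Swapping the order makes auc the deciding metric instead.
best_by_auc = early_stopping_round(
    history, ["logloss", "auc"], patience=2, maximize=True
)
```

With these numbers the two orderings pick different rounds, which is why the order of the metrics in the list matters even though neither metric changes how the trees are fitted.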

For the example you gave, 'eval_metric':'auc' in the params dict has the meaning I described above.
When you additionally pass metrics={'auc'} to the xgb.cv call, those are the metrics that will be reported during the CV process. See the docs for more details.
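
On the custom function part of the question, a minimal sketch of what such a metric could look like is below. It assumes the usual custom-metric shape of a callable taking (preds, data) and returning a (name, value) pair, and that preds are probabilities, as with binary:logistic. The names error_at_threshold, FakeDMatrix, and dvalid are hypothetical; FakeDMatrix only stands in for xgb.DMatrix so the snippet runs without xgboost installed.

```python
def error_at_threshold(preds, dmatrix, threshold=0.3):
    """Fraction of rows misclassified when cutting probabilities at `threshold`.

    Follows the (preds, data) -> (name, value) shape used for custom metrics.
    Assumes `preds` are probabilities and labels are 0/1.
    """
    labels = dmatrix.get_label()
    wrong = sum(
        1 for p, y in zip(preds, labels) if (p > threshold) != (y == 1)
    )
    return f"error@{threshold}", wrong / len(labels)


class FakeDMatrix:
    """Hypothetical stand-in exposing only get_label(), like xgb.DMatrix does."""

    def __init__(self, labels):
        self._labels = labels

    def get_label(self):
        return self._labels


# Quick check on made-up predictions and labels.
name, value = error_at_threshold([0.9, 0.2, 0.4, 0.1], FakeDMatrix([1, 0, 0, 1]))

# Not executed here. With xgboost installed, older releases take the same
# callable through `feval` in both places (dvalid is a hypothetical
# validation DMatrix):
#   xgb.train(params, dtrain, evals=[(dvalid, "valid")], feval=error_at_threshold)
#   xgb.cv(params, dtrain, nfold=n_folds, feval=error_at_threshold)
```

As far as I know, newer releases call this argument custom_metric rather than feval, so check the docs for the version you have installed.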

As for optimizing on two metrics at the same time, you can take a look at this scikit-learn example.