AUC metric doesn't start with 0.5 in iteration 0

daso94 · July 15, 2021, 2:53pm

I’m using the XGBoost sklearn API for a classification problem. Unfortunately I can’t share the data, but I’ve seen this in other contexts already (e.g. here on GitHub).
The initial prediction is set by base_score, which is a constant value of 0.5 by default. This should always result in an AUC of 0.5, so why is the AUC at iteration 0 different from 0.5 for both the validation and training sets?

Here’s my configuration:

import xgboost as xgb

params = {'min_child_weight': 5,
          'gamma': 0.5,
          'subsample': 1.0,
          'colsample_bytree': 1.0,
          'max_depth': 5,
          'learning_rate': 0.01,
          'scale_pos_weight': 5,
          'reg_lambda': 10}

# n_iter (number of boosting rounds) and the train/test splits are defined elsewhere
model = xgb.XGBClassifier(n_estimators=n_iter, random_state=42, **params)

# Report AUC on both the training and validation sets every 20 rounds
model.fit(X=X_train, y=y_train,
          eval_set=[(X_train, y_train), (X_test, y_test)],
          eval_metric=['auc'],
          verbose=20)

hcho3 · July 15, 2021, 4:50pm

“Iteration 0” here refers to the model after the first decision tree has been added, not to the constant base_score prediction, so an AUC above 0.5 is expected.
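
To make the point concrete, here is a minimal pure-Python sketch. It does not use XGBoost itself: the toy data, the `auc` helper, and the `stump` function (a hypothetical depth-1 "tree" with made-up leaf values) are all assumptions for illustration. It shows that a constant score gives an AUC of exactly 0.5, while a single split already ranks some positives above negatives.

```python
def auc(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs where the positive
    has the higher score; tied pairs count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    total = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                total += 1.0
            elif p == n:
                total += 0.5
    return total / (len(pos) * len(neg))


# Toy data, made up for illustration: one feature and a binary label.
x = [0.1, 0.4, 0.35, 0.8, 0.9, 0.7, 0.2, 0.6]
y = [0, 0, 1, 1, 1, 0, 0, 1]

# Before any tree: every row gets the same constant score (like base_score).
constant_scores = [0.5] * len(y)
auc_constant = auc(y, constant_scores)


def stump(xi, threshold=0.5, learning_rate=0.01):
    """Hypothetical single split: shift the constant score down or up by a
    small, learning-rate-scaled amount depending on the side of the split."""
    return 0.5 + learning_rate * (-1.0 if xi < threshold else 1.0)


# After one "tree": rows fall into two leaves with different scores.
stump_scores = [stump(xi) for xi in x]
auc_one_tree = auc(y, stump_scores)

print("constant prediction AUC:", auc_constant)
print("after one split AUC:   ", auc_one_tree)
```

Note that the small learning rate does not keep the AUC near 0.5: AUC depends only on how the rows are ranked, not on how far the scores move away from the constant.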