I’m using the XGBoost sklearn API for a classification problem. Unfortunately I can’t share the data, but I’ve already seen this behaviour in other contexts (e.g. here on GitHub).
The initial prediction is set by base_score, which is a constant 0.5 by default. A constant prediction should always give an AUC of 0.5, so why is the AUC at iteration 0 not 0.5, and why does it differ between the training and validation sets?
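To make that expectation concrete, here is a minimal sketch in plain Python of why I expect 0.5 from a constant score. It is not XGBoost's own AUC implementation: the helper `pairwise_auc` and the toy labels are made up for illustration, and I'm assuming the usual convention that tied scores count as half a correctly ranked pair.

```python
def pairwise_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as 0.5
    (assumed convention, hypothetical helper for illustration only).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie: half credit
    return wins / (len(pos) * len(neg))


# Toy labels, invented for illustration (not my real data).
y_toy = [0, 0, 1, 0, 1, 1, 0, 0]

# Every row gets the same score, as with a constant base_score of 0.5.
constant_scores = [0.5] * len(y_toy)
auc_constant = pairwise_auc(y_toy, constant_scores)
print(auc_constant)
```

With identical scores every positive/negative pair is a tie, so the result is 0.5 regardless of the class balance. That is the value I expected to see at iteration 0.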
Here’s my configuration:
import xgboost as xgb

params = {'min_child_weight': 5,
          'gamma': 0.5,
          'subsample': 1.0,
          'colsample_bytree': 1.0,
          'max_depth': 5,
          'learning_rate': 0.01,
          'scale_pos_weight': 5,
          'reg_lambda': 10}

# n_iter (number of boosting rounds) is defined elsewhere
model = xgb.XGBClassifier(n_estimators=n_iter, random_state=42, **params)
# eval_set order: validation_0 = training set, validation_1 = test set
model.fit(X=X_train, y=y_train,
          eval_set=[(X_train, y_train), (X_test, y_test)],
          eval_metric=['auc'],
          verbose=20)
[0]     validation_0-auc:0.91771    validation_1-auc:0.81267
[20]    validation_0-auc:0.94128    validation_1-auc:0.82144
…