The xgboost val-auc reported during training does not match the val-auc computed from predictions

I am using xgboost for my project. Here are my parameters:
params = {'booster': 'gbtree',
          'objective': 'binary:logitraw',
          'max_depth': 14,
          'lambda': 55,
          'subsample': 1,
          'colsample_bytree': 1,
          'min_child_weight': 13,
          'silent': 1,
          'eta': 0.022,
          'nthread': 1,
          'seed': 12,
          'eval_metric': 'auc'}
I also set a watchlist when training the model:
watchList = [(xgb_train, 'train'), (xgb_val, 'val')]
modelXgb = xgb.train(plst, xgb_train, num_rounds, watchList, early_stopping_rounds=1000)

In the end, the best val-auc reported during training is 0.92:
Stopping. Best iteration:
[306] train-auc:1 val-auc:0.92

But when I manually predict on xgb_val, the result is noticeably worse:
sklearn train auc: 1.0
sklearn val auc: 0.89
sklearn train f1: 1.0
sklearn val f1: 0.8952380952380952
sklearn train precision: 1.0
sklearn val precision: 0.8545454545454545
sklearn train recall_score: 1.0
sklearn val recall_score: 0.94

I checked the C++ code; the prediction at the best iteration is equivalent to:
valXgb = modelXgb.predict(xgb_val, ntree_limit=modelXgb.best_ntree_limit)

So why does the evaluation reported by xgboost differ from the one I get with sklearn?

How did you compute the validation accuracy?

Like this:

from sklearn.metrics import roc_auc_score
# AUC on the training and validation predictions
print("sklearn train auc: " + str(roc_auc_score(listTrainLabels, trainResult)))
print("sklearn val auc: " + str(roc_auc_score(listValLabels, valResult)))

from sklearn.metrics import f1_score, precision_score, recall_score
# F1, precision and recall on the same predictions
print("sklearn train f1: " + str(f1_score(trainLabels, trainResult, average='binary')))
print("sklearn val f1: " + str(f1_score(valLables, valResult, average='binary')))

print("sklearn train precision: " + str(precision_score(trainLabels, trainResult, average='binary')))
print("sklearn val precision: " + str(precision_score(valLables, valResult, average='binary')))

print("sklearn train recall_score: " + str(recall_score(trainLabels, trainResult, average='binary')))
print("sklearn val recall_score: " + str(recall_score(valLables, valResult, average='binary')))
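
One thing worth checking in the snippet above (a possibility, not something confirmed in this thread): the same valResult is passed to roc_auc_score and to f1_score. f1_score needs hard 0/1 labels, so if valResult has already been thresholded, the AUC is being computed on labels rather than on scores. For hard labels, AUC reduces to (TPR + TNR) / 2, which is usually lower than the AUC xgboost reports from the raw scores. The reported validation numbers (precision 0.8545, recall 0.94, AUC 0.89) would be consistent with this if the validation set had equal numbers of positives and negatives, which is an assumption. The sketch below illustrates the effect with made-up data and a hypothetical pairwise_auc helper written in plain Python; it is not sklearn's or xgboost's implementation.

```python
def pairwise_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative count as half a correct pair.
    Hypothetical helper for illustration only.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Made-up validation labels and predicted probabilities.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.45, 0.55, 0.3, 0.1]

# Thresholding at 0.5 gives the hard labels that f1_score expects.
hard_labels = [1 if s > 0.5 else 0 for s in scores]

auc_from_scores = pairwise_auc(labels, scores)            # about 0.889
auc_from_hard_labels = pairwise_auc(labels, hard_labels)  # about 0.667

print("AUC on probabilities:", auc_from_scores)
print("AUC on thresholded labels:", auc_from_hard_labels)
```

If this is what happened, passing the unthresholded output of predict to roc_auc_score, and only the thresholded labels to f1_score, precision_score and recall_score, should bring the sklearn AUC in line with the value printed during training.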

Please use binary:logistic instead of binary:logitraw

That was a mistake in my post. My original code uses binary:logistic, and the results above were produced with binary:logistic.

Did you apply instance weights?

I did not apply instance weights.

In the C++ code, the corresponding parameter is equal to 1.

No idea. Can you post your model and test data?

If that is OK, I can send it to you privately.

@ll1985ll Please e-mail me at chohyu01@cs.washington.edu

Thank you. I am working on it.

Here is another question. I checked the xgboost Python code: if I set feval in the train function, it seems that xgboost still optimizes the model according to the eval_metric set in params, or the default eval_metric. In other words, the feval we set in the train function does not influence model training?

@ll1985ll No, evaluation metrics are for informational purposes only and do not influence training. Training is determined by the choice of objective function.
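
To make the distinction between objective and metric concrete, here is a toy sketch in plain Python. It is not xgboost: it fits a one-dimensional logistic model by gradient descent, and the train, accuracy and error_rate functions and the data are all made up for illustration. The parameter updates use only the gradient of the objective (log loss), while the metric argument is merely evaluated and logged each round, so swapping the metric changes the log but not the fitted model.

```python
import math


def train(xs, ys, metric, rounds=50, lr=0.5):
    """Toy 1-D logistic regression trained by gradient descent.

    The objective (mean log loss) drives every update; `metric` is only
    evaluated for logging and never feeds back into the parameters.
    """
    w, b = 0.0, 0.0
    history = []
    for _ in range(rounds):
        preds = [1.0 / (1.0 + math.exp(-(w * x + b))) for x in xs]
        # Log the metric on the current predictions (informational only).
        history.append(metric(ys, preds))
        # Gradient of the mean log loss with respect to w and b.
        grad_w = sum((p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
        grad_b = sum(p - y for p, y in zip(preds, ys)) / len(xs)
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b), history


def accuracy(ys, preds):
    """Fraction of samples classified correctly at a 0.5 threshold."""
    hits = sum(1 for y, p in zip(ys, preds) if (p > 0.5) == (y == 1))
    return hits / len(ys)


def error_rate(ys, preds):
    """Complement of accuracy; a different metric over the same predictions."""
    return 1.0 - accuracy(ys, preds)


# Made-up training data.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 1, 0, 1, 1]

model_a, log_a = train(xs, ys, accuracy)
model_b, log_b = train(xs, ys, error_rate)

print("same fitted parameters:", model_a == model_b)
print("same metric log:", log_a == log_b)
```

One caveat that goes slightly beyond the answer above: when early_stopping_rounds is set, the metric on the watchlist decides at which round training stops, so it can affect how many trees are kept even though it does not affect how each tree is fitted.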