I am using xgboost for my project. Here are my parameters:
params = {'booster': 'gbtree',
          'objective': 'binary:logitraw',
          'max_depth': 14,
          'lambda': 55,
          'subsample': 1,
          'colsample_bytree': 1,
          'min_child_weight': 13,
          'silent': 1,
          'eta': 0.022,
          'nthread': 1,
          'seed': 12,
          'eval_metric': 'auc'}
I also set a watchlist while training the model:

watchList = [(xgb_train, 'train'), (xgb_val, 'val')]
modelXgb = xgb.train(plst, xgb_train, num_rounds, watchList, early_stopping_rounds=1000)
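For completeness, `plst` in the call above is just the `params` dict converted to a list of (key, value) pairs, which xgb.train() also accepts (this form is needed when passing multiple eval_metric entries). A minimal sketch of that conversion:

```python
# Sketch: build the params-as-list form that xgb.train() accepts.
# The dict is abbreviated here; the full dict is shown above.
params = {'booster': 'gbtree',
          'objective': 'binary:logitraw',
          'eta': 0.022,
          'eval_metric': 'auc'}

# xgb.train() takes either the dict itself or a list of pairs.
plst = list(params.items())
```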
Training stops early, and the best validation AUC reported is 0.92:
Stopping. Best iteration:
[306] train-auc:1 val-auc:0.92
But when I manually predict on xgb_val and score the predictions with sklearn, the results are not as good:
sklearn train auc: 1.0
sklearn val auc: 0.89
sklearn train f1: 1.0
sklearn val f1: 0.8952380952380952
sklearn train precision: 1.0
sklearn val precision: 0.8545454545454545
sklearn train recall_score: 1.0
sklearn val recall_score: 0.94
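The manual evaluation is computed roughly as below. Note that the sigmoid transform and the 0.5 threshold are my own choices for turning the raw margin scores that binary:logitraw produces into hard labels; the helper function is a sketch, not my exact code:

```python
import numpy as np
from sklearn.metrics import (roc_auc_score, f1_score,
                             precision_score, recall_score)

def evaluate(raw_scores, y_true, threshold=0.5):
    """Score raw binary:logitraw margins against true labels."""
    # binary:logitraw returns untransformed margins, so apply a
    # sigmoid to get probabilities before thresholding.
    prob = 1.0 / (1.0 + np.exp(-raw_scores))
    pred = (prob >= threshold).astype(int)
    return {
        'auc': roc_auc_score(y_true, prob),  # AUC is threshold-free
        'f1': f1_score(y_true, pred),
        'precision': precision_score(y_true, pred),
        'recall': recall_score(y_true, pred),
    }

# Toy example with made-up margins, just to show the call shape.
y = np.array([0, 1, 1, 0, 1])
raw = np.array([-2.0, 1.5, 0.3, -0.5, 2.1])
metrics = evaluate(raw, y)
```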
I checked the C++ part of the code, and the prediction at the best iteration should be what I get from:

valXgb = modelXgb.predict(xgb_val, ntree_limit=modelXgb.best_ntree_limit)
So why does the evaluation reported by xgboost's watchlist differ from what I compute with sklearn?