Custom evaluation function is used, but rmse is still evaluated

Yiyiyimu · August 16, 2018, 10:49pm

Migrated from https://github.com/dmlc/xgboost/issues/3598

Hi,

First of all, thank you so much for making xgboost available, it is so great!

The problem is that when I use a customized evaluation function in xgboost, a built-in metric is still listed in front of the customized one. I also tried the sample function in custom_objective.py and the result is the same: an extra rmse appears at the front, so I’m not sure what is wrong.

I’m still new to this, so all I could find is that in training.py the output of msg = bst_eval_set.decode() already contains the extra rmse. Maybe that helps.

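The code is:

import numpy as np
import xgboost as xgb
from sklearn.metrics import precision_recall_curve

def Prec(preds, dtrain):
    # Custom evaluation function: returns (metric name, metric value)
    labels = dtrain.get_label()
    preds = 1.0 / (1.0 + np.exp(-preds))  # sigmoid of the raw predictions
    return 'MaxPrec', precision_recall_curve(labels, preds, pos_label=1)[0][0]

# X_train, X_test, y_green_train and y_green_test are my own data (not shown here)
dtrain = xgb.DMatrix(X_train, label=y_green_train)
dtest = xgb.DMatrix(X_test, label=y_green_test)
param = {'max_depth': 2, 'eta': 1, 'silent': 1}
num_round = 5
watchlist = [(dtrain, 'train'), (dtest, 'test')]
bst = xgb.train(param, dtrain, num_round, watchlist, feval=Prec)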

But the result is

[0]	train-rmse:0.162775	test-rmse:0.151532	train-MaxPrec:0.046266	test-MaxPrec:0.02681
[1]	train-rmse:0.142818	test-rmse:0.150354	train-MaxPrec:0.046266	test-MaxPrec:0.02681
[2]	train-rmse:0.124537	test-rmse:0.150097	train-MaxPrec:0.046266	test-MaxPrec:0.02681
[3]	train-rmse:0.110327	test-rmse:0.15799	train-MaxPrec:0.046266	test-MaxPrec:0.02681
[4]	train-rmse:0.101278	test-rmse:0.158164	train-MaxPrec:0.046266	test-MaxPrec:0.02681

Besides, I think there is a bug: if the metric name returned by the customized function contains a ‘:’ (that is, a colon inside the 'MaxPrec' string in return 'MaxPrec', (precision_recall_curve(labels,preds, pos_label=1))[0][0]), it reports

D:\Anaconda\lib\site-packages\xgboost\training.py in _train_internal(params, dtrain, num_boost_round, evals, obj, feval, xgb_model, callbacks)
     88                 msg = bst_eval_set.decode()
     89             res = [x.split(':') for x in msg.split()]
---> 90             evaluation_result_list = [(k, float(v)) for k, v in res[1:]]
     91         try:
     92             for cb in callbacks_after_iter:
ValueError: too many values to unpack (expected 2)

I think the code expects exactly one number after each ‘:’, but a colon is already added after the metric name by default, so putting one in the name is unnecessary. Maybe this should be mentioned where custom evaluation functions are introduced.
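To illustrate, here is a minimal sketch in plain Python (no xgboost needed) that copies only the two parsing lines shown in the traceback. The helper name parse_eval_message and the two message strings are made up for illustration; the strings are shaped like the log lines above, and I am assuming the name and value are joined with a single ‘:’.

```python
def parse_eval_message(msg):
    """Hypothetical helper that mimics lines 89-90 of the traceback."""
    # Split the message on whitespace, then split each token on ':'.
    res = [x.split(':') for x in msg.split()]
    # The first token is the iteration tag (e.g. "[0]"); every other
    # token must split into exactly two parts: (name, value).
    return [(k, float(v)) for k, v in res[1:]]


# Made-up messages, shaped like the log output above.
good = "[0]\ttrain-rmse:0.162775\ttrain-MaxPrec:0.046266"
# Assumed shape when the metric name itself ends with a colon ('MaxPrec:').
bad = "[0]\ttrain-rmse:0.162775\ttrain-MaxPrec::0.046266"

parsed = parse_eval_message(good)
print(parsed)

try:
    parse_eval_message(bad)
    failed = False
except ValueError as err:
    # The extra ':' yields three parts, so unpacking into (k, v) fails.
    failed = True
    print("parse failed:", err)
```

With a colon in the name, the token splits into three parts instead of two, which would explain the unpacking error.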

Thank you for your help!

Working environment:
Windows 7_64
python 3.6.3
conda 4.5.9
xgboost 0.80