I am referring to the xgboost issues https://github.com/dmlc/xgboost/issues/9782 and https://github.com/dmlc/xgboost/issues/3598
My current env: xgboost 2.0.2
From both issues, my understanding is as follows (please correct me if I am wrong):
a. the current xgboost version does NOT support composite metrics; that is, my custom_metric function must return a single pair, like below:

def custom_metric_customized(predt: np.ndarray, dtrain: xgb.DMatrix):
    return "score_name", score  # <---- works

but there is no way to return multiple metrics, like:

def custom_metric_customized(predt: np.ndarray, dtrain: xgb.DMatrix):
    return [("score_name_1", score_1), ("score_name_2", score_2)]  # <--- does NOT work
b. if a user wants to track (but not optimize on) the default xgboost metric, the disable_default_eval_metric parameter controls whether it is still reported.
c. the default metric is disabled when either eval_metric or feval is specified.
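Regarding point a, the workaround I currently use (a sketch only; the weighting scheme and all names here are my own made-up illustration) is to fold several component scores into one scalar, so a single (name, value) pair suffices:

```python
import numpy as np

def rmse(y_pred, y_true):
    # root mean squared error, lower is better
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

def band_accuracy(y_pred, y_true, threshold=0.25):
    # fraction of predictions within +/- threshold of y_true, higher is better
    within = (y_pred >= (1 - threshold) * y_true) & (y_pred <= (1 + threshold) * y_true)
    return float(np.mean(within))

def combined_metric(predt, dtrain, w=0.5):
    # dtrain is an xgb.DMatrix in real use; only get_label() is needed here.
    y_true = dtrain.get_label()
    # Fold two scores into one scalar (higher is better): reward band accuracy,
    # penalize RMSE. The weight w is arbitrary.
    score = w * band_accuracy(predt, y_true) - (1 - w) * rmse(predt, y_true)
    return "combined_score", score
```

Passing this as custom_metric=combined_metric gives xgb.train one metric name to work with, at the cost of less readable logs.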
However, I experimented with some combinations, and the results conflict with my understanding above.
Here is what I did:
I made up a customized metric function called eval_metric_accuracy_customized. This function only scores prediction accuracy for rows whose y_true lies in the [15%, 85%] percentile range of y_true (that is, I care about prediction accuracy inside [15%, 85%]; if y_true is an outlier, I care less about its prediction error). Obviously, larger is better for this metric, so I should maximize it during evaluation (https://xgboost.readthedocs.io/en/stable/python/python_api.html#xgboost.train).
The conflicting observations are:
a. for the train API, the maximize parameter does not work. In my example below, regardless of whether I set maximize to True or False, the printed eval_metric_accuracy_xgboost for each num_boost_round does not change.
b. regardless of whether I set disable_default_eval_metric, the only difference is that when it is True the printed evaluation metrics no longer include rmse. It does not impact the training procedure, and the final prediction score is still the same.
Combining disable_default_eval_metric and the customized metric eval_metric_accuracy_xgboost (with `maximize` set to True/False), all four experiments shared the same prediction score …
Can you help me identify where the error might be? Is it because my customized metric function is not properly defined?
Below is the code example. If you can access Google Colab, here is the link to replicate the result: https://colab.research.google.com/drive/1KxzOT25AVUgcDW6GubvmjFsytfhJQeh0?usp=sharing
# Imports (these were missing from the original snippet)
import numpy as np
import pandas as pd
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Load the data
housing = fetch_california_housing()
X_train, X_test, y_train, y_test = train_test_split(
    pd.DataFrame(housing.data, columns=housing.feature_names),
    housing.target,
    test_size=0.25,
    random_state=131,
)
import xgboost as xgb
from typing import List, Tuple
# data
dtrain = xgb.DMatrix(data=X_train, label=y_train, missing=-999999999)
dvalid = xgb.DMatrix(data=X_test, label=y_test, missing=-999999999)
# made up xgboost params
param = {'objective': 'reg:squarederror',
'disable_default_eval_metric': True, # <---------Noticed disable_default_eval_metric is True
'tree_method': 'hist',
'booster': 'dart',
'lambda': 0.5022935723779454, 'alpha': 0.0010591193559734626,
'subsample': 0.7443155004860621, 'colsample_bytree': 0.8049514766470095,
'base_score': 2.0729023397932815,
'eta': 1.146933698699281, 'gamma': 3.1491135525631537, 'max_depth': 8, 'min_child_weight': 60,
'grow_policy': 'lossguide', 'sample_type': 'uniform',
'normalize_type': 'forest', 'rate_drop': 0.0001559101552440383, 'skip_drop': 0.006540833684514242}
# customized metric
LB = np.percentile(y_train, 15)
UB = np.percentile(y_train, 85)
def eval_metric_accuracy_customized(y_pred, y_true, sample_weight=None, LB=0, UB=np.inf, threshold=0.25):
    # cond checks whether y_pred is within the tolerance range for a y_true that lies in [LB, UB];
    # for y_true outside [LB, UB], I care less about the corresponding y_pred value.
    cond = (y_pred >= (1 - threshold) * y_true) & \
           (y_pred <= (1 + threshold) * y_true) & \
           (y_true > LB) & (y_true < UB)
    df_true_score_weight_1 = np.where(cond, 1.0, 0.0)
    acc = np.mean(df_true_score_weight_1)
    # acc is the proportion of y_pred values that satisfy cond
    return acc
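One subtlety I noticed while checking this function: rows whose y_true falls outside [LB, UB] are counted as 0 rather than excluded from the mean, so the metric can never reach 1 whenever outliers exist. A standalone sanity check on made-up numbers (the function is repeated here so the snippet runs on its own):

```python
import numpy as np

def eval_metric_accuracy_customized(y_pred, y_true, sample_weight=None,
                                    LB=0, UB=np.inf, threshold=0.25):
    # same logic as above, repeated so this snippet is self-contained
    cond = (y_pred >= (1 - threshold) * y_true) & \
           (y_pred <= (1 + threshold) * y_true) & \
           (y_true > LB) & (y_true < UB)
    return float(np.mean(np.where(cond, 1.0, 0.0)))

y_true = np.array([0.5, 1.0, 2.0, 3.0, 10.0])   # 0.5 and 10.0 fall outside [LB, UB]
y_pred = np.array([0.6, 1.1, 2.1, 2.5, 50.0])   # last prediction is far off, but its row is an outlier anyway
LB, UB = np.percentile(y_true, 15), np.percentile(y_true, 85)  # 0.8 and 5.8
acc = eval_metric_accuracy_customized(y_pred, y_true, LB=LB, UB=UB)
print(acc)  # 0.6: three in-band hits out of five rows; the two out-of-band rows count as misses
```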
def eval_metric_accuracy_xgboost(sample_weight=None, LB=0, UB=1500, threshold=0.25, **kwargs):
    def eval_metric_accuracy_xgboost_internal(predt: np.ndarray, dtrain: xgb.DMatrix) -> Tuple[str, float]:
        y_true = dtrain.get_label()
        score = eval_metric_accuracy_customized(y_pred=predt, y_true=y_true,
                                                sample_weight=sample_weight, LB=LB, UB=UB, threshold=threshold)
        return "eval_metric_accuracy_xgboost", score
    return eval_metric_accuracy_xgboost_internal
eval_metric_accuracy_xgboost_custom = eval_metric_accuracy_xgboost(LB=LB, UB=UB)
# Experiments start
# test 1:
# param disable_default_eval_metric True
# customized metric accuracy set to `maximize`
output = xgb.train(params=param,
dtrain=dtrain,
num_boost_round=200, # set a high number
evals=[(dtrain, "train"),(dvalid, "validation")],
custom_metric=eval_metric_accuracy_xgboost_custom,
maximize=True, # <---------Noticed `custom_metric` specified and metric direction maximize True
# early_stopping_rounds=50,
verbose_eval=True
)
preds = output.predict(data=dvalid)
print("Test RMSE:", mean_squared_error(y_test, preds, squared=False)) # Test RMSE: 0.5912299076469258
# test 2:
# param disable_default_eval_metric True
# customized metric accuracy set to NOT `maximize`
output_maximize_accuracy_false = xgb.train(params=param,
dtrain=dtrain,
num_boost_round=200,
evals=[(dtrain, "train"),(dvalid, "validation")],
custom_metric=eval_metric_accuracy_xgboost_custom,
maximize=False, # <---------Noticed `custom_metric` specified and metric direction maximize False
verbose_eval=True
)
preds_output_maximize_accuracy_false = output_maximize_accuracy_false.predict(data=dvalid)
print("Test RMSE:", mean_squared_error(y_test, preds_output_maximize_accuracy_false, squared=False)) # Test RMSE: 0.5912299076469258
# test 3:
# param disable_default_eval_metric False
# customized metric accuracy set to `maximize`
param_disablt_default_eval_false = {'objective': 'reg:squarederror',
'disable_default_eval_metric': False, # <---------Noticed disable_default_eval_metric was True before, now it is False
'tree_method': 'hist',
'booster': 'dart',
'lambda': 0.5022935723779454, 'alpha': 0.0010591193559734626,
'subsample': 0.7443155004860621, 'colsample_bytree': 0.8049514766470095,
'base_score': 2.0729023397932815,
'eta': 1.146933698699281, 'gamma': 3.1491135525631537, 'max_depth': 8, 'min_child_weight': 60,
'grow_policy': 'lossguide', 'sample_type': 'uniform',
'normalize_type': 'forest', 'rate_drop': 0.0001559101552440383, 'skip_drop': 0.006540833684514242}
output_maximize_accuracy_true_disable_default_false = xgb.train(params=param_disablt_default_eval_false,
dtrain=dtrain,
num_boost_round=200,
evals=[(dtrain, "train"),(dvalid, "validation")],
custom_metric=eval_metric_accuracy_xgboost_custom,
maximize=True, # <---------Noticed `custom_metric` specified and metric direction maximize True
verbose_eval=True
)
preds_output_maximize_accuracy_true_disable_default_false = output_maximize_accuracy_true_disable_default_false.predict(data=dvalid)
print("Test RMSE:", mean_squared_error(y_test, preds_output_maximize_accuracy_true_disable_default_false, squared=False)) # Test RMSE: 0.5912299076469258
# test 4:
# param disable_default_eval_metric False
# customized metric accuracy set to NOT `maximize`
output_maximize_accuracy_false_disable_default_false = xgb.train(params=param_disablt_default_eval_false,
dtrain=dtrain,
num_boost_round=200,
evals=[(dtrain, "train"),(dvalid, "validation")],
custom_metric=eval_metric_accuracy_xgboost_custom,
maximize=False, # <---------Noticed `custom_metric` specified and metric direction maximize False
verbose_eval=True
)
preds_output_maximize_accuracy_false_disable_default_false = output_maximize_accuracy_false_disable_default_false.predict(data=dvalid)
print("Test RMSE:", mean_squared_error(y_test, preds_output_maximize_accuracy_false_disable_default_false, squared=False)) # Test RMSE: 0.5912299076469258