I’m trying to write an XGBoost regressor for a specific use case where overestimating the predicted value is more desirable than underestimating it. To that end, I introduced an `underestimate_penalty_factor` variable and used it in a custom objective that penalizes underestimates more than overestimates. Ultimately, I wanted my objective to be a “weighted” squared error of the form

    Error(pred, label) = d(pred, label) * (pred - label)**2

where `d(pred, label) = 1` whenever `pred >= label`, and `d(pred, label) = underestimate_penalty_factor` otherwise. In other words, if `underestimate_penalty_factor` is set to 1, this is just the squared error; if it is set higher, underpredictions are penalized more than overpredictions. My model is initialized like so:
```python
model = XGBRegressor(**other_params, objective=custom_loss)
```
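For concreteness, here is the error I have in mind, sketched in plain NumPy. This is the quantity I want minimized, not the callback passed to XGBoost, and `underestimate_penalty_factor = 3.0` is just an example value:

```python
import numpy as np

underestimate_penalty_factor = 3.0  # example value, not from my real config

def weighted_squared_error(pred: np.ndarray, label: np.ndarray) -> np.ndarray:
    # d is 1 for overestimates (pred >= label), larger for underestimates
    d = np.where(pred >= label, 1.0, underestimate_penalty_factor)
    return d * (pred - label) ** 2

# underestimating 3 by 1 costs 3.0; overestimating it by 1 costs only 1.0
print(weighted_squared_error(np.array([2.0]), np.array([3.0])))  # [3.]
print(weighted_squared_error(np.array([4.0]), np.array([3.0])))  # [1.]
```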
As far as I understand it, `custom_loss` is expected to receive the predicted values and the labels (in that order) and to return the gradient and the Hessian of the objective function to be minimized. So I wrote the function in the following way:
```python
import numpy as np
import xgboost

def custom_loss(
    predt: np.ndarray,
    dtrain: xgboost.DMatrix,
) -> tuple[np.ndarray, np.ndarray]:
    d = predt - dtrain
    d[d > 0] = 1
    d[d <= 0] = underestimate_penalty_factor
    # gradient and hessian of the weighted squared error (up to a constant factor)
    return d * (predt - dtrain), d
```
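To convince myself that the signs in this version are at least mathematically right, I checked the analytic gradient `d * (pred - label)` against a finite-difference derivative of `0.5 * d * (pred - label)**2`. This is a standalone sketch with plain arrays standing in for the real inputs and an assumed example factor:

```python
import numpy as np

underestimate_penalty_factor = 3.0  # example value, not from my real config

def loss(pred: np.ndarray, label: np.ndarray) -> np.ndarray:
    # halved weighted squared error; its gradient is d * (pred - label)
    d = np.where(pred >= label, 1.0, underestimate_penalty_factor)
    return 0.5 * d * (pred - label) ** 2

def analytic_grad(pred: np.ndarray, label: np.ndarray) -> np.ndarray:
    d = np.where(pred >= label, 1.0, underestimate_penalty_factor)
    return d * (pred - label)

pred = np.array([1.0, 5.0])    # one underestimate, one overestimate of 3
label = np.array([3.0, 3.0])
h = 1e-6
numeric = (loss(pred + h, label) - loss(pred - h, label)) / (2 * h)
# numeric agrees with analytic_grad(pred, label) away from the kink at pred == label
```

So on paper the gradient of the weighted squared error with respect to the prediction is negative for underestimates and positive for overestimates, as I'd expect for a minimized objective.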
This, however, produces exactly the opposite of what I expected: underestimates seem to be preferred when `underestimate_penalty_factor` is set to a high value. The model does exactly what I want when I write the function the opposite way, that is:
```python
def custom_loss(
    predt: np.ndarray,
    dtrain: xgboost.DMatrix,
) -> tuple[np.ndarray, np.ndarray]:
    d = predt - dtrain
    d[d > 0] = underestimate_penalty_factor
    d[d <= 0] = 1
    return d * (dtrain - predt), d
```
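For reference, here is what the two versions return on a toy array. These are plain-NumPy stand-ins for the real calls, with an example factor of 3, and `np.where` replacing the in-place masking so the input arrays are left untouched:

```python
import numpy as np

underestimate_penalty_factor = 3.0  # example value, not from my real config

def custom_loss_v1(predt: np.ndarray, labels: np.ndarray):
    # first version, with labels as a plain array for illustration
    diff = predt - labels
    d = np.where(diff > 0, 1.0, underestimate_penalty_factor)
    return d * (predt - labels), d

def custom_loss_v2(predt: np.ndarray, labels: np.ndarray):
    # second ("reversed") version that empirically does what I want
    diff = predt - labels
    d = np.where(diff > 0, underestimate_penalty_factor, 1.0)
    return d * (labels - predt), d

predt = np.array([1.0, 5.0])   # one underestimate, one overestimate of 3
labels = np.array([3.0, 3.0])
g1, h1 = custom_loss_v1(predt, labels)   # [-6.,  2.], [3., 1.]
g2, h2 = custom_loss_v2(predt, labels)   # [ 2., -6.], [1., 3.]
```

Notably, the second version returns exactly what the first would return with its two arguments swapped.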
I suppose I must be misinterpreting the custom objective function and its parameters. Could someone please clarify how they are meant to be used?