Hello,
For a while now I have been investigating an error I get when implementing the reg:tweedie objective myself as a custom objective function in Python. I could not find a solution, either in other topics or in the source code of the library.
The issue is that my results are terrible with the custom implementation but good with the built-in implementation on the same dataset.
My custom implementation is the following (with rho, i.e. tweedie_variance_power, fixed at 1.5):
import numpy as np

def custom_tweedie_grad(
    y_true: np.ndarray,
    y_pred: np.ndarray,
) -> np.ndarray:
    # First derivative of the loss with respect to y_pred (rho = 1.5)
    a = -y_true * np.exp((1.0 - 1.5) * y_pred)
    b = np.exp((2.0 - 1.5) * y_pred)
    grad = a + b
    return grad

def custom_tweedie_hessian(
    y_true: np.ndarray,
    y_pred: np.ndarray,
) -> np.ndarray:
    # Second derivative of the loss with respect to y_pred (rho = 1.5)
    a = -y_true * (1.0 - 1.5) * np.exp((1.0 - 1.5) * y_pred)
    b = (2.0 - 1.5) * np.exp((2.0 - 1.5) * y_pred)
    hess = a + b
    return hess

def custom_tweedie_objective(
    y_true: np.ndarray,
    y_pred: np.ndarray,
):
    # Returns the (gradient, hessian) pair passed as `objective` below
    return (
        custom_tweedie_grad(y_true, y_pred),
        custom_tweedie_hessian(y_true, y_pred),
    )
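As a sanity check on the formulas themselves, here is a minimal sketch that compares them against finite differences. It assumes the loss is the Tweedie negative log-likelihood with a log link (y_pred on the log scale), dropping the terms that do not depend on y_pred. The names tweedie_nloglik, grad, hess and max_finite_difference_error, as well as the synthetic data, are only illustrative and do not come from the library; grad and hess are plain copies of the two functions above with rho as a parameter.

```python
import numpy as np

RHO = 1.5  # same variance power as in the custom objective


def tweedie_nloglik(y_true, y_pred, rho=RHO):
    # Assumed loss: Tweedie negative log-likelihood with log link,
    # without the terms that are constant in y_pred.
    a = -y_true * np.exp((1.0 - rho) * y_pred) / (1.0 - rho)
    b = np.exp((2.0 - rho) * y_pred) / (2.0 - rho)
    return a + b


def grad(y_true, y_pred, rho=RHO):
    # Same expression as custom_tweedie_grad
    return -y_true * np.exp((1.0 - rho) * y_pred) + np.exp((2.0 - rho) * y_pred)


def hess(y_true, y_pred, rho=RHO):
    # Same expression as custom_tweedie_hessian
    a = -y_true * (1.0 - rho) * np.exp((1.0 - rho) * y_pred)
    b = (2.0 - rho) * np.exp((2.0 - rho) * y_pred)
    return a + b


def max_finite_difference_error(n=1000, eps=1e-5, seed=0):
    # Synthetic, zero-inflated non-negative targets (illustrative only)
    rng = np.random.default_rng(seed)
    y = rng.gamma(shape=2.0, scale=3.0, size=n)
    y[rng.random(n) < 0.3] = 0.0
    p = rng.normal(size=n)  # raw scores on the log scale

    # Central differences of the loss and of the gradient
    num_grad = (tweedie_nloglik(y, p + eps) - tweedie_nloglik(y, p - eps)) / (2 * eps)
    num_hess = (grad(y, p + eps) - grad(y, p - eps)) / (2 * eps)

    grad_err = np.max(np.abs(num_grad - grad(y, p)))
    hess_err = np.max(np.abs(num_hess - hess(y, p)))
    return grad_err, hess_err


if __name__ == "__main__":
    g_err, h_err = max_finite_difference_error()
    print("max gradient error:", g_err)
    print("max hessian error:", h_err)
```

If both errors come out close to zero, the derivatives are consistent with this loss, which would suggest that the discrepancy comes from somewhere other than the formulas.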
I then create a regressor with this custom objective function and call the fit method like this:
self.xgb_model = xgb.XGBRegressor(
    booster=self.booster,
    validate_parameters=self.validate_parameters,
    learning_rate=self.learning_rate,
    n_estimators=self.n_estimators,
    seed=self.seed,
    subsample=self.subsample,
    colsample_bytree=self.colsample_bytree,
    objective=custom_tweedie_objective,
    eval_metric="tweedie-nloglik@1.5",
    min_child_weight=self.min_child_weight,
    max_depth=self.max_depth,
    reg_lambda=self.reg_lambda,
    n_jobs=self.n_jobs,
)
x_train = self.data["train"]["X"]
y_train = self.data["train"]["y"]
self.xgb_model.fit(x_train, y_train)
The results I obtain on my dataset are the following (using a custom evaluation method):
{'test_mse': 7578884.044259707,
'test_r_squared': -0.12819942553491792,
'train_mse': 6836209.780847064,
'train_r_squared': -0.13481724642997617}
When I train another regressor with exactly the same parameters, dataset and evaluation method, but with the built-in tweedie objective, the results are the following:
{'test_mse': 4243417.3363827625,
'test_r_squared': 0.3683211178249963,
'train_mse': 2994838.0419956953,
'train_r_squared': 0.5028540712950178}
Does anyone know what could be wrong with my implementation? It is supposed to be the same as the one in the source code, since my implementation is based on it.
Thank you very much for your help.