I am new to using a custom loss function for a model, particularly for XGBoost. I have a highly imbalanced binary classification problem and I need to predict the probabilities for the minority class (1). For this I am using the objective function objective='binary:logistic'. I built an XGBoost model using this objective function, with the average precision score as my evaluation metric, and the score seems decent enough. But now I want to build a custom objective function for the model. After looking through many links and searching online, I ended up using this as my custom objective function:
```python
import numpy as np

scale_pos_weight = 75

def obj_func(preds, y_train):
    # same idea as the scale_pos_weight parameter in my XGBClassifier:
    # give extra weight to the minority class (1)
    weights = np.where(y_train == 1.0, scale_pos_weight, 1)
    preds = 1.0 / (1.0 + np.exp(-preds))  # sigmoid of the raw margins
    grad = preds - y_train        # gradient - 1st order derivative
    hess = preds * (1.0 - preds)  # Hessian - 2nd order derivative
    return grad * weights, hess * weights
```
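To convince myself the gradient and Hessian are at least internally consistent, I compared the gradient returned by the objective against a finite-difference derivative of the weighted log loss (a minimal sanity-check sketch; `weighted_logloss` and the toy arrays `raw` / `y` are just illustrative stand-ins, not part of my real pipeline):

```python
import numpy as np

scale_pos_weight = 75

def obj_func(preds, y_train):
    # weight the minority class (1) by scale_pos_weight
    weights = np.where(y_train == 1.0, scale_pos_weight, 1.0)
    preds = 1.0 / (1.0 + np.exp(-preds))  # sigmoid of the raw margins
    grad = preds - y_train
    hess = preds * (1.0 - preds)
    return grad * weights, hess * weights

def weighted_logloss(raw, y, w):
    # per-example weighted binary cross-entropy on raw margins
    p = 1.0 / (1.0 + np.exp(-raw))
    return w * -(y * np.log(p) + (1 - y) * np.log(1 - p))

# toy raw margins and labels (illustrative only)
raw = np.array([0.5, -1.2, 2.0])
y = np.array([1.0, 0.0, 1.0])
w = np.where(y == 1.0, scale_pos_weight, 1.0)

# central finite difference of the loss w.r.t. the raw margins
eps = 1e-6
fd_grad = (weighted_logloss(raw + eps, y, w)
           - weighted_logloss(raw - eps, y, w)) / (2 * eps)

grad, hess = obj_func(raw, y)
print(np.allclose(grad, fd_grad, atol=1e-4))  # → True
```

The analytic gradient `(sigmoid(raw) - y) * w` matches the numerical one, so the derivatives themselves look fine; my doubt is more about how XGBoost consumes them.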
I am not sure if this implementation is right. My XGBClassifier is defined as:
```python
xgb = XGBClassifier(learning_rate=0.07,
                    n_estimators=1000,
                    max_depth=5,
                    gamma=2,
                    colsample_bytree=0.4,
                    objective=obj_func,
                    scale_pos_weight=75,
                    seed=27)
model_xgb = xgb.fit(X_train, y_train)
```
After fitting the model I evaluate it against the validation set using average precision from sklearn.metrics. The evaluation was decent when I was using the default binary:logistic objective from the library, but with the custom objective function the average precision has dropped to 0.03, from 0.65 with the built-in objective. Is there something wrong in my objective function, or do I need to add something more?
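For reference, this is the kind of evaluation I mean (a minimal sketch with a synthetic stand-in: `y_val` and `proba` here are toy values; in my pipeline `proba` would come from `model_xgb.predict_proba(X_val)`):

```python
import numpy as np
from sklearn.metrics import average_precision_score

# toy validation labels (illustrative only)
y_val = np.array([0, 0, 1, 0, 1])

# stand-in for model_xgb.predict_proba(X_val);
# column 1 holds P(class == 1), which is what I score on
proba = np.array([[0.9, 0.1],
                  [0.35, 0.65],
                  [0.3, 0.7],
                  [0.7, 0.3],
                  [0.4, 0.6]])

ap = average_precision_score(y_val, proba[:, 1])
print(round(ap, 3))  # → 0.833
```

The comparison of 0.65 vs. 0.03 above comes from exactly this score computed on the same validation split.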