Customized loss function - derivative -- confusion

Hi,
I am confused by two different guidelines on using a customized loss function.

If the predicted probability 'p' = sigmoid(z):

  1. In https://github.com/dmlc/xgboost/blob/master/demo/guide-python/custom_objective.py, line#25 indicates that the gradient of the customized loss function should be taken w.r.t. 'z'.

  2. In https://xgboost.readthedocs.io/en/latest/tutorials/custom_metric_obj.html, the gradient is taken w.r.t. 'p':

def gradient(predt: np.ndarray, dtrain: xgb.DMatrix) -> np.ndarray:
    '''Compute the gradient squared log error.'''
    y = dtrain.get_label()
    return (np.log1p(predt) - np.log1p(y)) / (predt + 1)
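
For reference, the tutorial defines the squared log error as 1/2 * (log(predt + 1) - log(y + 1))^2, and the formula above does match a finite-difference derivative taken in predt. Below is a minimal check; the helper name sle and the sample values are mine, just for illustration:

import numpy as np

def sle(predt, y):
    # squared log error as defined in the tutorial: 1/2 * (log(1 + predt) - log(1 + y))^2
    return 0.5 * (np.log1p(predt) - np.log1p(y)) ** 2

predt, y, eps = 3.0, 2.0, 1e-6
analytic = (np.log1p(predt) - np.log1p(y)) / (predt + 1)
numeric = (sle(predt + eps, y) - sle(predt - eps, y)) / (2 * eps)
print(analytic, numeric)  # the two values agree to several decimal places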

Which approach is correct? Please let me know.

  • logistic regression
    $$
    \hat{y} = \mathrm{Sigmoid}(pred) = S(pred) = \frac{1}{1 + e^{-pred}}, \qquad
    S^{\prime}(x) = S(x)\,(1 - S(x))
    $$

    • loss function

      $$
      -y\log\hat{y}-(1-y)\log(1-\hat{y})
      $$

    • gradient (checked numerically in the sketch after this list)
      $$
      \left(-\frac{y}{\hat{y}}+\frac{1-y}{1-\hat{y}}\right)S(pred)(1-S(pred))=\left(\frac{y-1}{\hat{y}-1}-\frac{y}{\hat{y}}\right)\hat{y}(1-\hat{y})=-(y-1)\hat{y}-y(1-\hat{y})=\hat{y}-y
      $$

    • hessian
      $$
      S^{\prime}(pred) = S(pred)(1-S(pred))
      $$
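
As a sanity check on the gradient and hessian above, the cross-entropy loss can be differentiated numerically in pred (the raw margin). The helper names and sample values below are only for illustration, not library code:

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cross_entropy(pred, y):
    # -y*log(y_hat) - (1-y)*log(1-y_hat) with y_hat = sigmoid(pred)
    p = sigmoid(pred)
    return -y * math.log(p) - (1.0 - y) * math.log(1.0 - p)

pred, y, eps = 0.7, 1.0, 1e-5
p = sigmoid(pred)
grad_analytic = p - y            # y_hat - y
hess_analytic = p * (1.0 - p)    # S'(pred)
grad_numeric = (cross_entropy(pred + eps, y) - cross_entropy(pred - eps, y)) / (2 * eps)
hess_numeric = (cross_entropy(pred + eps, y) - 2 * cross_entropy(pred, y)
                + cross_entropy(pred - eps, y)) / eps ** 2
print(grad_analytic, grad_numeric)   # both approximately y_hat - y
print(hess_analytic, hess_numeric)   # both approximately y_hat * (1 - y_hat)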

import numpy as np

def logistic_regression(preds, dtrain):
    '''Custom objective: gradient and hessian of the logistic loss w.r.t. the raw margin.'''
    labels = dtrain.get_label()
    one_scalar = np.array([1.0], dtype=np.float32)
    prob = one_scalar / (one_scalar + np.exp(-preds, dtype=np.float32))
    grad = prob - labels
    eps_scalar = np.array([1e-16], dtype=np.float32)
    # element-wise maximum (np.maximum, not np.max) keeps the hessian away from zero
    hess = np.maximum(prob * (one_scalar - prob), eps_scalar)
    return grad, hess

from sklearn.metrics import auc, roc_curve

# use auc as metric
def logistic_regression_evalerror(preds, dtrain):
    labels = dtrain.get_label()
    one_scalar = np.array([1.0], dtype=np.float32)
    # apply the sigmoid to the raw predictions before scoring
    preds = one_scalar / (one_scalar + np.exp(-preds, dtype=np.float32))
    fpr, tpr, thresholds = roc_curve(labels, preds)
    return 'alpha-error', auc(fpr, tpr)
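
For completeness, here is a sketch of how the two functions above would be wired into training. dtrain and dvalid are assumed to be existing DMatrix objects and the parameter values are placeholders; newer XGBoost releases also accept custom_metric in place of feval:

import xgboost as xgb

bst = xgb.train(
    {'max_depth': 2, 'eta': 0.1, 'disable_default_eval_metric': 1},
    dtrain,
    num_boost_round=20,
    evals=[(dvalid, 'valid')],
    obj=logistic_regression,              # custom objective returning (grad, hess)
    feval=logistic_regression_evalerror,  # custom AUC metric defined above
    maximize=True,                        # AUC: larger is better
)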