Customized loss function - derivative -- confusion


I am confused by two different guidelines on using a customized loss function.

Suppose the predicted probability is 'p' = sigmoid(z).

  1. In, line#25 mentions that the gradient of the customized loss function should be taken w.r.t. 'z'.

  2. In, the gradient is taken w.r.t. 'p':
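
To make the difference between the two conventions concrete, here is a minimal sketch, assuming binary log loss as the example loss. The loss and the helper names `log_loss`, `grad_wrt_p`, and `grad_wrt_z` are illustrative and not taken from either source. By the chain rule, dL/dz = dL/dp * dp/dz with dp/dz = p * (1 - p), so the two gradients differ by exactly that factor:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(p, y):
    # Binary log loss, chosen here only as an example loss
    return -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def grad_wrt_p(p, y):
    # dL/dp for binary log loss
    return (p - y) / (p * (1.0 - p))

def grad_wrt_z(z, y):
    # Chain rule: dL/dz = dL/dp * dp/dz, where dp/dz = p * (1 - p).
    # For log loss this simplifies to p - y.
    p = sigmoid(z)
    return grad_wrt_p(p, y) * p * (1.0 - p)

z = np.array([-2.0, 0.5, 3.0])
y = np.array([0.0, 1.0, 1.0])

# Central finite difference of the loss with respect to z, as a check
eps = 1e-6
numeric = (log_loss(sigmoid(z + eps), y) - log_loss(sigmoid(z - eps), y)) / (2 * eps)

print(grad_wrt_z(z, y))
print(numeric)
```

Which of the two a library expects depends on whether its objective function receives the raw score 'z' or the transformed probability 'p'.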

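import numpy as np
import xgboost as xgb

def gradient(predt: np.ndarray, dtrain: xgb.DMatrix) -> np.ndarray:
    '''Compute the gradient squared log error.'''
    y = dtrain.get_label()
    # Derivative of 0.5 * (log1p(predt) - log1p(y))**2 with respect to predt
    return (np.log1p(predt) - np.log1p(y)) / (predt + 1)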
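
As a sanity check on what this snippet differentiates with respect to, the small sketch below compares it against a central finite difference taken directly in `predt`. It assumes the squared log error is defined as 0.5 * (log1p(predt) - log1p(y))**2 and that no sigmoid is applied to `predt`. The names `sle` and `sle_gradient` and the sample values are made up for illustration:

```python
import numpy as np

def sle(predt, y):
    # Squared log error, assumed to be 0.5 * (log1p(predt) - log1p(y))**2
    return 0.5 * (np.log1p(predt) - np.log1p(y)) ** 2

def sle_gradient(predt, y):
    # Same formula as the snippet above, taking y as an array directly
    return (np.log1p(predt) - np.log1p(y)) / (predt + 1)

predt = np.array([0.5, 2.0, 10.0])
y = np.array([1.0, 2.0, 3.0])

# Central finite difference of the loss with respect to predt itself
eps = 1e-6
numeric_sle = (sle(predt + eps, y) - sle(predt - eps, y)) / (2 * eps)

print(sle_gradient(predt, y))
print(numeric_sle)
```

If the two agree, the formula is the derivative with respect to `predt` as it is passed in, so whether that corresponds to 'z' or 'p' comes down to what the library hands the function as `predt`.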

Which approach is correct? Please let me know.