In the tutorial below, the Hessian used for the custom multi:softprob objective seems to be doubled compared to the theoretical value.
The diagonal of the Hessian should theoretically be p * (1 - p), but the tutorial uses 2 * p * (1 - p).
https://xgboost.readthedocs.io/en/stable/tutorials/custom_metric_obj.html
Could you share the theoretical background for this factor of 2?
(I derived h = p * (1 - p) from the softmax cross-entropy objective, differentiating twice with respect to each class's raw score.)
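
For reference, here is a minimal sketch of how I checked my derivation numerically. It is not taken from the tutorial: the function names, the example scores `z`, and the label are all made up for illustration. It compares the analytic diagonal p * (1 - p) against a central finite-difference estimate of the second derivative of the cross-entropy loss with respect to each raw score.

```python
import math

def softmax(z):
    # Subtract the max for numerical stability.
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def cross_entropy(z, label):
    # Loss for a single sample with raw scores z and true class `label`.
    return -math.log(softmax(z)[label])

def analytic_grad_hess(z, label):
    # Gradient p_k - y_k and diagonal Hessian p_k * (1 - p_k),
    # as derived from the softmax cross-entropy objective.
    p = softmax(z)
    grad = [p[k] - (1.0 if k == label else 0.0) for k in range(len(z))]
    hess = [p[k] * (1.0 - p[k]) for k in range(len(z))]
    return grad, hess

def numeric_diag_hess(z, label, eps=1e-4):
    # Central second difference along each raw score.
    base = cross_entropy(z, label)
    out = []
    for k in range(len(z)):
        zp = list(z)
        zm = list(z)
        zp[k] += eps
        zm[k] -= eps
        second = cross_entropy(zp, label) - 2.0 * base + cross_entropy(zm, label)
        out.append(second / eps ** 2)
    return out

# Hypothetical example: three classes, true class is index 2.
z = [0.5, -1.2, 2.0]
label = 2
grad, hess = analytic_grad_hess(z, label)
num = numeric_diag_hess(z, label)

for k in range(len(z)):
    print(k, hess[k], num[k], 2.0 * hess[k])
```

In this check the finite-difference values agree with p * (1 - p) and not with 2 * p * (1 - p). That only confirms the exact second derivative; it does not say why the tutorial uses the doubled value, which is what I would like to understand.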