I am considering using xgboost for policy gradient learning. In policy gradient, the negative log-likelihood of an action gets weighted by the reward, which can be negative. Does xgboost care whether instance weights are positive or negative?

# Does xgboost support *negative* instance weights?

No, instance weights must always be positive.

Thanks! To give more context, I am implementing my own custom softmax loss like here:

https://github.com/dmlc/xgboost/blob/master/demo/guide-python/custom_softmax.py

So I should be free to use the (negative) instance weights however I like in my custom loss function, provided my custom loss is the only code that looks at the instance weights.
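To make the idea concrete, here is a minimal sketch of what I have in mind. The function below is a pure-NumPy helper (names and constants are my own); in an actual custom objective you would pull `labels` and `weights` from the DMatrix via `dtrain.get_label()` and `dtrain.get_weight()`, as in the custom_softmax.py demo linked above:

```python
import numpy as np

def weighted_softmax_grad_hess(predt, labels, weights):
    """Gradient and Hessian of softmax cross-entropy, scaled by signed
    per-instance weights (e.g. policy-gradient rewards).

    Sketch only: predt is an (n_rows, n_classes) array of raw scores;
    labels is an int array of class indices; weights may be negative.
    """
    rows = np.arange(labels.size)
    # Row-wise softmax, shifted by the row max for numerical stability
    e = np.exp(predt - predt.max(axis=1, keepdims=True))
    p = e / e.sum(axis=1, keepdims=True)
    grad = p.copy()
    grad[rows, labels] -= 1.0                 # dL/dscore for cross-entropy
    hess = np.maximum(2.0 * p * (1.0 - p), 1e-6)
    # Signed reward weighting: a negative weight flips the gradient
    # and also makes the Hessian negative.
    grad *= weights[:, None]
    hess *= weights[:, None]
    return grad, hess
```

Note that a negative weight makes the Hessian negative as well, which is exactly the case I'm asking about.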

Are the instance weights accessed in other parts of the xgboost training code?

Thanks again

Probably not. However, XGBoost has not been tested with negative weights, so proceed at your own risk.

Hi,

We (in physics) have been using xgboost with negative weights (a few examples have negative weights for technical reasons), without problems so far…

except that I just see that with 1.3.0 (which I now get from pip install), an explicit fatal error is now emitted when there are negative weights:

https://github.com/dmlc/xgboost/blob/release_1.3.0/src/data/data.cc#L363

This was introduced by this pull request:

https://github.com/dmlc/xgboost/pull/6115

Was there a strong reason to do this?

Thanks

David

@dhrou Yes. When we regularize the tree model (to avoid overfitting), we use the hyperparameter `min_child_weight` to ensure that the sum of the Hessian values for the data points in each tree node is not too small. A negative instance weight will result in a negative Hessian, thus interfering with this hyperparameter.
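To make the interaction concrete, here is a toy calculation (the numbers are made up for illustration): `min_child_weight` gates splits on the sum of Hessians in each child node, and a single negative-weight instance can pull that sum below the threshold.

```python
# Toy illustration (made-up numbers): how a negative instance weight
# undermines min_child_weight, which requires the sum of Hessians in
# each child node to stay above a threshold.
hessians = [0.9, 0.8, 1.1]           # per-instance Hessians, all positive
min_child_weight = 1.0

node_sum = sum(hessians)             # 2.8 >= 1.0 -> node is acceptable
assert node_sum >= min_child_weight

# Reweight the last instance by -2 (e.g. a negative reward):
reweighted = [0.9, 0.8, -2 * 1.1]
node_sum = sum(reweighted)           # -0.5 < 1.0 -> node is rejected,
assert node_sum < min_child_weight   # even though it holds 3 data points
```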

There are also other parts of XGBoost that assume positive instance weights. For example, all data points with negative Hessian values are ignored by the XGBoost algorithm.

In other words, negative Hessian values are now exclusively used to indicate an exception that would cause a particular data point to be ignored. This is probably not the behavior you intended when using negative weights.
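One possible workaround, if you still want reward-weighted gradients: apply the sign of the weight only to the gradient and keep the Hessian strictly positive, so no data point is silently dropped. This is not an official XGBoost recipe, just a sketch of the idea:

```python
import numpy as np

def signed_grad_positive_hess(grad, hess, weights):
    """Workaround sketch (not an official XGBoost recipe): carry the
    sign of the per-instance weight in the gradient only, and keep the
    Hessian strictly positive so XGBoost does not ignore the point."""
    grad = grad * weights[:, None]          # sign carries the reward
    hess = hess * np.abs(weights)[:, None]  # magnitude only, sign dropped
    return grad, np.maximum(hess, 1e-6)     # floor keeps the Hessian > 0
```

Whether the resulting second-order approximation is still meaningful for your policy-gradient objective is a separate question, but at least `min_child_weight` and the negative-Hessian checks behave as intended.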