I am considering using xgboost for policy gradient learning, and in policy gradient, the negative log-likelihood of an action gets weighted by the reward which can be negative. Does xgboost care if instance weights are positive or negative?
No, negative weights are not supported: instance weights must always be positive.
Thanks! To give more context, I am implementing my own custom softmax loss like here:
So I should be free to utilize the (negative) instance weights however I like in my custom loss function if my custom loss is the only code that looks at the instance weights.
Are the instance weights accessed in other parts of the xgboost training code?
Probably not. However, XGBoost has not been tested with negative weights, so proceed at your own risk.
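For anyone trying this, here is a minimal sketch of a reward-weighted softmax objective. It makes a few assumptions that are not from this thread: the signed rewards are kept in a closure instead of the DMatrix weight field, so the result does not depend on how XGBoost itself treats weights, and the Hessian is scaled by the absolute reward, because a negative Hessian would make the Newton-style leaf update ill-behaved. The names `reward_weighted_softmax_grad_hess` and `make_objective` are hypothetical helpers, not part of XGBoost.

```python
import numpy as np


def softmax(logits):
    # Subtract the row-wise max for numerical stability.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def reward_weighted_softmax_grad_hess(logits, actions, rewards):
    """Gradient and diagonal Hessian of sum_i r_i * -log softmax(logits_i)[a_i].

    logits:  (n, k) raw scores, one column per action
    actions: (n,) integer index of the action taken
    rewards: (n,) signed rewards, may be negative
    """
    p = softmax(logits)
    onehot = np.eye(logits.shape[1])[actions]
    # The gradient keeps the sign of the reward.
    grad = rewards[:, None] * (p - onehot)
    # Assumption: use |reward| and a small floor so the Hessian stays positive.
    hess = np.abs(rewards)[:, None] * np.maximum(p * (1.0 - p), 1e-6)
    return grad, hess


def make_objective(actions, rewards):
    # Hypothetical wrapper with the (preds, dtrain) signature of a custom objective.
    def objective(preds, dtrain):
        logits = np.asarray(preds).reshape(len(actions), -1)
        # Some XGBoost versions may expect flattened outputs here.
        return reward_weighted_softmax_grad_hess(logits, actions, rewards)
    return objective
```

The result of `make_objective(actions, rewards)` would then be passed as the `obj` argument of `xgb.train`, with `num_class` set in the parameters. Whether taking the absolute value in the Hessian is appropriate for a given policy gradient setup is a judgment call and untested here.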