For those familiar with Tweedie-distributed random variables, in particular with a small mean for the Poisson component of the model as in typical insurance loss applications: I’m curious about your thoughts on the efficacy of using weight in the xgb.DMatrix to increase the importance of ‘positive’ observations (those with an observed value > 0), similar to how one would use scale_pos_weight for a binary problem.
I did try it already, but I find that the scale of the prediction output seems to be affected in proportion to the weights I am supplying: I get much larger predictions when I scale up the weight for the observations with a loss. Before I get into the weeds of debugging or correcting for this, I want to see whether the approach even seems sensible from a prediction-improvement perspective. I’ve had strong improvements in the past when scaling the weights in binary classification problems.
Here is what I did for the weights:
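library(dplyr)    # if_else()
library(xgboost)  # xgb.DMatrix()
# Ratio of zero-claim rows to positive-claim rows
w <- sum(train$claim_amount == 0) / sum(train$claim_amount > 0)
# Positive claims get weight w; zero claims get 1/w
weights <- if_else(train$claim_amount > 0, w, 1 / w)
train_dm_wgt <- xgb.DMatrix(data = train_matrix, label = train$claim_amount,
                            weight = weights)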
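For what it's worth, here is a back-of-the-envelope sketch of why the prediction scale could shift with these weights. It assumes a log-link Tweedie objective with variance power 1 < p < 2 and looks only at a constant (intercept-only) prediction μ, so it is a simplification and not a description of what the full tree ensemble does. The symbols n_0, n_+ (counts of zero and positive rows), n and ȳ (unweighted mean claim) are notation introduced here for illustration.

```latex
% Weighted Tweedie negative log-likelihood, dropping terms that do not depend on \mu
% (assumption: constant prediction \mu, variance power 1 < p < 2)
\mathcal{L}(\mu) = \sum_i w_i \left[ -\frac{y_i\,\mu^{1-p}}{1-p} + \frac{\mu^{2-p}}{2-p} \right]

% First-order condition
\frac{d\mathcal{L}}{d\mu} = \sum_i w_i \left[ -y_i\,\mu^{-p} + \mu^{1-p} \right] = 0
\quad\Longrightarrow\quad
\hat{\mu} = \frac{\sum_i w_i\,y_i}{\sum_i w_i}

% Plug in the weights above: w_i = w for y_i > 0, w_i = 1/w for y_i = 0, with w = n_0 / n_+
\sum_i w_i = w\,n_+ + \frac{n_0}{w} = n_0 + n_+ = n,
\qquad
\sum_i w_i\,y_i = w \sum_i y_i

\hat{\mu} = \frac{w \sum_i y_i}{n} = w\,\bar{y}
```

Under these assumptions the fitted mean is the weighted average of the labels, so this particular weighting would inflate a constant prediction by roughly a factor of w. That would be consistent with the larger predictions I am seeing, although the effect inside individual tree leaves need not match this exactly.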