I have trained a multiclass classification model using xgb.train. The model has 4 categories (0 to 3). I am trying to create several variable importance plots (VIPs), each covering a group of categories. For example, I would like one VIP for categories 0 and 1, and another for categories 2 and 3. The xgb.importance documentation describes the trees argument as follows:

(only for the gbtree booster) an integer vector of tree indices that should be included into the importance calculation. If set to NULL, all trees of the model are parsed. It could be useful, e.g., in multiclass classification to get feature importances for each class separately. IMPORTANT: the tree index in xgboost models is zero-based (e.g., use trees = 0:4 for first 5 trees).

Further down in the xgb.importance documentation, there is a code example:

    # multiclass classification using gbtree:
    nclass <- 3
    nrounds <- 10
    mbst <- xgboost(data = as.matrix(iris[, -5]), label = as.numeric(iris$Species) - 1,
                    max_depth = 3, eta = 0.2, nthread = 2, nrounds = nrounds,
                    objective = "multi:softprob", num_class = nclass)

    # all classes clumped together:
    xgb.importance(model = mbst)

    # inspect importances separately for each class:
    xgb.importance(model = mbst, trees = seq(from=0, by=nclass, length.out=nrounds))
    xgb.importance(model = mbst, trees = seq(from=1, by=nclass, length.out=nrounds))
    xgb.importance(model = mbst, trees = seq(from=2, by=nclass, length.out=nrounds))
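
If I read the seq(from = k, by = nclass, ...) calls in that example correctly, they imply that trees are stored round by round with one tree per class, so tree i belongs to class i %% nclass. Under that assumption, the indices for a group of classes could be sketched in base R like this. The helper name trees_for_classes is hypothetical and not part of xgboost, and nrounds = 10 is only a placeholder value.

```r
# Hypothetical helper (not part of xgboost): zero-based tree indices for a
# set of classes, ASSUMING tree i belongs to class i %% nclass.
trees_for_classes <- function(classes, nclass, nrounds) {
  idx <- unlist(lapply(classes, function(k) {
    seq(from = k, by = nclass, length.out = nrounds)
  }))
  as.integer(sort(idx))  # xgb.importance expects an integer vector
}

nclass <- 4    # categories 0 to 3
nrounds <- 10  # placeholder; use the number of rounds actually trained

trees_01 <- trees_for_classes(c(0, 1), nclass, nrounds)  # classes 0 and 1
trees_23 <- trees_for_classes(c(2, 3), nclass, nrounds)  # classes 2 and 3

head(trees_01)  # 0 1 4 5 8 9
```

The resulting vectors could then presumably be passed as trees = trees_01 and trees = trees_23 to xgb.importance.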

When I try running something like this:

All_1 <- xgb.importance(model = best.model, trees = 0:3)

Versus this:

All_2 <- xgb.importance(model = best.model)

I get two completely different sets of importances. Shouldn't they be the same?
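
Going by the documentation's note that trees = 0:4 selects the first 5 trees, trees = 0:3 may be selecting only the first 4 trees rather than all 4 classes. A base-R sketch of the difference in how many trees each call would cover, assuming one tree per class per round and a placeholder nrounds of 10:

```r
nclass <- 4
nrounds <- 10  # placeholder; use the number of rounds actually trained

first_four <- 0:3                       # what trees = 0:3 would select
all_trees  <- 0:(nclass * nrounds - 1)  # zero-based indices of every tree

length(first_four)  # 4
length(all_trees)   # 40
```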

I have also tried using:

All_3 <- xgb.importance(model = best.model, trees = seq(from=0, to=3, length.out=nrounds))

But I receive the following error: Invalid cast, from Number to Integer
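
The error seems consistent with seq(from = 0, to = 3, length.out = nrounds) producing fractional values whenever nrounds is not 4, so that trees is no longer a vector of whole numbers. A quick base-R check, with nrounds = 10 as an assumed value:

```r
nrounds <- 10  # assumed value, for illustration only

# Ten evenly spaced points between 0 and 3, so the step is 1/3
idx <- seq(from = 0, to = 3, length.out = nrounds)

all(idx == round(idx))  # FALSE: most values are not whole numbers
```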

Can anyone advise me on the best/correct way to create VIPs by categorical groups using xgb.importance?

Thanks!