Base_margin feature not working when running Cross Validation

Firstly, I see this topic was previously raised and closed in GitHub pull request #2006, but I seem to be having the same issue.

I cannot get the base_margin passed through during cross-validation. It works with xgb.train using the exact same code (save for the nfold argument), but the base_margin is not passed through to the cross-validation run.

I have provided some code below that should hopefully reproduce the issue. For reference, I am using xgboost version 0.90.0.2 via R:

#load xgboost
library(xgboost)
 
#define number of classes, features and label
num_class <- 3
data <- as.matrix(iris[, -5])
label <- as.numeric(iris$Species) - 1

#create xgb.DMatrix
xgbMat <- xgb.DMatrix(data = data, label = label)
watchlist <- list(train = xgbMat)

#run initial boost
bst <- xgb.train(data = xgbMat,
                 max_depth = 4, eta = 0.4, nrounds = 50,
                 objective = "multi:softprob", eval_metric = "mlogloss",
                 watchlist = watchlist,
                 num_class = num_class, print_every_n = 10)

That achieves the following multiclass log-loss metrics:

[1]  train-mlogloss:0.636670 
[11] train-mlogloss:0.039032 
[21] train-mlogloss:0.022351 
[31] train-mlogloss:0.018615 
[41] train-mlogloss:0.016476 
[50] train-mlogloss:0.015429

So the final training log-loss after 50 rounds is 0.015429.

Now I predict from this boost and use the margin output as the base margin in a new data matrix:

#output margin from initial boost
pred <- predict(bst, xgbMat, outputmargin = TRUE)

#create xgb.DMatrix with base margin from initial boost
xgbMatPred <- xgb.DMatrix(data = data, label = label, base_margin = pred)
watchlist <- list(train = xgbMatPred)
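As an aside, with multi:softprob the margin prediction comes back as one value per row per class, flattened so that each row's num_class margins sit next to each other. I'm assuming that layout here (the warm-started run below appears to bear it out):

#quick check on the margin layout: 150 rows x 3 classes = 450 values
length(pred)
#view the first few rows' per-class margins
head(matrix(pred, ncol = num_class, byrow = TRUE), 3)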

Training using the new matrix with the base margin, you can see it picks up where the previous boost finished in terms of log-loss:

#boost from initial prediction
bst <- xgb.train(data = xgbMatPred,
                 max_depth = 4, eta = 0.4, nrounds = 50,
                 objective = "multi:softprob", eval_metric = "mlogloss",
                 watchlist = watchlist,
                 num_class = num_class, print_every_n = 10)

[1]  train-mlogloss:0.015345 
[11] train-mlogloss:0.014510 
[21] train-mlogloss:0.013976 
[31] train-mlogloss:0.013575 
[41] train-mlogloss:0.013296 
[50] train-mlogloss:0.013115
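Since boosting is additive and these settings are deterministic (no subsampling), warm-starting from the stored margin for 50 more rounds should behave essentially the same as training 100 rounds from scratch. A quick way to double-check that the base margin really is being used, under that assumption:

#sanity check: 100 rounds from scratch should end up close to the 0.013115 above
bst100 <- xgb.train(data = xgbMat,
                    max_depth = 4, eta = 0.4, nrounds = 100,
                    objective = "multi:softprob", eval_metric = "mlogloss",
                    watchlist = list(train = xgbMat),
                    num_class = num_class, print_every_n = 10)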

However, when I use the same new matrix and parameters in a cross-validation, it seems to ignore the base margin and just starts again from scratch:

#attempt CV from initial prediction
set.seed(1)
bstCV <- xgb.cv(data = xgbMatPred,
                nfold = 5,
                max_depth = 4, eta = 0.4, nrounds = 50,
                objective = "multi:softprob", eval_metric = "mlogloss",
                # watchlist = watchlist,
                showsd = FALSE,
                num_class = num_class, print_every_n = 10)

[1]  train-mlogloss:0.642223|test-mlogloss:0.660639
[11] train-mlogloss:0.043816|test-mlogloss:0.158496 
[21] train-mlogloss:0.024552|test-mlogloss:0.182579 
[31] train-mlogloss:0.020419|test-mlogloss:0.199851 
[41] train-mlogloss:0.018195|test-mlogloss:0.211206 
[50] train-mlogloss:0.017033|test-mlogloss:0.217913

The log-loss in round 1 doesn't carry on from the initial boost; rather, it resets as though no base margin had been supplied at all.

I've tried numerous alternatives (removing the watchlist parameter, switching between multi:softprob and multi:softmax, trying different eval_metrics, and uninstalling/hard-deleting versions of xgboost). Nothing seems to work.
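For what it's worth, the only way I've found to get a CV-style estimate with the base margin respected is to skip xgb.cv and run the folds by hand, slicing the flattened margin along with the rows so that each fold's DMatrix gets its own base_margin. A rough sketch (it assumes the row-major margin layout noted above; margin_rows and fold_mlogloss are just names I made up, and the evaluation_log column name may differ between versions):

#manual CV workaround: give each fold's DMatrix its own slice of the margin
set.seed(1)
nfold <- 5
folds <- sample(rep(1:nfold, length.out = nrow(data)))

#indices into the flattened margin vector for a given set of data rows
margin_rows <- function(rows) {
  as.vector(vapply(rows, function(i) ((i - 1) * num_class + 1):(i * num_class),
                   numeric(num_class)))
}

fold_mlogloss <- sapply(1:nfold, function(k) {
  tr <- which(folds != k)
  te <- which(folds == k)

  dtrain <- xgb.DMatrix(data = data[tr, ], label = label[tr],
                        base_margin = pred[margin_rows(tr)])
  dtest  <- xgb.DMatrix(data = data[te, ], label = label[te],
                        base_margin = pred[margin_rows(te)])

  foldBst <- xgb.train(data = dtrain,
                       max_depth = 4, eta = 0.4, nrounds = 50,
                       objective = "multi:softprob", eval_metric = "mlogloss",
                       watchlist = list(test = dtest),
                       num_class = num_class, verbose = 0)

  #last recorded test mlogloss for this fold
  tail(foldBst$evaluation_log$test_mlogloss, 1)
})

mean(fold_mlogloss)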

Can anyone confirm whether the fix suggested in #2006 made it into the v0.90 CRAN release? And can anyone suggest a proper fix for this?

Kind regards,