R XGBoost predict result differs from result using xgb.model.dt.tree

Hi,

After training an R xgboost model as described below, I would like to calculate the probability
prediction by hand using the trees output by xgb.model.dt.tree().

For a test row, I thought that the correct calculation would use the leaves from all 4 trees as shown here:

    Tree Node  ID Feature    Split  Yes   No Missing      Quality    Cover
 1:    0    0 0-0      V8 0.012865  0-1  0-2     0-1 20.127027500 61.50000
 2:    0    1 0-1    Leaf       NA <NA> <NA>    <NA>  0.009677419 30.00000
 3:    0    2 0-2    Leaf       NA <NA> <NA>    <NA>  0.350769252 31.50000
 4:    1    0 1-0     V15 0.625835  1-1  1-2     1-1 19.353305800 60.54989
 5:    1    1 1-1    Leaf       NA <NA> <NA>    <NA> -0.034775745 30.22977
 6:    1    2 1-2    Leaf       NA <NA> <NA>    <NA>  0.300693214 30.32012
 7:    2    0 2-0    Leaf       NA <NA> <NA>    <NA>  0.098971337 59.27218
 8:    3    0 3-0    Leaf       NA <NA> <NA>    <NA>  0.071213789 58.25556
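For reference, a dump like the one above can be produced directly from the booster; a minimal sketch, assuming xgbModel is the trained model from this thread:

    library(xgboost)
    # One row per node; leaf rows carry the leaf value in the Quality column.
    tree_dt <- xgb.model.dt.tree(model = xgbModel)
    print(tree_dt)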

I expected that the probability for the following input:

#       V8      V15
# -0.93597 -0.51685

would be the result of the following calculation:

logfun <- function(x){1/(1 + exp(x))}  # note: logfun(-x) is the standard sigmoid 1/(1 + exp(-x))
logfun(-sum(0.009677419, -0.034775745, 0.098971337, 0.071213789))
# 0.5362082

Here, each of the numbers is the leaf value from one of the 4 trees.

However, the predict() function produces the following:

predict(xgbModel,test_row_1)
#0.5184599

The leaf indices returned by predict(xgbModel,test_row_1,predleaf = T) are c(1,1,0), meaning that the last tree is not considered by the predict function.
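For completeness, here is one way to turn the predleaf output into a hand-computed probability (ignoring any global bias term discussed below); a minimal sketch, assuming the tree table and objects from above:

    library(xgboost)
    library(data.table)

    tree_dt  <- xgb.model.dt.tree(model = xgbModel)
    leaf_idx <- predict(xgbModel, test_row_1, predleaf = TRUE)  # one leaf index per tree used

    # Look up the leaf value (Quality) for each (tree, node) pair returned by predleaf.
    leaf_vals <- mapply(function(tree, node) {
      tree_dt[Tree == tree & Node == node, Quality]
    }, tree = seq_along(leaf_idx) - 1L, node = as.integer(leaf_idx))

    # binary:logistic: the probability is the sigmoid of the summed leaf values.
    1 / (1 + exp(-sum(leaf_vals)))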

Is my approach to hand-calculation correct? Should all 4 trees be considered when calculating the probability?

Thank you.

@dbolotov See the thread "XGBoost learning-to-rank model to predictions core function?". There is a global bias of 0.5 that gets added to every leaf output, so the prediction result would be

logfun(-sum(0.009677419,-0.034775745,0.098971337,0.071213789)+0.5)

You can remove this bias by setting base_score=0 when training.
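A minimal sketch of how that might look, reusing the training setup from this thread (only base_score matters here; the other settings are illustrative):

    params <- list(
      objective        = "binary:logistic",
      eval_metric      = "logloss",
      booster          = "gbtree",
      max_depth        = 6,
      eta              = 0.3,
      min_child_weight = 30,
      base_score       = 0   # drop the global bias so the leaf sums alone give the margin
    )
    xgbModel <- xgb.train(params = params, data = DTrain, nrounds = 4)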

@hcho3 Thanks for your response, but I’m not sure if that’s the solution.

I was trying to match the hand-calculation result to the result of predict.xgb.Booster(). After adding the bias, the output of the command logfun(-sum(0.009677419,-0.034775745,0.098971337,0.071213789)+0.5) is 0.4121915, but the output of predict.xgb.Booster() is 0.5184599.

I ran another data point through the model and got a similar result: predict() seems to use only 3 of the 4 trees shown by xgb.model.dt.tree().

If predict() is called with ntreelimit set to 4 (the number of trees printed by xgb.model.dt.tree), then the result of the hand calculation does match predict().
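A minimal sketch of that call, using the objects from this thread:

    # Force predict() to evaluate all 4 boosting rounds instead of stopping early.
    predict(xgbModel, test_row_1, ntreelimit = 4)
    # expected to match the hand calculation above: 0.5362082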

Why does predict not use all 4 trees by default?

Thank you.

@dbolotov Are you using the DART booster? If so, you will need to specify ntreelimit to use all trees. See https://xgboost.readthedocs.io/en/latest/tutorials/dart.html

No, I'm using the default booster = "gbtree". Here is the model object:

##### xgb.Booster
raw: 1.5 Kb 
call:
  xgb.train(params = params, data = DTrain, nrounds = 4, watchlist = list(train = DTrain, 
    test = DDev), verbose = 1, early_stopping_rounds = 3, maximize = F, 
    objective = "binary:logistic", eval_metric = "logloss", booster = "gbtree")
params (as set within xgb.train):
  max_depth = "6", eta = "0.3", min_child_weight = "30", objective = "binary:logistic", eval_metric = "logloss", booster = "gbtree", silent = "1"
xgb.attributes:
  best_iteration, best_msg, best_ntreelimit, best_score, niter
callbacks:
  cb.print.evaluation(period = print_every_n)
  cb.evaluation.log()
  cb.early.stop(stopping_rounds = early_stopping_rounds, maximize = maximize, 
    verbose = verbose)
# of features: 32 
niter: 4
best_iteration : 3 
best_ntreelimit : 3 
best_score : 0.638522 
nfeatures : 32 
evaluation_log:
 iter train_logloss test_logloss
    1      0.646767     0.662551
    2      0.613465     0.639944
    3      0.606638     0.638522
    4      0.603164     0.638943

Since you used early stopping, predict() by default evaluates only the first three trees, because best_ntreelimit is 3.
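A minimal sketch of how to verify and override this, assuming the fields shown in the model printout above:

    # Early stopping records the best round on the booster object.
    xgbModel$best_ntreelimit   # 3
    xgbModel$niter             # 4

    predict(xgbModel, test_row_1)                               # uses the first best_ntreelimit trees
    predict(xgbModel, test_row_1, ntreelimit = xgbModel$niter)  # uses all trained trees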

Ok I understand, thanks!