Confusion about xgboost sklearn api plot_tree()

OnePetrichor · March 15, 2019, 5:27am

I am trying to train a model on the following small sample.
X = [[1, 2],
     [1, 2],
     [2, 2]]
y = [5, 5, 8]

I am using the code below to train on this sample.

from xgboost import XGBRegressor, plot_tree
reg = XGBRegressor(max_depth=2, learning_rate=1.0, n_estimators=2, silent=False, objective='reg:linear')
reg.fit(X,y)
plot_tree(reg,num_trees = 0)

Then I got the two trees below.

I got confused when I tried a test sample X_test=[1,2], because the prediction does not match the sum of the leaf values:

reg.predict(X_test)  # prints score = 4.875

Score computed by hand from the tree structure above: 0.25 + 4.125 = 4.375

Why do the two differ? Am I doing something wrong?

hcho3 · March 15, 2019, 5:37pm

There is a global bias of 0.5 (the base_score parameter) that gets added once to the summed leaf outputs, so the prediction is 0.25 + 4.125 + 0.5 = 4.875. You can remove this bias by setting base_score=0 when training.
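
To make the arithmetic concrete, here is a minimal sketch in plain Python. It does not call xgboost: predict_from_leaves is a hypothetical helper written for illustration, and the leaf values are the ones read off the two trees in the question.

```python
# Illustrative sketch only: reproduces the arithmetic by hand, not xgboost internals.
# predict_from_leaves is a hypothetical helper, not part of the xgboost API.

def predict_from_leaves(leaf_values, base_score=0.5):
    """Add the global bias once to the sum of one leaf value per tree."""
    return base_score + sum(leaf_values)

# Leaf values that X_test=[1, 2] lands in, as read from the two plotted trees.
leaf_values = [0.25, 4.125]

print(predict_from_leaves(leaf_values))                  # 4.875, matches reg.predict
print(predict_from_leaves(leaf_values, base_score=0.0))  # 4.375, the hand-computed sum
```

Note that the second call only reproduces the hand-computed sum. A model actually retrained with base_score=0 would grow different trees, so its leaf values would not be the ones shown here.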

OnePetrichor · March 18, 2019, 4:33am

Thanks for your answer!