How to get base_score from trained booster

I am using model slices as shown here. However, the example hardcodes base_score to 0.5 for training, so the value is already known at prediction time. I have a library metric that does not know the base_score used during training; the metric only receives the trained booster. Using both the scikit-learn interface and the native xgb interface, I cannot find a way to get the base_score that was calculated or used during training. model.get_xgb_params() returns base_score: None, which in this case is what was passed in for training.
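
For reference, a minimal sketch of the symptom (assuming a plain regression fit with no base_score passed):

import xgboost as xgb
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=100, random_state=0)
model = xgb.XGBRegressor().fit(X, y)

print(model.get_xgb_params()["base_score"])  # None
print(model.base_score)                      # None as well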

Is there a method to get, from a saved model, the base_score that was calculated or used during training?

Thanks

Hi, I have exactly the same problem, and I am starting to doubt whether this calculation is actually performed.
The XGBoost documentation states that the so-called base_score

is automatically estimated for selected objectives before training. To disable the estimation, specify a real number argument.

This suggests that, for a regression task, it computes an average tied to the particular loss function, e.g. the mean for the RMSE objective. However, as you wrote, the extracted value of base_score is always 0.5. I’ve checked this with the current version, 1.7.6.

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

import json
import xgboost as xgb

print(xgb.__version__)
# 1.7.6

# From: https://github.com/dmlc/xgboost/blob/master/python-package/xgboost/testing/updater.py
def get_basescore(model: xgb.XGBModel) -> float:
    """Get base score from an XGBoost sklearn estimator."""
    base_score = float(
        json.loads(model.get_booster().save_config())["learner"]["learner_model_param"][
            "base_score"
        ]
    )
    return base_score

# Preparing data
X, y = make_regression(n_samples=200)
X_train, X_test, y_train, y_test = train_test_split(X, y)

# Training a model
xgb_reg = xgb.XGBRegressor()
xgb_reg.fit(X_train, y_train)

# It seems it always returns... 0.5,
# at least as long as we don't set a custom value manually,
# e.g. with xgb.XGBRegressor(base_score=np.mean(y_train))
print(get_basescore(xgb_reg))  # 0.5
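
As a sanity check, the value is recoverable when we set it explicitly (a sketch continuing the snippet above; it needs an extra numpy import):

import numpy as np

# Workaround: pass the intercept explicitly so it is stored in the model config
xgb_reg2 = xgb.XGBRegressor(base_score=float(np.mean(y_train)))
xgb_reg2.fit(X_train, y_train)

print(get_basescore(xgb_reg2))  # the mean of y_train, not 0.5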

How can we explain this (apparent?) discrepancy between the docs and the way it actually works?

I had the same question, and created an issue on the repo here.

Hi, the feature you are referring to was recently added in the master branch and is not yet available in 1.7.x.
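
For what it’s worth, here is a sketch of what I would expect once that feature is available (assuming XGBoost >= 2.0, where the automatic intercept estimation lands; the exact version cutoff is my assumption):

import json

import xgboost as xgb
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=200, random_state=0)
y = y + 100.0  # shift the target so the estimated intercept is clearly not 0.5

reg = xgb.XGBRegressor().fit(X, y)
config = json.loads(reg.get_booster().save_config())

# With automatic estimation, this should track the target mean for squared error
print(float(config["learner"]["learner_model_param"]["base_score"]))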

I’m having a similar issue: when training an XGBRegressor, I see that all predictions are off by exactly 0.5 from the summed leaf values, yet .base_score is None. I’m using 1.7.3. What should I do?
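
One way to confirm that the mystery 0.5 is the stored base_score: a sketch with a single shallow tree, so the arithmetic is easy to inspect (my own repro, not from the post above):

import numpy as np
import xgboost as xgb
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=200, random_state=0)

# A single depth-1 tree: the prediction is one leaf value plus base_score
reg = xgb.XGBRegressor(n_estimators=1, max_depth=1).fit(X, y)
booster = reg.get_booster()

# Which leaf does the first row land in?
leaf_idx = int(np.ravel(booster.predict(xgb.DMatrix(X[:1]), pred_leaf=True))[0])

# Look up that leaf's value; for leaf rows, the Gain column holds the leaf weight
df = booster.trees_to_dataframe()
leaf_value = float(df.loc[df["ID"] == f"0-{leaf_idx}", "Gain"].iloc[0])

print(reg.predict(X[:1])[0])  # model prediction
print(leaf_value + 0.5)       # leaf value + base_score: the two match

Until a release with the estimation is available, the value can be read from save_config() as in get_basescore above, or set explicitly at training time so downstream code knows it.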