Hi there, I have a question about the SHAP explanation of XGBoost classifiers. What does the value of `explainer.expected_value` mean? In particular, why, after the sigmoid transformation, is it not the same as `y_train.mean()`? Many thanks!
Below is a summary of the code; the full code is available here: https://github.com/MenaWANG/ML_toy_examples/blob/main/explain%20models/shap_XGB_classification.ipynb
```python
model = xgb.XGBClassifier()
model.fit(X_train, y_train)

explainer = shap.Explainer(model)
shap_test = explainer(X_test)
shap_df = pd.DataFrame(shap_test.values)

# For each case, summing the SHAP values across all features plus the
# expected (base) value gives the margin for that case, which can then be
# transformed into the predicted probability:
np.isclose(model.predict(X_test, output_margin=True),
           explainer.expected_value + shap_df.sum(axis=1))  # all True
```

But why isn't the line below true? What is the meaning of `explainer.expected_value` for XGBoost classifiers? Thx again!

```python
expit(explainer.expected_value) == y_train.mean()  # False
```
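In case it helps pinpoint my confusion: I suspect part of the answer is that the sigmoid is nonlinear, so the sigmoid of an average margin is not the average of the per-case probabilities (Jensen's inequality). Here is a small self-contained sketch with made-up margin values (purely hypothetical numbers, just to illustrate the gap I'm asking about):

```python
import numpy as np
from scipy.special import expit  # the logistic sigmoid

# Hypothetical per-case margins (log-odds), not from any real model
margins = np.array([-2.0, 0.0, 3.0])

# Sigmoid of the mean margin...
p_of_mean = expit(margins.mean())

# ...is not the mean of the per-case sigmoids (Jensen's inequality)
mean_of_p = expit(margins).mean()

print(p_of_mean, mean_of_p)  # the two values differ
```

So even if `explainer.expected_value` were exactly the mean training margin, `expit()` of it would not need to equal the mean predicted probability (and hence `y_train.mean()`). I'd still like to understand what `expected_value` actually represents here.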