Multi-class classification weighting for unbalanced datasets

Hi there,

I’ve read through the docs and forums and just wanted to confirm the following:

Is it true that XGBoost does not support class or sample weights in the XGBClassifier.fit() function (Python API), in the way scikit-learn and CatBoost do?

And is that the case despite the presence of the weight parameter, which yields the message "Parameters: { weight } might not be used."?

Thanks,

Amadeus

You can pass the sample_weight parameter to fit(): https://xgboost.readthedocs.io/en/latest/python/python_api.html#xgboost.XGBRegressor.fit

This also brings up the message "Parameters: { sample_weights } might not be used."

weight and sample_weight yield the same performance metrics as not having either.

From which we can deduce that they are all ignored?

Make sure to pass sample_weight to fit(), not the XGBClassifier constructor.

Hi Philip, thanks for the response.

I’m using GridSearchCV, so I tried moving sample_weight into the param_grid parameter, and still no luck.

Here’s the code. Note that I am passing both weight and sample_weight in, with neither having any effect:

gsc = GridSearchCV(
    estimator=XGBClassifier(random_state=42, weight=weights),
    param_grid={"sample_weights": [sample_weights] },
    cv=5,
    scoring="f1_weighted",
    verbose=1,
    n_jobs=-1,
    refit=True,
)

grid_result = gsc.fit(X_train, y_train)

Not sure if you can use GridSearchCV with sample weights enabled.

Try using sample_weight (without the s) in param_grid.


Thanks Philip - good spot.

Unfortunately that doesn’t help with GridSearchCV. However, I have managed to implement cross-validation with sample weights using sklearn.model_selection.cross_validate:

from sklearn.model_selection import cross_validate
from xgboost import XGBClassifier
clf = XGBClassifier(random_state=42)
cross_validate(clf, X_train, y_train, scoring="f1_weighted", fit_params={"sample_weight": sample_weights})

Thanks for your help.