Incremental Training with new features and labels

Hello,
I am designing a model that incrementally learns to classify new traffic patterns from a stream of observability data from my toy cluster. When testing XGBoost training continuation with code similar to the snippet below, I get an error caused by the newly introduced features.

So my question is: why can't XGBoost support incremental training with new features and labels? I also saw previous discussions (issue #3055) saying XGBoost doesn't guarantee performance with training continuation.
Is it because gradient boosting fits residuals, so training continuation won't update the existing trees?

Thank you.

import xgboost as xgb
from sklearn.datasets import load_wine

a = load_wine()
# data and data2 contain disjoint feature subsets (columns 0-5 vs. 6-12).
data = xgb.DMatrix(a['data'][:, 0:6], a['target'])
data2 = xgb.DMatrix(a['data'][:, 6:], a['target'])

param = {"max_depth": 2, "eta": 1, "objective": "multi:softmax", "num_class": 3}
num_round = 2

bst = xgb.train(param, data, num_boost_round=num_round)
bst2 = xgb.train(param, data2, num_boost_round=1, xgb_model=bst)  # error raised at this line
../src/learner.cc:1506: Check failed: learner_model_param_.num_feature == p_fmat->Info().num_col_ (6 vs. 7) : Number of columns does not match number of features in booster.
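
For reference, the feature counts indeed differ, which you can confirm directly from the booster and the new DMatrix:

print(bst.num_features())  # 6: the booster was trained on 6 columns
print(data2.num_col())     # 7: the new matrix has 7 columns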

To do incremental training, you always need the same number of features/columns in the training data.

Incremental training is about new observations/rows, not new features. So this modification of your code should fix the issue:

data = xgb.DMatrix(a['data'][:6, :], a['target'][:6])   # data and data2 now have different rows,
data2 = xgb.DMatrix(a['data'][6:, :], a['target'][6:])  # but the same 13 feature columns.
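
For completeness, here is a fully runnable version of this row-wise continuation; the split at row 100 is arbitrary and just for illustration:

import xgboost as xgb
from sklearn.datasets import load_wine

a = load_wine()

# Both batches keep all 13 feature columns; only the rows differ.
data = xgb.DMatrix(a['data'][:100, :], a['target'][:100])
data2 = xgb.DMatrix(a['data'][100:, :], a['target'][100:])

param = {"max_depth": 2, "eta": 1, "objective": "multi:softmax", "num_class": 3}

bst = xgb.train(param, data, num_boost_round=2)                   # initial model
bst2 = xgb.train(param, data2, num_boost_round=1, xgb_model=bst)  # continue training
print(bst2.num_boosted_rounds())  # 3: the continued booster keeps the original rounds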

Hello,

The task I need to solve is to incrementally train a model with new features and labels, because ideally we don't want to retrain the complete model every time we see a batch of new labels. I think this is a different task from the incremental training you referred to. Do you know if XGBoost can do that?

For now we are considering a bagging-of-models approach (training a new XGBoost model for each batch of newly observed traffic patterns) and concatenating the predictions together, but we don't know if that is the right way.
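
In case it helps to make the idea concrete, here is a rough sketch of what we mean; the column split and the majority vote at the end are placeholders for our real combination step, not production code:

import numpy as np
import xgboost as xgb
from sklearn.datasets import load_wine

a = load_wine()
param = {"max_depth": 2, "eta": 1, "objective": "multi:softmax", "num_class": 3}

# One booster per feature batch (columns 0:6 and 6:13, purely as an example).
batches = [a['data'][:, 0:6], a['data'][:, 6:]]
models = [xgb.train(param, xgb.DMatrix(X, a['target']), num_boost_round=2)
          for X in batches]

# "Concatenate the predictions": each model predicts from its own feature subset.
preds = np.column_stack([m.predict(xgb.DMatrix(X)) for m, X in zip(models, batches)])

# Placeholder combination step: per-row majority vote over the model outputs.
combined = np.apply_along_axis(lambda r: np.bincount(r.astype(int)).argmax(), 1, preds)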

Thank you so much for your insight!

What is usually meant by incremental training is using the xgb_model= parameter, which requires the training data to keep the same shape (same columns). I am not aware of any other possibility.

It is difficult to judge whether your bagging approach will work (it might); it depends on how the data is modelled. One idea: try converting the new features into additional samples, e.g. by adding a feature-id column and unpivoting the new dataset, as sketched below. But that changes the training data shape, and I do not know whether it is possible in your case.
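
A minimal illustration of that unpivoting idea with pandas (a sketch only: you would still have to encode feature_id numerically and decide what label each long-format row carries):

import pandas as pd
from sklearn.datasets import load_wine

a = load_wine()

# The "new" features as a wide table: one row per sample, one column per feature.
wide = pd.DataFrame(a['data'][:, 6:], columns=a['feature_names'][6:])
wide['sample_id'] = wide.index

# Unpivot ("melt"): each (sample, feature) pair becomes its own row,
# with the feature name carried in an explicit feature-id column.
long = wide.melt(id_vars='sample_id', var_name='feature_id', value_name='value')

# The long format always has exactly three columns, no matter how many
# new features arrive later, so the training-data shape stays fixed.
print(long.shape)  # (n_samples * n_new_features, 3)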

Anyway good luck!