get_num_boosting_rounds() and the number of extracted trees disagree


#1

I have an XGBClassifier model saved with pickle. I set n_estimators = 200 for training, and the training was done batch by batch via training continuation. After loading the saved model, model.get_num_boosting_rounds() returns 200. However, when I extract the trees from the booster, I get 800 trees. This is the code I used to count them:

    model = xgb_model.get_booster()
    dump_list = model.get_dump()
    num_trees = len(dump_list)

I also used model.save_raw() to create a memory buffer, and reading the trees back from it gives 800 as well. As far as I know, the number of rounds should equal the number of trees, since I am doing binary classification, so I am really confused.


#2

Did you set num_parallel_tree?


#3

No, I didn’t. Is it because I trained on 4 batches? So the total number of trees would be 4*200?


#4

What do you mean by 4 batches?


#5

I split the data into 4 batches. After training on one batch, I saved the model, then loaded it and continued training the saved model on the next batch's data.


#6

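In that case, yes. Each continued run adds another 200 trees on top of the ones already in the saved model, so after 4 batches you’d have 800 trees.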
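
For anyone landing here later, the arithmetic from this thread can be sketched in a few lines of plain Python. This is only an illustration: `expected_tree_count` is a hypothetical helper, not an xgboost function, and it assumes every continued run adds the full n_estimators rounds (no early stopping) and that binary classification builds one tree per round.

```python
def expected_tree_count(n_estimators, fit_calls, num_parallel_tree=1, trees_per_round=1):
    """Estimate how many trees a booster holds after repeated training continuation.

    Hypothetical helper for illustration only; it is not part of xgboost.
    Assumes each continued fit appends n_estimators rounds to the loaded model.
    """
    total_rounds = n_estimators * fit_calls
    # trees_per_round is 1 for binary classification; multiclass models
    # build one tree per class in each round.
    return total_rounds * num_parallel_tree * trees_per_round


# The setup described in this thread: 200 rounds per batch, 4 batches,
# num_parallel_tree left at its default of 1, binary classification.
total = expected_tree_count(n_estimators=200, fit_calls=4)
print(total)  # 800
```

The 200 reported by get_num_boosting_rounds() is the n_estimators setting of the wrapper, while len(get_dump()) counts every tree stored in the booster, which is why the two numbers differ after continued training.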