Serialisation of large models

I am using xgboost.Booster().save_model('model.json') for serialisation, and I am looking for a proper way to serialise large models.

The model I train is quite large: during GPU training, memory consumption on the GPU side is roughly 20 GB.

I have never managed to dump such a model, because the memory footprint of the serialisation grows well beyond those 20 GB. In other words, training succeeds, but serialisation fails.
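
For reference, here is a minimal sketch of the kind of script I run; the data path and hyperparameters below are placeholders rather than the actual ones:

```python
import xgboost as xgb

# Placeholder training data; the real dataset is much larger.
dtrain = xgb.DMatrix('train.libsvm')

params = {
    'tree_method': 'gpu_hist',  # GPU training; roughly 20 GB of GPU memory is used
    'max_depth': 12,            # hypothetical hyperparameters
    'eta': 0.1,
}

# Training itself finishes without any problem.
bst = xgb.train(params, dtrain, num_boost_round=1000)

# The serialisation step is where host memory grows past the 64 GB of DRAM
# and the process gets killed.
bst.save_model('model.json')
```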

The host has 64 GB of DRAM. Since the serialisation after training cannot complete within that, the process is killed and I see “Killed.” in the Ubuntu terminal during the serialisation.

Currently, I am trying to circumvent the problem by increasing the swap space: I have just raised it to 128 GB to see how it responds. I hope this workaround holds for the moment.

But is there a better way to keep host-side memory consumption low during serialisation?

Please file a bug report at https://github.com/dmlc/xgboost. Make sure to post your script (with training hyperparameters).

Hi, I am using an older version of XGBoost. I am currently reading through the v1.2.0 changes (https://github.com/dmlc/xgboost/blob/master/NEWS.md) before updating.

With this older version, the memory explosion and the “Killed.” message during save_model('model.json') were circumvented by enlarging the swap size to 128 GB, but the saved model could not be loaded for use afterwards.
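
For completeness, the load attempt that failed afterwards was nothing more than the standard call, roughly:

```python
import xgboost as xgb

# Loading the JSON model that was saved after enlarging the swap space;
# this is the step that did not produce a usable model.
bst = xgb.Booster()
bst.load_model('model.json')
```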

I will come back if the problem persists with the latest XGBoost release.