I am using xgboost.Booster().save_model('model.json') for serialisation.
I am looking for a proper way to serialise large models.
The model that I train is quite large.
During training on the GPU, memory consumption on the GPU side is about 20GB.
I have never succeeded in saving such a model,
because the memory footprint of the serialisation
keeps growing beyond that 20GB.
In other words, training works without problems, but serialisation fails.
The host has 64GB of DRAM.
After training is done, the serialisation step never completes:
I see “Killed.” in the Ubuntu terminal during the serialisation.
Currently, I am trying to circumvent the problem by increasing the swap space;
I have just increased it to 128GB to see whether that helps.
I hope this will serve as a workaround for the moment.
But is there a better way to keep host-side memory consumption
low during serialisation?
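
For anyone trying to reproduce this, here is a minimal sketch of how the host-side peak around the save call could be measured. It assumes Linux or macOS and uses only the Python standard library. The helpers peak_rss_mib and measure_peak_growth are hypothetical names, not part of XGBoost, and booster in the comment stands for the already trained xgboost.Booster. Note that ru_maxrss only counts resident memory, so pages pushed out to swap are not included.

```python
import resource
import sys


def peak_rss_mib() -> float:
    """Return the peak resident set size of this process in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Linux reports ru_maxrss in KiB, macOS reports it in bytes.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def measure_peak_growth(fn, *args, **kwargs):
    """Run fn and return (result, growth of the peak RSS in MiB).

    The peak is a high-water mark, so the growth is never negative;
    it is zero when fn stays below a peak reached earlier in the process.
    """
    before = peak_rss_mib()
    result = fn(*args, **kwargs)
    after = peak_rss_mib()
    return result, after - before


# Intended use with the trained model (assumption: `booster` is the
# xgboost.Booster obtained from training):
#
#   _, grew = measure_peak_growth(booster.save_model, "model.json")
#   print(f"peak host memory grew by {grew:.0f} MiB during save_model")
```

Because the value is a process-wide high-water mark, the number is only meaningful if host memory during training stayed below what the save step needs; otherwise the growth is reported as zero.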