Loading old xgboost model into new version

tristan · January 21, 2021, 1:31am

I have an xgboost model trained in version 0.90 that I would like to migrate to 1.2.1. Specifically, I have a Booster object in Python. The documentation (https://xgboost.readthedocs.io/en/latest/tutorials/saving_model.html) suggests two ways to update the version:

1. Call booster.save_model in version 0.90, and then call booster.load_model in version 1.2.1.
2. Use the script https://github.com/dmlc/xgboost/blob/master/doc/python/convert_090to100.py, which will translate a pickle of an XGBClassifier object.

Neither of these solutions works. To reproduce the problem, I trained a new XGBClassifier on synthetic data in version 0.90, and tried to translate it to version 1.2.1.

In an environment using version 0.90:

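import pickle

import numpy as np
import pandas as pd
import xgboost as xgb

print(xgb.__version__)

# Synthetic data: one random feature, balanced binary labels
test_df = pd.DataFrame({'feature': np.random.rand(100), 'label': [0] * 50 + [1] * 50})
clf = xgb.XGBClassifier()
clf.fit(test_df[['feature']], test_df['label'])

# Input for the second method: a pickle of the XGBClassifier
with open("test_model.pkl", "wb") as f:
    pickle.dump(clf, f)

# Input for the first method: the Booster saved with save_model
booster = clf.get_booster()
booster.save_model('test_model.json')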

I copied convert_090to100.py to the local directory and ran this in bash using the same environment:

python convert_090to100.py --old-pickle test_model.pkl

Then, switching to an environment with xgboost version 1.2.1:

import xgboost as xgb
import pickle
print(xgb.__version__)
booster = xgb.Booster()
booster.load_model('test_model.json')

with open('xgboost_native_model_from_test_model.pkl-0.bin','rb') as f:
    clf = pickle.load(f)

Error traceback for the first method:

---------------------------------------------------------------------------
XGBoostError                              Traceback (most recent call last)
<ipython-input-2-dff543b11bf4> in <module>
      3 print(xgb.__version__)
      4 booster = xgb.Booster()
----> 5 booster.load_model('test_model.json')

~/anaconda3/envs/python3/lib/python3.6/site-packages/xgboost/core.py in load_model(self, fname)
   1602             # from URL.
   1603             _check_call(_LIB.XGBoosterLoadModel(
-> 1604                 self.handle, c_str(os_fspath(fname))))
   1605         elif isinstance(fname, bytearray):
   1606             buf = fname

~/anaconda3/envs/python3/lib/python3.6/site-packages/xgboost/core.py in _check_call(ret)
    186     """
    187     if ret != 0:
--> 188         raise XGBoostError(py_str(_LIB.XGBGetLastError()))
    189 
    190 

XGBoostError: [01:28:36] ../src/c_api/c_api.cc:612: Check failed: str[0] == '{' (

Error traceback for the second method:

---------------------------------------------------------------------------
UnpicklingError                           Traceback (most recent call last)
<ipython-input-3-d1b28b8b2924> in <module>
      1 with open('xgboost_native_model_from_test_model.pkl-0.bin','rb') as f:
----> 2     clf = pickle.load(f)

UnpicklingError: invalid load key, '\x00'.

Is it a problem with xgboost, or am I doing something wrong?

hcho3 · January 21, 2021, 2:13am

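In the first method, you should not use the file extension .json: 0.90 doesn’t support the JSON format, so save_model writes the binary format regardless of the file name, while load_model in 1.2.1 sees the .json extension and tries to parse the file as JSON. Instead, in the 0.90 env, use: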

booster.save_model('test_model.bin')

Then, from the 1.2.1 env, use:

booster = xgb.Booster()
booster.load_model('test_model.bin')
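
The failing check in the traceback (str[0] == '{') suggests that 1.2.1 expects a file named .json to start with a JSON object. As a rough sketch, a small stdlib-only helper can flag a file whose extension and content disagree before you try to load it. The names sniff_model_format and extension_matches_content are hypothetical, not part of xgboost, and the first-byte test is only a heuristic.

```python
import os


def sniff_model_format(path):
    """Guess the on-disk format of a saved model file (hypothetical helper).

    Returns "json" if the first byte is '{', otherwise "binary".
    This is a heuristic only; it does not validate the file.
    """
    with open(path, "rb") as f:
        first_byte = f.read(1)
    return "json" if first_byte == b"{" else "binary"


def extension_matches_content(path):
    """Return True if a .json extension is used exactly when the file looks like JSON."""
    extension = os.path.splitext(path)[1].lower()
    looks_like_json = sniff_model_format(path) == "json"
    return (extension == ".json") == looks_like_json
```

For the test_model.json written by 0.90 in the question, extension_matches_content would be expected to return False, since the file holds binary data under a .json name. A similar mismatch may explain the second traceback: the converter's output file name contains native_model and ends in .bin, so it is presumably meant for booster.load_model rather than pickle.load.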

tristan · January 21, 2021, 5:58pm

Thank you @hcho3, this resolves the problem. I have also confirmed that the booster in version 1.2.1 produces the same predictions as the one in 0.90. This was a big help!
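
For anyone wanting to repeat this check, here is a minimal sketch. It assumes each environment dumps its predictions on the same input rows to a JSON file, for example via json.dump(booster.predict(dmatrix).tolist(), f), where dmatrix is a DMatrix built from identical data. The helper names and the file layout are illustrative assumptions, not something from the xgboost docs. Since the repro data uses np.random.rand without a seed, the input would need to be saved once and reused in both environments.

```python
import json
import math


def load_predictions(path):
    """Read a JSON list of predictions written by one environment (assumed layout)."""
    with open(path) as f:
        return [float(value) for value in json.load(f)]


def predictions_match(old_predictions, new_predictions, abs_tol=1e-6):
    """Return True if both lists have equal length and agree element-wise within abs_tol."""
    if len(old_predictions) != len(new_predictions):
        return False
    return all(
        math.isclose(old, new, rel_tol=0.0, abs_tol=abs_tol)
        for old, new in zip(old_predictions, new_predictions)
    )
```

The tolerance of 1e-6 is an arbitrary choice for illustration; tighten or loosen it depending on how strict the comparison needs to be.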

anurag961 · April 8, 2022, 11:51am

Thanks a lot @hcho3. Worked for me as well.