Scala-trained booster loaded into Python gives different predictions

We trained our XGBoost model with the Scala API and saved it. We then loaded it with the Python API and scored the exact same record, but the two predicted probabilities differ widely: Scala gives 0.005228, while Python gives 0.01544636. I thought it was caused by an index difference (Python starts from 0 while Scala starts from 1), so I inserted an empty column at the beginning of my input data and scored again, but the results still do not match. Has anyone else had this issue? Can someone help take a look? Thanks in advance, and I appreciate your response!

I saw a similar thread on GitHub:

Best,
Wei

Does your data have missing values? If so, how are they represented?

No missing values. Numerical missing values were already imputed as -999999. Categorical missing values were first imputed as “999999”, then passed through StringIndexer and OneHotEncoder. The data preparation pipeline is in Scala; only the final prepared data was converted to a Python DMatrix.

Thanks Philip! You are right about the missing value representation. I manually scored a record, and the difference is mainly caused by the Scala API and the Python API interpreting missing values differently. Still investigating…
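
For anyone following along, here is a minimal sketch of why the interpretation of missing values can change a score. It is not XGBoost's code and not our actual model: the one-split tree, its leaf values, and the helper names (`Split`, `Leaf`, `score`) are all hypothetical. It only relies on the fact that each XGBoost split sends a present value left when it is below the threshold and sends a missing value down a learned default branch. Which API ends up treating 0 as missing in a given pipeline is an assumption you would need to check on your own data.

```python
import math
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Leaf:
    value: float  # raw margin contributed by this leaf


@dataclass
class Split:
    feature: int          # column index tested at this node
    threshold: float      # present values below this go left
    default_left: bool    # branch taken when the value is missing
    left: Union["Split", Leaf]
    right: Union["Split", Leaf]


def score(node: Union[Split, Leaf], row: List[float], missing: float = math.nan) -> float:
    """Walk one tree. NaN, or any entry equal to `missing`, follows the default branch."""
    while isinstance(node, Split):
        x = row[node.feature]
        is_missing = math.isnan(x) or (not math.isnan(missing) and x == missing)
        if is_missing:
            go_left = node.default_left
        else:
            go_left = x < node.threshold
        node = node.left if go_left else node.right
    return node.value


def sigmoid(margin: float) -> float:
    return 1.0 / (1.0 + math.exp(-margin))


# Hypothetical tree: a single split on a one-hot column (feature 0).
tree = Split(feature=0, threshold=0.5, default_left=False,
             left=Leaf(-5.0), right=Leaf(-4.0))

row = [0.0]  # the one-hot column is "off" for this record

# Zero treated as a real value: 0.0 < 0.5, so the record lands in the left leaf.
p_zero_is_value = sigmoid(score(tree, row))

# Zero treated as missing: the record follows the default branch (right leaf).
p_zero_is_missing = sigmoid(score(tree, row, missing=0.0))

print(p_zero_is_value, p_zero_is_missing)
```

The same record lands in different leaves depending only on whether 0 counts as a value or as missing. With one-hot encoded columns, which are mostly zeros, this can happen at many splits across many trees.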

Best,
Wei

Yes, missing values can cause a lot of headaches. There is a proposed tutorial to clarify how to handle missing values: https://github.com/dmlc/xgboost/pull/4425. Can you look at it and see if it helps? Feel free to leave feedback on the proposed tutorial.