Model compatibility: XGBoost4J-Spark and XGBoost Python


#1

I would like to know whether the following is possible:

Can I train my model using the scikit-learn interface of XGBoost --> save the model --> load the saved model in Spark --> predict using XGBoost in MLlib?

I can think of one obvious issue: the data is represented differently in Spark (the vector-assembled format). Is there a way to overcome this?


#2

Handling missing values can be quite tricky. We have a tutorial: https://xgboost.readthedocs.io/en/latest/jvm/xgboost4j_spark_tutorial.html#dealing-with-missing-values
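
To make the concern concrete, here is a toy sketch in plain Python. It calls neither XGBoost nor Spark, and the helper names `densify` and `present_features` are made up for illustration. It assumes, as the tutorial describes, that the same row can reach the model either with absent sparse entries read as 0.0 or with them read as missing (NaN), so the `missing` setting has to match between training and prediction.

```python
import math

def densify(size, indices, values, fill):
    """Expand a sparse (indices, values) pair into a dense list,
    writing `fill` into every slot that is not stored."""
    dense = [fill] * size
    for i, v in zip(indices, values):
        dense[i] = v
    return dense

def present_features(row, missing):
    """Toy model of a `missing` marker: return the feature indices
    that would be seen as present. NaN is always treated as missing."""
    return [
        i for i, v in enumerate(row)
        if not math.isnan(v) and (math.isnan(missing) or v != missing)
    ]

# Hypothetical row with 4 features; only features 1 and 3 are stored.
size, indices, values = 4, [1, 3], [2.5, 7.0]

as_zeros = densify(size, indices, values, 0.0)          # absent -> 0.0
as_nan = densify(size, indices, values, float("nan"))   # absent -> NaN

# Same row, three different views of which features are present.
print(present_features(as_zeros, float("nan")))  # zeros count as real values
print(present_features(as_zeros, 0.0))           # zeros count as missing
print(present_features(as_nan, float("nan")))    # NaN counts as missing
```

If the model was trained under one of these views and predictions are made under another, rows can take different branches in the trees, so it is worth checking that the feature representation and the `missing` value agree on both sides.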