[jvm-packages] xgboost 0.82 spark transform problems

Ethos · April 12, 2019, 3:16pm

Hello,

I switched from xgboost 0.81 to 0.82 on Spark. I am running a transform on a trained XGBoostClassificationModel (binary classification). testDF is the output of a pipelineModel transform that runs just before the XGBoostClassificationModel transform.

Previously, in xgboost 0.81, I could run ‘xgboostModel.transform(testDF)’ on a non-persisted Dataset object and get correct results.
Whether I persisted testDF or not, I got an AUC of ~0.9.

Now, with xgboost 0.82:
If I persist testDF, I get an AUC of ~0.9.
If I do not persist testDF, I get an AUC of ~0.5 (my decile capture rate is flat).

Thoughts?

hcho3 · April 16, 2019, 8:42pm

@CodingCat Do we need to persist Dataset objects in order to make predictions?