Newer version has a worse result

I trained a model offline with the Python xgboost package, and online I load the model file with biz.k11i:xgboost-predictor:0.2.1 to make predictions.

The results look like this:

Is a version mismatch causing this?
Or is there some other reason?

PS: With version 0.62 of the Python package, every model I trained that had a good offline AUC also gave a good online CTR.

It’s hard to tell why, since you are getting a good AUC on the validation dataset. Can you raise the regularization parameters to improve generalization?
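As a rough sketch of what raising regularization could look like: the baseline values here are hypothetical, since the original training parameters were not posted, and the stronger values are only starting points to tune. The parameter names themselves (`lambda`, `alpha`, `gamma`, `min_child_weight`, `subsample`, `colsample_bytree`, `max_depth`) are standard xgboost tree booster parameters.

```python
# Hypothetical baseline; replace with the parameters actually used for training.
base_params = {
    "objective": "binary:logistic",
    "eval_metric": "auc",
    "max_depth": 6,
    "eta": 0.1,
}

# Same settings with stronger regularization. Every value below is an
# illustrative guess, not a recommendation; tune them on held-out data.
regularized_params = {
    **base_params,
    "max_depth": 4,           # shallower trees
    "min_child_weight": 5,    # require more hessian weight per leaf
    "gamma": 1.0,             # minimum loss reduction needed to split
    "lambda": 5.0,            # L2 penalty on leaf weights
    "alpha": 0.5,             # L1 penalty on leaf weights
    "subsample": 0.8,         # row subsampling per tree
    "colsample_bytree": 0.8,  # column subsampling per tree
}

print(regularized_params)
```

Either dictionary can be passed as the first argument to `xgboost.train`, so the two models can be compared on the same validation set before trying the more regularized one online.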

Also look at the model dumps. Sometimes looking at the actual splits can give useful insight.
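One way to get a quick overview of the splits is to count how often each feature is split on. A minimal sketch, assuming the text dump format in which split lines look like `0:[f2<2.45] yes=1,no=2,missing=1`; the helper name `count_splits` and the sample dump are made up for illustration, and in practice the list of per-tree strings would come from `Booster.get_dump()`.

```python
import re
from collections import Counter

# Matches the "[feature<threshold]" part of a split line in a text dump.
SPLIT_RE = re.compile(r"\[([^<\]]+)<([^\]]+)\]")


def count_splits(tree_dumps):
    """Count how many split nodes use each feature across all trees."""
    counts = Counter()
    for tree in tree_dumps:
        for line in tree.splitlines():
            match = SPLIT_RE.search(line)
            if match:  # leaf lines contain no "[...]" and are skipped
                counts[match.group(1)] += 1
    return counts


# Made-up two-tree dump, standing in for the output of bst.get_dump().
sample_dumps = [
    "0:[f2<2.45] yes=1,no=2,missing=1\n"
    "\t1:leaf=0.43\n"
    "\t2:[f0<5.5] yes=3,no=4,missing=3\n"
    "\t\t3:leaf=-0.2\n"
    "\t\t4:leaf=0.1\n",
    "0:[f2<1.7] yes=1,no=2,missing=1\n"
    "\t1:leaf=0.3\n"
    "\t2:leaf=-0.25\n",
]

split_counts = count_splits(sample_dumps)
for feature, n in split_counts.most_common():
    print(feature, n)
```

If one or two features dominate the counts, or the thresholds look implausible for the values seen online, that points to a feature that behaves differently between the offline data and the online traffic.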