XGBoost4J-Spark 0.82: is the result of XGBoostClassificationModel.transform wrong?

When I use XGBoost4J-Spark 0.82 on Spark 2.3.4 to train a model, training itself looks fine. But when I use the trained model to transform the input data (train, test, valid) and calculate the AUC score, the AUC is always around 0.5, which is very different from the train-auc, test-auc, and valid-auc values reported during training.
I spent several days checking my code but could not find a bug.

Today I changed only the xgboost4j-spark version to 0.90 and ran the same code. The result of XGBoostClassificationModel.transform now looks correct.

So I want to know: is there a known problem with the result of XGBoostClassificationModel.transform in XGBoost4J-Spark 0.82?

Training log with 0.82:
2020-08-22,16:07:35,928 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:07:35,927 INFO [8] train-auc:0.597014 valid-auc:0.554890 test-auc:0.587551
2020-08-22,16:08:02,810 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:08:02,810 INFO [9] train-auc:0.597493 valid-auc:0.554818 test-auc:0.587872
2020-08-22,16:08:13,370 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:08:13,370 INFO [10] train-auc:0.597904 valid-auc:0.555146 test-auc:0.587875
2020-08-22,16:08:23,823 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:08:23,823 INFO [11] train-auc:0.598719 valid-auc:0.555474 test-auc:0.588002
2020-08-22,16:08:35,661 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:08:35,661 INFO [12] train-auc:0.598844 valid-auc:0.555384 test-auc:0.587852
2020-08-22,16:08:48,956 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:08:48,956 INFO [13] train-auc:0.599195 valid-auc:0.555375 test-auc:0.588057
2020-08-22,16:09:02,634 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:09:02,634 INFO [14] train-auc:0.599650 valid-auc:0.555745 test-auc:0.588276
2020-08-22,16:09:14,602 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:09:14,602 INFO [15] train-auc:0.600395 valid-auc:0.555560 test-auc:0.588648
2020-08-22,16:09:25,110 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:09:25,110 INFO [16] train-auc:0.601176 valid-auc:0.555643 test-auc:0.588263
2020-08-22,16:09:39,970 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:09:39,969 INFO [17] train-auc:0.601873 valid-auc:0.555584 test-auc:0.588772
2020-08-22,16:09:53,999 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:09:53,998 INFO [18] train-auc:0.603026 valid-auc:0.555996 test-auc:0.588921
2020-08-22,16:10:07,873 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:10:07,873 INFO [19] train-auc:0.607575 valid-auc:0.555680 test-auc:0.593693
2020-08-22,16:10:22,549 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:10:22,549 INFO [20] train-auc:0.608668 valid-auc:0.555913 test-auc:0.593600
2020-08-22,16:10:41,115 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:10:41,115 INFO [21] train-auc:0.609931 valid-auc:0.556067 test-auc:0.593749
2020-08-22,16:10:58,577 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:10:58,577 INFO [22] train-auc:0.610266 valid-auc:0.556226 test-auc:0.593531
2020-08-22,16:11:15,634 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:11:15,633 INFO [23] train-auc:0.611028 valid-auc:0.556253 test-auc:0.593873
2020-08-22,16:11:32,015 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:11:32,015 INFO [24] train-auc:0.612297 valid-auc:0.555979 test-auc:0.594324
2020-08-22,16:11:44,885 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:11:44,885 INFO [25] train-auc:0.613781 valid-auc:0.556325 test-auc:0.594392
2020-08-22,16:11:58,745 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:11:58,744 INFO [26] train-auc:0.615992 valid-auc:0.557589 test-auc:0.596537
2020-08-22,16:12:12,745 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:12:12,745 INFO [27] train-auc:0.616854 valid-auc:0.557578 test-auc:0.596565
2020-08-22,16:12:21,790 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:12:21,789 INFO [28] train-auc:0.617657 valid-auc:0.557884 test-auc:0.597109
2020-08-22,16:12:34,421 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:12:34,421 INFO [29] train-auc:0.618575 valid-auc:0.557470 test-auc:0.597040
2020-08-22,16:12:48,071 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:12:48,071 INFO [30] train-auc:0.618673 valid-auc:0.557727 test-auc:0.597132
2020-08-22,16:13:05,914 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:13:05,914 INFO [31] train-auc:0.619851 valid-auc:0.557798 test-auc:0.596928
2020-08-22,16:13:20,954 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:13:20,954 INFO [32] train-auc:0.621318 valid-auc:0.559248 test-auc:0.597635
2020-08-22,16:13:39,769 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:13:39,769 INFO [33] train-auc:0.622631 valid-auc:0.559736 test-auc:0.597858
2020-08-22,16:13:57,424 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:13:57,424 INFO [34] train-auc:0.623035 valid-auc:0.559774 test-auc:0.598100
2020-08-22,16:14:10,510 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:14:10,510 INFO [35] train-auc:0.623462 valid-auc:0.560828 test-auc:0.598551
2020-08-22,16:14:26,956 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:14:26,955 INFO [36] train-auc:0.624325 valid-auc:0.561035 test-auc:0.597981
2020-08-22,16:14:38,494 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:14:38,494 INFO [37] train-auc:0.624980 valid-auc:0.559986 test-auc:0.597935
2020-08-22,16:14:58,080 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:14:58,080 INFO [38] train-auc:0.625153 valid-auc:0.559669 test-auc:0.598288
2020-08-22,16:15:15,193 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:15:15,193 INFO [39] train-auc:0.625808 valid-auc:0.559545 test-auc:0.598260
2020-08-22,16:15:30,699 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:15:30,699 INFO [40] train-auc:0.626553 valid-auc:0.560989 test-auc:0.598622
2020-08-22,16:15:43,830 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:15:43,830 INFO [41] train-auc:0.627961 valid-auc:0.562062 test-auc:0.598431
2020-08-22,16:15:56,020 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:15:56,020 INFO [42] train-auc:0.628303 valid-auc:0.562239 test-auc:0.598815
2020-08-22,16:16:10,023 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:16:10,023 INFO [43] train-auc:0.629166 valid-auc:0.562317 test-auc:0.598854
2020-08-22,16:16:22,944 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:16:22,944 INFO [44] train-auc:0.629351 valid-auc:0.562934 test-auc:0.598625
2020-08-22,16:16:38,829 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:16:38,829 INFO [45] train-auc:0.630158 valid-auc:0.562977 test-auc:0.598779
2020-08-22,16:16:56,884 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:16:56,883 INFO [46] train-auc:0.631138 valid-auc:0.562282 test-auc:0.599364
2020-08-22,16:17:07,620 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:17:07,620 INFO [47] train-auc:0.632674 valid-auc:0.562058 test-auc:0.598253
2020-08-22,16:17:16,831 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:17:16,831 INFO [48] train-auc:0.632971 valid-auc:0.562403 test-auc:0.598182
2020-08-22,16:17:30,046 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:17:30,046 INFO [49] train-auc:0.633431 valid-auc:0.562200 test-auc:0.598020
2020-08-22,16:17:41,057 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:17:41,056 INFO [50] train-auc:0.634718 valid-auc:0.562337 test-auc:0.597786
2020-08-22,16:17:54,468 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:17:54,468 INFO [51] train-auc:0.635966 valid-auc:0.562429 test-auc:0.597751
2020-08-22,16:18:05,339 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:18:05,339 INFO [52] train-auc:0.636038 valid-auc:0.562572 test-auc:0.597960
2020-08-22,16:18:15,777 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:18:15,777 INFO [53] train-auc:0.637040 valid-auc:0.562320 test-auc:0.598783
2020-08-22,16:18:32,575 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:18:32,575 INFO [54] train-auc:0.637844 valid-auc:0.561905 test-auc:0.598809
2020-08-22,16:18:51,784 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:18:51,784 INFO [55] train-auc:0.638427 valid-auc:0.561960 test-auc:0.598826
2020-08-22,16:19:04,392 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:19:04,391 INFO [56] train-auc:0.639144 valid-auc:0.559984 test-auc:0.599157
2020-08-22,16:19:15,682 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:19:15,682 INFO [57] train-auc:0.639670 valid-auc:0.559830 test-auc:0.599295
2020-08-22,16:19:28,662 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:19:28,661 INFO [58] train-auc:0.640365 valid-auc:0.559930 test-auc:0.599200
2020-08-22,16:19:41,533 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:19:41,533 INFO [59] train-auc:0.641447 valid-auc:0.558609 test-auc:0.598957
2020-08-22,16:19:58,479 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:19:58,478 INFO [60] train-auc:0.641954 valid-auc:0.558731 test-auc:0.598890
2020-08-22,16:20:10,898 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:20:10,897 INFO [61] train-auc:0.642610 valid-auc:0.558436 test-auc:0.598655
2020-08-22,16:20:24,811 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:20:24,811 INFO [62] train-auc:0.643498 valid-auc:0.558762 test-auc:0.598502
2020-08-22,16:20:42,101 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:20:42,101 INFO [63] train-auc:0.643565 valid-auc:0.559078 test-auc:0.598740
2020-08-22,16:20:57,203 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:20:57,203 INFO [64] train-auc:0.643753 valid-auc:0.559524 test-auc:0.599173
2020-08-22,16:21:09,430 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 16:21:09,430 INFO [65] train-auc:0.644741 valid-auc:0.559035 test-auc:0.598720

The AUC calculated from the trained model's transform output (0.82):

2020-08-22,16:22:24,623 INFO com.models.BatchCreditSTModel$: valid data auc score: 0.5004046649551552 .

Training log with 0.90:

2020-08-22,12:27:15,643 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:27:15,642 INFO [0] train-auc:0.587858 valid-auc:0.563347 test-auc:0.583881
2020-08-22,12:27:35,331 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:27:35,330 INFO [1] train-auc:0.589246 valid-auc:0.563601 test-auc:0.584211
2020-08-22,12:27:57,168 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:27:57,168 INFO [2] train-auc:0.589815 valid-auc:0.563614 test-auc:0.584256
2020-08-22,12:28:13,256 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:28:13,256 INFO [3] train-auc:0.589961 valid-auc:0.563595 test-auc:0.584159
2020-08-22,12:28:50,258 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:28:50,258 INFO [4] train-auc:0.590293 valid-auc:0.563452 test-auc:0.584064
2020-08-22,12:29:08,444 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:29:08,444 INFO [5] train-auc:0.590444 valid-auc:0.563586 test-auc:0.584085
2020-08-22,12:29:22,473 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:29:22,470 INFO [6] train-auc:0.596127 valid-auc:0.555383 test-auc:0.587150
2020-08-22,12:29:38,992 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:29:38,991 INFO [7] train-auc:0.596444 valid-auc:0.555119 test-auc:0.587142
2020-08-22,12:29:54,726 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:29:54,726 INFO [8] train-auc:0.597010 valid-auc:0.555094 test-auc:0.587388
2020-08-22,12:30:06,983 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:30:06,982 INFO [9] train-auc:0.597929 valid-auc:0.555530 test-auc:0.587864
2020-08-22,12:30:21,524 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:30:21,524 INFO [10] train-auc:0.598425 valid-auc:0.555429 test-auc:0.587706
2020-08-22,12:30:39,700 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:30:39,700 INFO [11] train-auc:0.598829 valid-auc:0.555660 test-auc:0.587842
2020-08-22,12:31:00,788 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:31:00,788 INFO [12] train-auc:0.598988 valid-auc:0.555196 test-auc:0.588081
2020-08-22,12:31:21,200 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:31:21,199 INFO [13] train-auc:0.603109 valid-auc:0.554359 test-auc:0.592457
2020-08-22,12:31:39,745 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:31:39,744 INFO [14] train-auc:0.603969 valid-auc:0.555163 test-auc:0.592843
2020-08-22,12:31:58,740 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:31:58,738 INFO [15] train-auc:0.606048 valid-auc:0.555981 test-auc:0.594441
2020-08-22,12:32:18,026 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:32:18,025 INFO [16] train-auc:0.606694 valid-auc:0.556703 test-auc:0.594126
2020-08-22,12:32:40,800 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:32:40,799 INFO [17] train-auc:0.607161 valid-auc:0.556776 test-auc:0.594243
2020-08-22,12:33:03,691 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:33:03,690 INFO [18] train-auc:0.608676 valid-auc:0.557109 test-auc:0.594217
2020-08-22,12:33:24,086 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:33:24,085 INFO [19] train-auc:0.609313 valid-auc:0.556555 test-auc:0.594426
2020-08-22,12:33:45,039 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:33:45,039 INFO [20] train-auc:0.610927 valid-auc:0.557013 test-auc:0.595739
2020-08-22,12:34:03,556 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:34:03,555 INFO [21] train-auc:0.611336 valid-auc:0.557409 test-auc:0.595687
2020-08-22,12:34:18,826 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:34:18,826 INFO [22] train-auc:0.613121 valid-auc:0.557211 test-auc:0.595803
2020-08-22,12:34:52,080 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:34:52,080 INFO [23] train-auc:0.615736 valid-auc:0.556396 test-auc:0.597413
2020-08-22,12:35:13,123 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:35:13,122 INFO [24] train-auc:0.616165 valid-auc:0.556764 test-auc:0.596999
2020-08-22,12:35:31,454 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:35:31,454 INFO [25] train-auc:0.617188 valid-auc:0.556757 test-auc:0.596608
2020-08-22,12:35:48,423 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:35:48,423 INFO [26] train-auc:0.618175 valid-auc:0.556591 test-auc:0.596192
2020-08-22,12:36:03,881 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:36:03,880 INFO [27] train-auc:0.618687 valid-auc:0.556462 test-auc:0.596145
2020-08-22,12:36:20,094 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:36:20,093 INFO [28] train-auc:0.619176 valid-auc:0.556370 test-auc:0.596455
2020-08-22,12:36:38,423 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:36:38,422 INFO [29] train-auc:0.620327 valid-auc:0.556236 test-auc:0.596416
2020-08-22,12:36:52,422 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:36:52,421 INFO [30] train-auc:0.620697 valid-auc:0.556320 test-auc:0.597037
2020-08-22,12:37:12,954 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:37:12,953 INFO [31] train-auc:0.621804 valid-auc:0.556194 test-auc:0.597037
2020-08-22,12:37:28,726 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:37:28,726 INFO [32] train-auc:0.622729 valid-auc:0.556171 test-auc:0.596716
2020-08-22,12:37:42,697 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:37:42,697 INFO [33] train-auc:0.623784 valid-auc:0.556914 test-auc:0.597087
2020-08-22,12:37:55,767 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:37:55,766 INFO [34] train-auc:0.624788 valid-auc:0.557169 test-auc:0.597749
2020-08-22,12:38:09,876 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:38:09,875 INFO [35] train-auc:0.625109 valid-auc:0.557100 test-auc:0.597662
2020-08-22,12:38:23,255 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:38:23,254 INFO [36] train-auc:0.625718 valid-auc:0.557292 test-auc:0.597851
2020-08-22,12:38:37,856 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:38:37,856 INFO [37] train-auc:0.626427 valid-auc:0.557288 test-auc:0.598060
2020-08-22,12:38:49,743 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:38:49,743 INFO [38] train-auc:0.627314 valid-auc:0.557346 test-auc:0.598508
2020-08-22,12:39:25,222 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:39:25,221 INFO [39] train-auc:0.628434 valid-auc:0.557226 test-auc:0.598252
2020-08-22,12:39:42,719 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:39:42,718 INFO [40] train-auc:0.629159 valid-auc:0.557345 test-auc:0.598550
2020-08-22,12:40:00,383 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:40:00,383 INFO [41] train-auc:0.629178 valid-auc:0.557428 test-auc:0.598627
2020-08-22,12:40:17,786 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:40:17,785 INFO [42] train-auc:0.629926 valid-auc:0.558127 test-auc:0.598527
2020-08-22,12:40:33,602 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:40:33,602 INFO [43] train-auc:0.630701 valid-auc:0.558428 test-auc:0.598760
2020-08-22,12:40:57,237 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:40:57,236 INFO [44] train-auc:0.631243 valid-auc:0.558598 test-auc:0.598665
2020-08-22,12:41:23,712 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:41:23,711 INFO [45] train-auc:0.632094 valid-auc:0.558344 test-auc:0.599255
2020-08-22,12:41:38,883 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:41:38,882 INFO [46] train-auc:0.632912 valid-auc:0.558280 test-auc:0.598531
2020-08-22,12:41:56,423 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:41:56,422 INFO [47] train-auc:0.633668 valid-auc:0.558462 test-auc:0.598541
2020-08-22,12:42:14,686 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:42:14,686 INFO [48] train-auc:0.634608 valid-auc:0.558700 test-auc:0.598264
2020-08-22,12:42:31,771 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:42:31,770 INFO [49] train-auc:0.634732 valid-auc:0.558871 test-auc:0.597919
2020-08-22,12:42:50,336 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:42:50,335 INFO [50] train-auc:0.635912 valid-auc:0.558630 test-auc:0.597653
2020-08-22,12:43:12,302 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:43:12,302 INFO [51] train-auc:0.636777 valid-auc:0.558455 test-auc:0.597235
2020-08-22,12:43:29,184 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:43:29,184 INFO [52] train-auc:0.637085 valid-auc:0.558815 test-auc:0.597618
2020-08-22,12:43:43,896 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:43:43,895 INFO [53] train-auc:0.637482 valid-auc:0.558817 test-auc:0.597862
2020-08-22,12:43:56,845 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:43:56,845 INFO [54] train-auc:0.638000 valid-auc:0.558959 test-auc:0.597780
2020-08-22,12:44:09,664 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:44:09,664 INFO [55] train-auc:0.638757 valid-auc:0.558441 test-auc:0.597527
2020-08-22,12:44:24,718 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:44:24,717 INFO [56] train-auc:0.639959 valid-auc:0.558509 test-auc:0.597613
2020-08-22,12:44:43,717 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:44:43,717 INFO [57] train-auc:0.640250 valid-auc:0.558603 test-auc:0.597962
2020-08-22,12:44:59,044 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:44:59,044 INFO [58] train-auc:0.640941 valid-auc:0.558649 test-auc:0.598030
2020-08-22,12:45:14,301 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:45:14,300 INFO [59] train-auc:0.641656 valid-auc:0.558439 test-auc:0.598226
2020-08-22,12:45:27,702 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:45:27,701 INFO [60] train-auc:0.642388 valid-auc:0.558560 test-auc:0.599216
2020-08-22,12:45:40,749 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:45:40,748 INFO [61] train-auc:0.642805 valid-auc:0.558470 test-auc:0.599429
2020-08-22,12:45:55,697 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:45:55,697 INFO [62] train-auc:0.643274 valid-auc:0.558567 test-auc:0.599465
2020-08-22,12:46:09,130 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:46:09,130 INFO [63] train-auc:0.643849 valid-auc:0.558751 test-auc:0.599461
2020-08-22,12:46:22,465 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:46:22,464 INFO [64] train-auc:0.644598 valid-auc:0.558694 test-auc:0.599202
2020-08-22,12:46:35,596 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:46:35,595 INFO [65] train-auc:0.645108 valid-auc:0.558755 test-auc:0.599554
2020-08-22,12:46:46,469 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:46:46,469 INFO [66] train-auc:0.645343 valid-auc:0.559053 test-auc:0.599696
2020-08-22,12:47:00,758 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:47:00,758 INFO [67] train-auc:0.645836 valid-auc:0.559232 test-auc:0.599676
2020-08-22,12:47:13,521 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:47:13,521 INFO [68] train-auc:0.646619 valid-auc:0.559341 test-auc:0.599275
2020-08-22,12:47:24,281 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:47:24,281 INFO [69] train-auc:0.646727 valid-auc:0.559788 test-auc:0.599396
2020-08-22,12:47:38,354 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:47:38,353 INFO [70] train-auc:0.647423 valid-auc:0.559911 test-auc:0.599252
2020-08-22,12:47:51,930 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:47:51,929 INFO [71] train-auc:0.648126 valid-auc:0.559802 test-auc:0.599190
2020-08-22,12:48:04,761 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:48:04,760 INFO [72] train-auc:0.648975 valid-auc:0.559846 test-auc:0.599191
2020-08-22,12:48:16,417 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:48:16,417 INFO [73] train-auc:0.649772 valid-auc:0.559761 test-auc:0.599150
2020-08-22,12:48:29,861 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:48:29,861 INFO [74] train-auc:0.650477 valid-auc:0.559977 test-auc:0.599354
2020-08-22,12:48:44,523 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:48:44,523 INFO [75] train-auc:0.651397 valid-auc:0.559924 test-auc:0.599570
2020-08-22,12:48:57,026 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:48:57,026 INFO [76] train-auc:0.652194 valid-auc:0.559815 test-auc:0.599463
2020-08-22,12:49:20,291 INFO ml.dmlc.xgboost4j.java.RabitTracker$TrackerProcessLogger: 2020-08-22 12:49:20,290 INFO [77] train-auc:0.652679 valid-auc:0.560032 test-auc:0.600516

The AUC calculated from the trained model's transform output (0.90):

2020-08-22,12:56:00,030 INFO com.models.BatchCreditSTModel$: valid data auc score: 0.5587335501761648 .
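
To compare the last AUC reported by the tracker with the AUC computed from the transform output, the tracker lines quoted above can be parsed mechanically. This is only a minimal sketch that assumes the log format shown in this post; `TrackerLog`, `parseLine`, and `lastRound` are hypothetical names, not part of XGBoost4J-Spark.

```scala
object TrackerLog {
  // Finds "[65] train-auc:0.644741 valid-auc:0.559035 ..." anywhere in a log line.
  private val RoundPattern = """\[(\d+)\]\s+(.*)$""".r.unanchored
  // Finds "name:value" pairs such as "valid-auc:0.559035".
  private val MetricPattern = """([\w-]+):([0-9.]+)""".r

  /** Returns (round, metric name -> value) for a tracker line, or None for any other line. */
  def parseLine(line: String): Option[(Int, Map[String, Double])] = line match {
    case RoundPattern(round, rest) =>
      val metrics = MetricPattern
        .findAllMatchIn(rest)
        .map(m => m.group(1) -> m.group(2).toDouble)
        .toMap
      Some(round.toInt -> metrics)
    case _ => None
  }

  /** Returns the metrics of the last boosting round found in the given log lines. */
  def lastRound(lines: Seq[String]): Option[(Int, Map[String, Double])] =
    lines.flatMap(parseLine).lastOption
}
```

Applying `TrackerLog.lastRound` to each pasted log gives the final `valid-auc` to set next to the "valid data auc score" line from the same run.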


I also saved the output of XGBoostClassificationModel.transform to HDFS and computed the AUC scores for both versions offline.

Version 0.82:
train:0.4993505 valid:0.5015606

Version 0.90:
train:0.6492953 valid:0.5587336
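
For anyone who wants to reproduce the offline numbers independently of Spark, here is a minimal sketch of an AUC computation using the rank-sum (Mann-Whitney U) formulation. It is not the `getAucScore` helper from my code, whose implementation is not shown; `AucCheck` is a hypothetical name, and labels are assumed to be 0.0 or 1.0.

```scala
object AucCheck {
  /**
   * AUC from (score, label) pairs using the rank-sum statistic.
   * Tied scores receive the average of the ranks they span.
   */
  def auc(scoreAndLabel: Seq[(Double, Double)]): Double = {
    val sorted = scoreAndLabel.sortBy(_._1).toArray
    val n = sorted.length
    val nPos = sorted.count(_._2 > 0.5).toDouble
    val nNeg = n - nPos
    require(nPos > 0 && nNeg > 0, "need at least one positive and one negative label")

    var rankSumPos = 0.0
    var i = 0
    while (i < n) {
      // Extend j to the end of the block of scores tied with sorted(i).
      var j = i
      while (j + 1 < n && sorted(j + 1)._1 == sorted(i)._1) j += 1
      // 1-based ranks in this block run from i + 1 to j + 1.
      val avgRank = (i + 1 + j + 1) / 2.0
      var k = i
      while (k <= j) {
        if (sorted(k)._2 > 0.5) rankSumPos += avgRank
        k += 1
      }
      i = j + 1
    }
    (rankSumPos - nPos * (nPos + 1) / 2.0) / (nPos * nNeg)
  }
}
```

This function returns exactly 0.5 when every score is identical, so one thing worth checking in the saved 0.82 output is whether the `prob` column has any spread at all.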


So I think something may be wrong here. I hope someone can reply as soon as possible. Thanks!

One last thing: when I use 0.90 to save the XGBoost model, I always get this error:
2020-08-22,12:57:43,545 ERROR org.apache.spark.deploy.yarn.ApplicationMaster: User class threw exception: java.lang.NoSuchMethodError: org.json4s.jackson.JsonMethods$.parse$default$3()Z
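
A `NoSuchMethodError` at runtime generally means the code was compiled against a different version of a library than the one loaded, so it may help to see which jar supplies `org.json4s.jackson.JsonMethods$` on the driver. The following is a small diagnostic sketch using only standard JVM reflection; `ClasspathCheck` and `locationOf` are hypothetical names, and whether a json4s mismatch is really the cause here is an assumption to verify.

```scala
object ClasspathCheck {
  /**
   * Returns the jar or directory a class was loaded from, if it can be determined.
   * Returns None when the class is missing or has no code source
   * (for example, classes loaded by the bootstrap class loader).
   */
  def locationOf(className: String): Option[String] =
    try {
      val cls = Class.forName(className)
      Option(cls.getProtectionDomain.getCodeSource)
        .flatMap(cs => Option(cs.getLocation))
        .map(_.toString)
    } catch {
      case _: ClassNotFoundException => None
      case _: NoClassDefFoundError   => None
    }
}

// Example call, e.g. logged from the driver before saving the model:
//   ClasspathCheck.locationOf("org.json4s.jackson.JsonMethods$")
```

If the reported jar comes from the Spark installation rather than from the application's dependencies, the json4s versions expected by Spark 2.3.4 and by xgboost4j-spark 0.90 would be the next thing to compare.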


Here is my training code. Thanks!

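// Imports this function relies on. The MutableHashMap alias is assumed; logger, getAucScore and
// getFeatureImportanceModelDesc are helpers defined elsewhere in the project.
import scala.collection.mutable.{HashMap => MutableHashMap}
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.linalg.Vector
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.expressions.UserDefinedFunction
import org.apache.spark.sql.functions.udf
import ml.dmlc.xgboost4j.scala.spark.{XGBoostClassificationModel, XGBoostClassifier}

def base_on_learn(trainRaw: DataFrame, usedFeats: Array[String], validRaw: DataFrame = null, argMap: MutableHashMap[String, String], spark: SparkSession): Unit = {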
val labelIdentity: String = argMap.getOrElse("labelIdentity", "label_m3")
val scoreOutputPath: String = argMap.getOrElse("scoreOutputPath", "")
val modelFilePath: String = argMap.getOrElse("modelFilePath", "")
val mainFilePath: String = argMap.getOrElse("mainFilePath", "")
val printValidMetric: Boolean = argMap.getOrElse("printValidMetric", "false").toBoolean

val numRound: Int = argMap.getOrElse("numRound", "100").toInt
val numWorkers: Int = argMap.getOrElse("numWorkers", "8").toInt
val eta: Float = argMap.getOrElse("eta", "0.1f").toFloat
val maxDepth: Int = argMap.getOrElse("max_depth", "3").toInt
val subsample: Float = argMap.getOrElse("subsample", "1.0").toFloat
val colsample_bytree: Float = argMap.getOrElse("colsample", "1.0").toFloat
val silent: Int = argMap.getOrElse("silent", "0").toInt
val objective: String = argMap.getOrElse("objective", "binary:logistic")
val evalMetric: String = argMap.getOrElse("eval_metric", "auc")
val nthread: Int = argMap.getOrElse("nthread", "1").toInt
val trainTestRatio: Double = argMap.getOrElse("trainTestRatio", "0.8").toDouble
val maximize_evaluation_metrics: Boolean = argMap.getOrElse("maximize_evaluation_metrics", "true").toBoolean
val numEarlyStoppingRounds: Int = argMap.getOrElse("numEarlyStoppingRounds", "30").toInt
val seed: Int = argMap.getOrElse("seed", "1").toInt

val params = Map("num_round" -> numRound, "num_workers" -> numWorkers, "eta" -> eta, "max_depth" -> maxDepth,
  "silent" -> silent, "objective" -> objective, "subsample" -> subsample, "colsample_bytree" -> colsample_bytree,
  "eval_metric" -> evalMetric, "nthread" -> nthread, "num_early_stopping_rounds" -> numEarlyStoppingRounds,
  "maximize_evaluation_metrics" -> maximize_evaluation_metrics, "timeout_request_workers" -> 180000L)

val vecAssembler: VectorAssembler = new VectorAssembler().setInputCols(usedFeats).setOutputCol("features")
logger.info("@@@ check the vec assembler input feats sorts!")
logger.info(vecAssembler.getInputCols.mkString(";"))
// train_test_df
val trainTestDf: DataFrame = vecAssembler.transform(trainRaw).withColumnRenamed(labelIdentity, "label").
  select("features", "label",  "date")
// valid_df (note: validRaw is dereferenced here, so a null validRaw fails before the null check further down)
val validDf: DataFrame = vecAssembler.transform(validRaw).withColumnRenamed(labelIdentity, "label").
  select("features", "label",  "date")
// train_df test_df
val Array(trainDf, testDf) = trainTestDf.randomSplit(Array(trainTestRatio, 1 - trainTestRatio), seed = seed)


logger.info("Training ....")
// xgb_model
val xgb: XGBoostClassifier = new XGBoostClassifier(params).
  setEvalSets(Map("valid" -> validDf, "test" -> testDf)).
  setFeaturesCol("features").
  setLabelCol("label")
val model: XGBoostClassificationModel = xgb.fit(trainDf)

logger.info("Transform train test data ....")
val extractScore: UserDefinedFunction = udf((x: Vector) => x(1))

val trainTestPredTmp: DataFrame = model.transform(trainTestDf)

val trainTestPred: DataFrame = trainTestPredTmp.withColumn("prob", extractScore(trainTestPredTmp("probability"))).
  select( "prob", "label", "date")

val scores: DataFrame = if (validRaw != null) {
  logger.info("predict valid data ....")
  val validPredTmp: DataFrame = model.transform(validDf)
  val validPred: DataFrame = validPredTmp.
    withColumn("prob", extractScore(validPredTmp("probability"))).
    select( "prob", "label", "date")

  if (printValidMetric) {
    val validPreAndLabel: RDD[(Double, Double)] = validPred.rdd.map((r: Row) => (r.getAs[Double]("prob"), r.getAs[Int]("label").toDouble))
    val validAuc: Double = getAucScore(validPreAndLabel)
    logger.info("<------------------------------------------------------------------------>")
    logger.info(s"valid data auc score: ${validAuc} .")
    logger.info("<------------------------------------------------------------------------>")
  }
  trainTestPred.union(validPred)
} else {
  trainTestPred
}

logger.info("Get feature importance and model description ...")
getFeatureImportanceModelDesc(model, usedFeats, saveParentPath = mainFilePath, spark = spark)


logger.info("Save train test valid score ....")
scores.repartition(argMap.getOrElse("repartition", "1").toInt).write.mode("overwrite").option("header", "true").csv(scoreOutputPath)
logger.info("over ....")


logger.info("Saving the model file to HDFS ....")
model.write.overwrite().save(modelFilePath)

}