XGBoost Scala vs. Spark: different model performance

I have a question about the difference in model performance between XGBoost Scala and XGBoost Spark.

I am training a fraud detection model on data from two banks, A and B, where bank B is somewhat smaller in terms of the number of credit card transactions. This is a classification problem with highly imbalanced classes (1:1000).

I train two models, one with Spark and one with Scala, on the same dataset (data from both banks) with the same parameters. Then I calculate the following performance metrics:

  • the precision
  • the number of true positives (TP)
  • and the number of false positives (FP)

at the same threshold for the two models (Scala and Spark) on two test sets, one for bank A and one for bank B. I find that:

  • Results for bank A are nearly identical for the Spark and Scala models in terms of precision, TP and FP.
  • Results for bank B are worse for the Scala model than for the Spark model. For example, the precision of the Spark model is 41% while the precision of the Scala model is 29%, which is a significant degradation in the context of my research.
  • For bank B, the number of FP for the Scala model is 50% higher than for the Spark model at the same threshold.
  • For bank B, the precision of the Scala model is lower than that of the Spark model at every threshold. In fact, it never goes higher than 30%, while the precision of the Spark model can be as high as 50%.
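
To make the comparison above concrete, here is a minimal sketch of the metrics at a fixed threshold and over a grid of thresholds. It assumes the scores and labels of a test set have been collected into a plain Scala sequence, and that a transaction is flagged as fraud when its score is at or above the threshold. The names `Scored`, `Counts`, `ThresholdMetrics`, `countsAt` and `maxPrecision` are hypothetical and only serve as illustration; they are not part of XGBoost or Spark.

```scala
// Hypothetical record: one scored test transaction with its true label.
final case class Scored(score: Double, isFraud: Boolean)

// TP and FP counts at one threshold.
final case class Counts(tp: Long, fp: Long) {
  // Precision is undefined when nothing is flagged, so return None in that case.
  def precision: Option[Double] =
    if (tp + fp == 0) None else Some(tp.toDouble / (tp + fp))
}

object ThresholdMetrics {
  // Assumption: a transaction is flagged as fraud when score >= threshold.
  def countsAt(rows: Seq[Scored], threshold: Double): Counts = {
    val flagged = rows.filter(_.score >= threshold)
    val tp = flagged.count(_.isFraud).toLong
    Counts(tp, flagged.size.toLong - tp)
  }

  // Highest precision reached over a grid of thresholds,
  // skipping thresholds at which nothing is flagged.
  def maxPrecision(rows: Seq[Scored], thresholds: Seq[Double]): Option[Double] = {
    val precisions = thresholds.flatMap(t => countsAt(rows, t).precision)
    if (precisions.isEmpty) None else Some(precisions.max)
  }
}
```

Calling `countsAt` with the same threshold on the bank A and bank B test sets, once with the Scala model's scores and once with the Spark model's scores, gives the four sets of numbers being compared.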

What could be the problem here? Why is the performance for bank A identical between the two models, while for bank B the Scala model is clearly worse than the Spark model?

I repeated the training and testing several times, and for bank B the Scala model is always worse than the Spark model.

I am running XGBoost 0.7 on Scala 2.11.11 and Spark 2.2. Unfortunately, I can’t install a higher version of Spark or XGBoost.

Context: our production model must be in pure Scala, but we first explore and cross-validate the model in Spark, and then the final best model is retrained in Scala and put into production. Training the Scala model on our cluster takes more than 3 hours, while training the Spark model takes about 25 minutes.