Training a ranking model using Spark fails with a generic error

Hi team,

I am trying to train a rank:pairwise model using Scala Spark. The packages I am importing are

"spark.jars.packages": "ml.dmlc:xgboost4j_2.12:1.5.2,ml.dmlc:xgboost4j-spark_2.12:1.5.2"

I have a dataset with 3 columns: features, label (int), and group (int, starting from 1).
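For context, the features column is a standard Spark ML vector. A minimal sketch of how I build the input (the raw column names f0/f1/f2 and the rawDf name are just placeholders for illustration):

import org.apache.spark.ml.feature.VectorAssembler

// Assemble the raw numeric columns into a single "features" vector column.
// "f0", "f1", "f2" are placeholder names; rawDf stands in for my source DataFrame.
val assembler = new VectorAssembler()
  .setInputCols(Array("f0", "f1", "f2"))
  .setOutputCol("features")

val xgbInput = assembler.transform(rawDf).select("features", "label", "group")

With that input, I create the following model: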

import ml.dmlc.xgboost4j.scala.spark.XGBoostRegressor

// num_round and group_col are XGBoost4J-Spark-level parameters;
// max_depth and objective are passed through to the booster.
val xgbParam = Map(
  "max_depth" -> 2,
  "objective" -> "rank:pairwise",
  "num_round" -> 100,
  "group_col" -> "group")

val xgbRanker = new XGBoostRegressor(xgbParam)
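In case the Map key is the issue: my understanding is that the group column can also be set via the setter API. A sketch of what I mean (I'm assuming setGroupCol is still available in 1.5.2):

val xgbRanker = new XGBoostRegressor(xgbParam)
  .setFeaturesCol("features")  // standard Predictor setters
  .setLabelCol("label")
  .setGroupCol("group")        // query-group column for the ranking objective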

Once I start training with val xgbRankerModel = xgbRanker.fit(xgbInput), I receive a very generic error message:

An error was encountered:
ml.dmlc.xgboost4j.java.XGBoostError: XGBoostModel training failed
  at ml.dmlc.xgboost4j.scala.spark.XGBoost$.postTrackerReturnProcessing(XGBoost.scala:750)
  at ml.dmlc.xgboost4j.scala.spark.XGBoost$.trainDistributed(XGBoost.scala:624)
  at ml.dmlc.xgboost4j.scala.spark.XGBoostRegressor.train(XGBoostRegressor.scala:196)
  at ml.dmlc.xgboost4j.scala.spark.XGBoostRegressor.train(XGBoostRegressor.scala:44)
  at org.apache.spark.ml.Predictor.fit(Predictor.scala:150)
  ... 53 elided

I am not even sure what to look for… I don’t think the problem is with the data, because I was able to train an XGBoost model via scikit-learn on my laptop. Is there any parameter I am missing? Is there a way to get a more specific error message? Any ideas appreciated.
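The only knob I know of for more output is XGBoost's verbosity parameter (0 = silent up to 3 = debug), which I could bump and then dig through the executor logs, something like this sketch:

// Add debug-level logging to the same param map as above.
val verboseParam = xgbParam + ("verbosity" -> 3)
val xgbRankerModel = new XGBoostRegressor(verboseParam).fit(xgbInput)

But I'm not sure whether the real cause would show up on the driver or in the executor logs, so pointers on where to look would help.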

For some reason it works if I downgrade the packages to

"spark.jars.packages": "ml.dmlc:xgboost4j_2.12:1.3.1,ml.dmlc:xgboost4j-spark_2.12:1.3.1"

It is still strange, but maybe someone who knows what changed between 1.3.1 and 1.5.2 can point out the likely culprit (none of the versions in between worked either).
