Hi team,
I am trying to train a rank:pairwise model using Scala Spark. The packages I am importing are:
"spark.jars.packages": "ml.dmlc:xgboost4j_2.12:1.5.2,ml.dmlc:xgboost4j-spark_2.12:1.5.2"
I have a dataset with 3 columns: features, label (int), and group (int, starting from 1). I create the following model:
val xgbParam = Map(
  "max_depth" -> 2,
  "objective" -> "rank:pairwise",
  "num_round" -> 100,
  "group_col" -> "group")
val xgbRanker = new XGBoostRegressor(xgbParam)
When I start training with val xgbRankerModel = xgbRanker.fit(xgbInput),
I receive a very generic error message:
An error was encountered:
ml.dmlc.xgboost4j.java.XGBoostError: XGBoostModel training failed
at ml.dmlc.xgboost4j.scala.spark.XGBoost$.postTrackerReturnProcessing(XGBoost.scala:750)
at ml.dmlc.xgboost4j.scala.spark.XGBoost$.trainDistributed(XGBoost.scala:624)
at ml.dmlc.xgboost4j.scala.spark.XGBoostRegressor.train(XGBoostRegressor.scala:196)
at ml.dmlc.xgboost4j.scala.spark.XGBoostRegressor.train(XGBoostRegressor.scala:44)
at org.apache.spark.ml.Predictor.fit(Predictor.scala:150)
... 53 elided
I am not even sure what to look for… I don’t think the problem is with the data, because I was able to train an XGBoost model via scikit-learn on my laptop. Is there a parameter I am missing? Is there a way to get a more specific error message? Any ideas appreciated.