Cannot find XGBoostClassifier


#1

Hello,
I have installed XGBoost as per the documentation (all the tests passed, though there were some warnings).
I am now trying to import it in my Scala project with
import ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier

But I get the following error:
Exception in thread "main" java.lang.NoClassDefFoundError: ml/dmlc/xgboost4j/scala/spark/XGBoostClassifier

Environment
Ubuntu 18.04
Apache Maven 3.5
Java 8
CMake 3.10
Spark 2.3
Scala 2.11

My build.sbt looks like this
name := "NewXGBoost"
version := "1.0"
scalaVersion := "2.11.8"
resolvers += "Local Maven Repository" at "file://h/a/.m2/repository"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.2"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.3.2"
libraryDependencies += "org.apache.spark" %% "spark-mllib" % "2.3.2"
libraryDependencies += "org.apache.spark" %% "spark-yarn" % "2.3.2"
libraryDependencies += "ml.dmlc" % "xgboost4j-spark" % "0.81"
libraryDependencies += "ml.dmlc" % "xgboost4j" % "0.81"
ivyScala := ivyScala.value map { _.copy(overrideScalaVersion = true)}


#2

Can you use the Maven Central repository?


#3

You mean as a resolver?
That is supposed to be included by default.
Anyway, I tried switching to it but I am still getting the same error.


#4

I tried compiling a small program myself using the following SBT config and could not reproduce your problem:

name := "xgboost_test"

version := "0.1"

scalaVersion := "2.11.12"

libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.1"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.3.1"
libraryDependencies += "org.apache.spark" %% "spark-mllib" % "2.3.1"
libraryDependencies += "org.apache.spark" %% "spark-yarn" % "2.3.1"
libraryDependencies += "ml.dmlc" % "xgboost4j-spark" % "0.81"
libraryDependencies += "ml.dmlc" % "xgboost4j" % "0.81"

Sample program

package example

import ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier

object Main extends App {
  val xgbParam = Map("eta" -> 0.1f,
    "max_depth" -> 2,
    "objective" -> "multi:softprob",
    "num_class" -> 3,
    "num_round" -> 100,
    "num_workers" -> 2)
  val xgbClassifier = new XGBoostClassifier(xgbParam)
}

The sample program compiled without any issue.

Make sure to choose the versions of the library dependencies that support your current Scala version.


#6

Mine compiles too, but the error occurs when I try to run it with spark-submit.
It's the first time I have tried to run a Scala project on Spark, so I might have missed something.
Can you tell me the steps you took to run it?
I can confirm, though, that the project runs successfully when I do not use XGBoost.


#7

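Compilation succeeds because sbt resolves XGBoost for you, but spark-submit only sees the classes inside the JAR you hand it, so the XGBoost classes are missing at run time. When you package your application in a JAR file, you should include all dependencies in it, including XGBoost. See http://queirozf.com/entries/creating-scala-fat-jars-for-spark-on-sbt-with-sbt-assembly-plugin. You can also search for “sbt fat jar” on Google.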
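
As a rough illustration of the fat-JAR approach, here is a minimal sketch based on the build.sbt from post #1. Several things in it are assumptions rather than anything confirmed in this thread: the sbt-assembly plugin version (0.14.9), marking the Spark artifacts as "provided" (on the assumption that the cluster already supplies them), and the merge strategy, which is a blunt default that may need tuning for your project.

```scala
// ---- project/plugins.sbt ----
// Assumption: sbt-assembly 0.14.9; pick the release that matches your sbt version.
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.9")

// ---- build.sbt ----
name := "NewXGBoost"
version := "1.0"
scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
  // Assumption: Spark is supplied by the cluster at run time, so it is
  // marked "provided" and left out of the assembled JAR.
  "org.apache.spark" %% "spark-core"  % "2.3.2" % "provided",
  "org.apache.spark" %% "spark-sql"   % "2.3.2" % "provided",
  "org.apache.spark" %% "spark-mllib" % "2.3.2" % "provided",
  "org.apache.spark" %% "spark-yarn"  % "2.3.2" % "provided",
  // XGBoost is NOT "provided", so it gets bundled into the fat JAR.
  "ml.dmlc" % "xgboost4j-spark" % "0.81",
  "ml.dmlc" % "xgboost4j"       % "0.81"
)

// Blunt merge strategy (an assumption, not a recommendation): drop
// META-INF entries and keep the first copy of any other duplicate file.
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case _                             => MergeStrategy.first
}
```

With something like this in place, running `sbt assembly` should produce a JAR under target/scala-2.11/ (by sbt-assembly's default naming, NewXGBoost-assembly-1.0.jar), which you can then pass to spark-submit, for example `spark-submit --class example.Main target/scala-2.11/NewXGBoost-assembly-1.0.jar`. The class name example.Main is taken from the sample in post #4 and is only a placeholder for your own main class. Alternatively, spark-submit's `--packages ml.dmlc:xgboost4j-spark:0.81` option can fetch the dependency at submit time without building a fat JAR.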