Can't run the XGBoost4J-Spark Tutorial


I’m trying to go through the XGBoost4J-Spark tutorial, but when I reach this line:
val xgbClassificationModel =

I get this error:

20/07/17 13:39:14 ERROR NativeLibLoader: failed to load xgboost4j library from jar
20/07/17 13:39:14 ERROR DMatrix: Failed to load native library File /lib/libxgboost4j.dylib was not found inside JAR.

I’m using sbt with IntelliJ on a MacBook Pro, and my build.sbt contains:

scalaVersion := "2.12.12"
val sparkVersion = "3.0.0"
val xgboostVersion = "1.2.0-SNAPSHOT"

resolvers += "XGBoost4J Snapshot Repo" at ""

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-sql" % sparkVersion,
  "org.apache.spark" %% "spark-mllib" % sparkVersion,
  "ml.dmlc" %% "xgboost4j" % xgboostVersion,
  "ml.dmlc" %% "xgboost4j-spark" % xgboostVersion
)

Interestingly, I’m finding a file in xgboost4j_2.12-1.2.0-20200717.090328-58.jar/lib under the External Libraries of my IntelliJ project. I’ve tried multiple release versions, including 1.1.1 (with a different resolver), as well as snapshot versions, but I can’t seem to resolve this. Any insight would be much appreciated!
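
One way to narrow this down is to check, from inside the project, whether the native library named in the error message is actually visible on the classpath. The following is only a rough diagnostic sketch: the object name NativeLibCheck and the helper onClasspath are made up for illustration, and the Linux file name libxgboost4j.so is an assumption (only the .dylib path appears in the error above).

```scala
// Hypothetical diagnostic, not part of XGBoost4J.
object NativeLibCheck {
  // True if the given absolute resource path resolves on the current classpath.
  def onClasspath(path: String): Boolean =
    getClass.getResource(path) != null

  def main(args: Array[String]): Unit = {
    val candidates = Seq(
      "/lib/libxgboost4j.dylib", // path taken from the error message (macOS)
      "/lib/libxgboost4j.so"     // assumed name of the Linux build
    )
    candidates.foreach { path =>
      val status = if (onClasspath(path)) "found" else "missing"
      println(s"$path -> $status")
    }
  }
}
```

If the .dylib entry is reported as missing, the JAR that sbt resolved was most likely not built with macOS support, which would match the error shown.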


Are you running macOS? The snapshot build only works on Linux. To run the tutorial, try using a Linux machine or cluster.

Yes! I am using macOS. I do plan to deploy this to AWS EMR, which will run Linux, but my dev machine is my MacBook Pro. Is there a way to get XGBoost4J-Spark working on my machine? I’ve also tried the release versions, and I’m running into the same problem.

You can build the JAR from source by running the “mvn package” command from the jvm-packages directory.

Thanks for the suggestion! Luckily, I found that specifying version 1.0.0 seems to fix the problem, and I can confirm that xgboost4j_2.12-1.0.0.jar/lib contains libxgboost4j.dylib along with a second file. I’m not sure why the other versions I tried (e.g., 1.1.1) didn’t have the .dylib file. Anyway, thanks @hcho3!
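
For anyone landing here with the same error, the workaround described above might look roughly like this in build.sbt. This is a sketch, not a verified configuration: it reuses the Scala and Spark versions from the original post and assumes the 1.0.0 release artifacts resolve from a repository that is already configured (no resolver line is shown because the thread does not say which one was used).

```scala
// build.sbt (sketch): pin the XGBoost4J release reported to work on macOS in this thread
scalaVersion := "2.12.12"
val sparkVersion = "3.0.0"
val xgboostVersion = "1.0.0" // reported above to ship libxgboost4j.dylib in the JAR

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-sql" % sparkVersion,
  "org.apache.spark" %% "spark-mllib" % sparkVersion,
  "ml.dmlc" %% "xgboost4j" % xgboostVersion,
  "ml.dmlc" %% "xgboost4j-spark" % xgboostVersion
)
```

After changing the version, reloading the sbt project in IntelliJ should refresh External Libraries so the contents of the new JAR can be inspected.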