XGBoost 0.90.
Scala library.
EMR 5.23.0.
Given an arbitrary cluster (let's say 4 nodes, 32 cores each).
I’m noticing that no matter what I set for these variables (for example):
spark.executor.cores (32)
spark.executor.instances (4)
spark.task.cpus (1)
num_workers (64)
nthreads (1)
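
For concreteness, here is a rough sketch of how settings like these are typically wired up with XGBoost4J-Spark. It is not the exact job: the objective, round count, and column names are placeholders, and on EMR the Spark settings are more commonly passed via spark-submit --conf. Note that the XGBoost parameter key is spelled `nthread`; the sketch assumes that is what `nthreads` above refers to.

```scala
import org.apache.spark.sql.SparkSession
import ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier

object XgbConfigSketch {
  def main(args: Array[String]): Unit = {
    // Spark-side settings from the list above (values as strings).
    val spark = SparkSession.builder()
      .appName("xgb-config-sketch") // placeholder name
      .config("spark.executor.instances", "4")
      .config("spark.executor.cores", "32")
      .config("spark.task.cpus", "1")
      .getOrCreate()

    // XGBoost-side settings from the list above.
    val params: Map[String, Any] = Map(
      "objective" -> "binary:logistic", // placeholder, not from the post
      "num_round" -> 100,               // placeholder, not from the post
      "num_workers" -> 64,
      "nthread" -> 1
    )

    val classifier = new XGBoostClassifier(params)
      .setFeaturesCol("features") // placeholder column name
      .setLabelCol("label")       // placeholder column name

    // Training would then be: classifier.fit(trainingDf),
    // where trainingDf is a hypothetical DataFrame with those columns.
    spark.stop()
  }
}
```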
I see Rabit start with --num_workers=64, and I see “INFO @tracker All of 64 nodes getting started”.
But looking at Spark I only see one or two active tasks, and Ganglia shows only a couple of active cores.
I’ve tried switching to num_workers (4) and nthreads (16), but I get the same end result (low activity).
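
As a sanity check on the arithmetic behind my expectation: with the first set of values listed above, Spark should have 4 × 32 / 1 = 128 concurrent task slots, which is more than enough for 64 workers to run at once. The helper below is purely illustrative (it is not part of Spark or XGBoost) and only does that calculation.

```scala
object SlotCheck {
  // Hypothetical helper: concurrent task slots implied by the executor settings.
  // Each executor can run (executorCores / taskCpus) tasks at the same time.
  def taskSlots(executorInstances: Int, executorCores: Int, taskCpus: Int): Int =
    executorInstances * (executorCores / taskCpus)

  def main(args: Array[String]): Unit = {
    val slots = taskSlots(executorInstances = 4, executorCores = 32, taskCpus = 1)
    val numWorkers = 64
    println(s"task slots = $slots, num_workers = $numWorkers")
    println(s"all workers can run at once: ${numWorkers <= slots}")
  }
}
```

So on paper neither configuration should be limited to one or two running tasks.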
I’ve tried various dataset sizes (from 100MB to 10GB), and various hyperparameters.
I’m expecting that some combination of parameters is going to light up more cores, but I can’t seem to get there. Are my expectations wrong?