XGBoost Spark network friendliness


I’m running XGBoost on Spark using the newly created PySpark wrapper for 0.90. The host it’s deployed on has several hostnames and network interfaces. tracker.py seems to pick one hostname, but not the one used for communication between the driver and the executors.

One way to fix this would be to use the spark.driver.host setting whenever it has been set. An xgboost-tracker.properties file isn’t very usable for us because we deploy the application on different hosts, so a host value baked into a file would be wrong on most of them.
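To make the proposal concrete, here is a rough sketch of the lookup order I have in mind. It is not the actual tracker.py code: resolve_tracker_host is a hypothetical helper, a plain dict stands in for the Spark configuration, and the hostname-based fallback is only my assumption about what a reasonable default would be.

```python
import socket


def resolve_tracker_host(spark_conf):
    """Pick the host the tracker should advertise (hypothetical helper).

    spark_conf is any mapping-like object with a .get(key) method;
    a plain dict is used here in place of a real Spark configuration.
    """
    # Prefer the address Spark already uses for driver/executor traffic.
    host = spark_conf.get("spark.driver.host")
    if host:
        return host
    # Assumed fallback: resolve the local fully qualified hostname.
    try:
        return socket.gethostbyname(socket.getfqdn())
    except OSError:
        # Name resolution failed; fall back to loopback.
        return "127.0.0.1"


# With the setting present, the configured value wins.
print(resolve_tracker_host({"spark.driver.host": "10.0.0.5"}))
```

With this ordering, existing deployments that never set spark.driver.host would keep whatever the hostname lookup returns, so the change should only affect multi-interface hosts like ours.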


@chenqin Is there a good way to honor the spark.driver.host setting?


We might just add a properties file and set the host-ip value.


@chenqin Can we use spark.driver.host instead? See the OP’s quote above.


Yes, we could honor this setting with a code change, but it might not be available in release_0.9.
cc @CodingCat