Distributed XGBoost with PySpark on a Kubernetes Cluster

Hello Folks,
I was wondering whether Distributed XGBoost with PySpark can be run on a Kubernetes cluster. I'm asking because there aren't many resources for Kubernetes, and the XGBoost documentation mostly covers the YARN cluster manager. Do I really need to install the XGBoost Operator on the Kubernetes cluster? I don't think I fully understand the difference between Distributed XGBoost on Kubernetes and Distributed XGBoost using PySpark. I'm hoping that by just configuring the Spark cluster correctly (setting spark.master, for example) I can run Distributed XGBoost with PySpark on a Kubernetes cluster.


Hello Folks. I was successful in running distributed xgboost-pyspark on a Kubernetes cluster.
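
For anyone landing here later: the exact configuration used above was not posted, so the following is only a minimal sketch of what such a setup might look like. It assumes Spark's built-in Kubernetes scheduler (a spark.master URL starting with k8s://) and the xgboost.spark estimator that ships with the xgboost Python package. The helper names, the API server address, the container image, and the column names are all hypothetical placeholders. The image is assumed to already contain both pyspark and xgboost, since the executors run inside it.

```python
def build_k8s_spark_conf(api_server, image, num_executors, namespace="default"):
    """Return Spark settings for the Kubernetes scheduler.

    All argument values are placeholders supplied by the caller; nothing
    here is taken from the original poster's cluster.
    """
    return {
        # Spark selects its Kubernetes backend when the master URL starts with k8s://
        "spark.master": f"k8s://{api_server}",
        # Image used for executor pods; assumed to have pyspark and xgboost installed
        "spark.kubernetes.container.image": image,
        "spark.kubernetes.namespace": namespace,
        "spark.executor.instances": str(num_executors),
    }


def make_session(conf, app_name="xgboost-on-k8s"):
    """Create a SparkSession from a dict of Spark settings."""
    # Imported lazily so the config helper can be used without Spark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


def train(train_df, num_workers):
    """Fit a distributed XGBoost classifier on a Spark DataFrame.

    Assumes train_df has a vector column "features" and a numeric
    column "label" (hypothetical column names).
    """
    from xgboost.spark import SparkXGBClassifier

    classifier = SparkXGBClassifier(
        features_col="features",
        label_col="label",
        # One XGBoost worker per Spark task; usually matched to the executor count
        num_workers=num_workers,
    )
    return classifier.fit(train_df)
```

A typical call sequence would be conf = build_k8s_spark_conf("https://<api-server>:<port>", "<your-image>", 2), then spark = make_session(conf), then train(df, num_workers=2) on a DataFrame loaded through that session. In this sketch Spark itself schedules the executor pods, so nothing beyond the Spark configuration is installed on the cluster; whether that matches your environment is something to verify.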