XGBoost4J PySpark with Python Examples



I have noticed that there are no PySpark examples showing how to use XGBoost4J.
Can someone provide an example of a full pipeline, i.e. with VectorAssembler (or DMatrix?), StringIndexer, OHE, or other methods?

Now that you have reached 1.0.0, I thought maybe someone could help with that. It would be great to pick up the new version and migrate fully to Python from now on.

Looking forward!

Thank you,


@hcho3 can you assist? 🙂


@hcho3 do you think this is still relevant, or should I close it for now? Thanks.


I don’t think the PySpark API made it into 1.0.


Thank you for the reply.