Description of Distributed Training Algorithm

Is there any sort of documentation or description of the algorithm used by XGBoost for distributed training?

I suspect that XGBoost parallelizes the computation of candidate splits when deciding how to split a node in a decision tree, but I’m not sure. Are there any assumptions about how the training data is partitioned across the different compute nodes?
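
To make the suspicion concrete, here is a toy sketch of my mental model: rows are partitioned across workers, each worker builds per-feature gradient/hessian histograms locally, the histograms are summed globally (an allreduce), and the best split is read off the merged histogram. Everything here is my guess, not XGBoost’s actual code: the single feature, the fixed binning, the worker layout, and the `local_histogram` helper are all made up for illustration; only the gain formula follows the one in the XGBoost paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N_WORKERS, N_BINS, LAM = 4, 16, 1.0  # LAM = L2 regularization term

# Row-partitioned data: each "worker" holds its own rows of one feature,
# plus the per-row gradients/hessians from the current boosting round.
features = [rng.uniform(0, 1, size=250) for _ in range(N_WORKERS)]
grads = [rng.normal(size=250) for _ in range(N_WORKERS)]
hess = [np.ones(250) for _ in range(N_WORKERS)]

def local_histogram(x, g, h):
    """Per-worker gradient/hessian histograms over fixed bin edges
    (hypothetical helper; the binning scheme is a simplification)."""
    bins = np.minimum((x * N_BINS).astype(int), N_BINS - 1)
    g_hist = np.bincount(bins, weights=g, minlength=N_BINS)
    h_hist = np.bincount(bins, weights=h, minlength=N_BINS)
    return g_hist, h_hist

# Each worker computes its local histogram independently ...
local = [local_histogram(x, g, h) for x, g, h in zip(features, grads, hess)]

# ... then the histograms are summed, standing in for an allreduce,
# so every worker would see the same global view.
G = sum(gh[0] for gh in local)
H = sum(gh[1] for gh in local)

# Scan bin boundaries and score each candidate split with the gain
# G_L^2/(H_L+lam) + G_R^2/(H_R+lam) - (G_L+G_R)^2/(H_L+H_R+lam).
best_gain, best_bin = -np.inf, None
G_tot, H_tot = G.sum(), H.sum()
G_left, H_left = 0.0, 0.0
for b in range(N_BINS - 1):
    G_left += G[b]
    H_left += H[b]
    G_right, H_right = G_tot - G_left, H_tot - H_left
    gain = (G_left**2 / (H_left + LAM) + G_right**2 / (H_right + LAM)
            - G_tot**2 / (H_tot + LAM))
    if gain > best_gain:
        best_gain, best_bin = gain, b

print(f"best split after bin {best_bin}, gain {best_gain:.3f}")
```

If this picture is roughly right, the only per-node communication would be the histogram allreduce, which would make row-wise partitioning the natural data layout. Is that what actually happens, and is it documented anywhere?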