How to debug slow training

XGBoost training runs much slower for me on Linux than on Windows. How can I find out what is causing the speed difference?

I am using xgboost 1.4.2.

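You could start with a profiler to see where the time goes. It is also worth checking that the training parameters (for example the thread count, nthread) and the hardware specifications actually match on both machines.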
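To make that comparison concrete, here is a minimal sketch you could run on both machines. It uses only the Python standard library; the helper names `environment_report` and `time_and_profile` are made up for this example and are not part of xgboost. It records the facts that most often differ between two setups, then times and profiles a single training call.

```python
import cProfile
import io
import os
import platform
import pstats
import time


def environment_report():
    """Collect facts that commonly differ between two machines (hypothetical helper)."""
    report = {
        "platform": platform.platform(),
        "python": platform.python_version(),
        "cpu_count": os.cpu_count(),
        # Environment override for the thread count used by OpenMP-based libraries.
        "OMP_NUM_THREADS": os.environ.get("OMP_NUM_THREADS"),
    }
    try:
        import xgboost  # optional: only reported if installed
        report["xgboost"] = xgboost.__version__
    except ImportError:
        report["xgboost"] = None
    return report


def time_and_profile(train_fn, top=10):
    """Run train_fn once under cProfile; return (elapsed seconds, profile text)."""
    profiler = cProfile.Profile()
    start = time.perf_counter()
    profiler.enable()
    train_fn()
    profiler.disable()
    elapsed = time.perf_counter() - start

    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(top)
    return elapsed, out.getvalue()
```

To use it, pass your own training call as `train_fn`, for example `lambda: xgb.train(params, dtrain, num_boost_round=50)` with `nthread` set explicitly in `params`, and compare the two reports side by side. Note that cProfile only breaks down time spent in Python code; the native training loop shows up as one opaque call, so if almost all the time is in there, a system-level profiler would be the next step.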