distributed, documentation, good first issue
Description
I have the same hardware environment and the same network, but I could not reproduce your results; mine are almost half of yours. Are there any best practices or experience you can share? Thanks very much! For BytePS with 1 instance and 8 GPUs, I do get similar test results.