Lightning-AI/pytorch-lightning
Update deepspeed activation checkpointing docs
Open
#17621 opened on May 12, 2023
docs, help wanted, strategy: deepspeed
Description
📚 Documentation
In your documentation, you refer to the function deepspeed.checkpointing.checkpoint, but it looks like it does not exist (anymore?). Can you update that section?
While we're at it, can you provide a more common use-case as an example? The guide warns against wrapping an entire model, but having a pretrained language model from transformers is probably the most common use case. What if someone is just using GPT-2 or T5 with no further mathematical layers? What should get wrapped then?
cc @borda @awaelchli