Lightning-AI/pytorch-lightning
Update deepspeed activation checkpointing docs
Open
#17621 opened on May 12, 2023
Labels: docs, help wanted, strategy: deepspeed
Description
📚 Documentation
In your documentation, you refer to the function `deepspeed.checkpointing.checkpoint`, but it looks like it does not exist (anymore?). Can you update that section?
While we're at it, can you provide an example for a more common use case? The guide warns against wrapping an entire model, but fine-tuning a pretrained language model from `transformers` is probably the most common scenario. What if someone is just using GPT-2 or T5 with no further mathematical layers? What should get wrapped then?
cc @borda @awaelchli