Lightning-AI/pytorch-lightning

Update deepspeed activation checkpointing docs

Open

#17621 opened on May 12, 2023

Labels: docs, help wanted, strategy: deepspeed

Description

📚 Documentation

Your documentation refers to the function deepspeed.checkpointing.checkpoint, but it looks like it does not exist (anymore?). Could you update that section?
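For anyone who wants to check this against their own install, here is a minimal sketch of how the name could be looked up. The helper `dotted_name_exists` is hypothetical (it is not part of DeepSpeed or Lightning) and uses only the standard library; the result depends on the installed DeepSpeed version, and it reports "not found" when the package is not installed at all.

```python
import importlib


def dotted_name_exists(path):
    """Return True if a dotted path such as 'pkg.module.func' resolves.

    Hypothetical helper: imports the longest importable module prefix,
    then walks the remaining parts as attributes.
    """
    parts = path.split(".")
    for i in range(len(parts), 0, -1):
        try:
            obj = importlib.import_module(".".join(parts[:i]))
        except ImportError:
            # This prefix is not an importable module; try a shorter one.
            continue
        for attr in parts[i:]:
            if not hasattr(obj, attr):
                return False
            obj = getattr(obj, attr)
        return True
    # Not even the top-level package could be imported.
    return False


if __name__ == "__main__":
    name = "deepspeed.checkpointing.checkpoint"
    status = "found" if dotted_name_exists(name) else "not found in this environment"
    print(f"{name}: {status}")
```

Running this in the same environment as the training script shows whether the name mentioned in the docs resolves for that particular version.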

While we're at it, could you provide an example for a more common use case? The guide warns against wrapping an entire model, but using a pretrained language model from transformers is probably the most common scenario. What if someone is just using GPT-2 or T5 with no further mathematical layers? What should get wrapped then?

cc @borda @awaelchli
