Labels: call for contribution, documentation, good first issue
Description
Motivation
Many issues are related to out-of-memory (OOM) errors, e.g. #328. We need a clear guide on how to resolve OOM.
Plan
A non-exhaustive list of related configurations:
- Rollout:
  - gpu_memory_utilization
- Other Inference:
  - Liger Kernel
  - *_max_len_per_gpu/micro_batch_size_per_gpu
- Training:
  - Liger Kernel
  - Ulysses Sequence Parallelism
  - gradient checkpointing
  - offload
TODO
- Complete the list of related configurations
- Benchmark the effect & overhead of each configuration
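
One possible way to organize the benchmarking item is a one-at-a-time sweep: start from a baseline and change a single configuration per run, so that any difference in peak memory or step time can be attributed to that one setting. The sketch below only generates the run matrix and does not launch anything. The key names `use_liger_kernel`, `ulysses_sequence_parallel_size`, `gradient_checkpointing`, and `offload` are hypothetical stand-ins for the items in the plan, and all values are placeholders rather than recommended settings.

```python
# Hypothetical baseline: keys mirror the items listed in the plan;
# values are placeholders, not recommended settings.
BASELINE = {
    "gpu_memory_utilization": 0.5,
    "micro_batch_size_per_gpu": 4,
    "use_liger_kernel": False,
    "ulysses_sequence_parallel_size": 1,
    "gradient_checkpointing": False,
    "offload": False,
}

# One alternative value per configuration to compare against the baseline.
ALTERNATIVES = {
    "gpu_memory_utilization": 0.3,
    "micro_batch_size_per_gpu": 1,
    "use_liger_kernel": True,
    "ulysses_sequence_parallel_size": 2,
    "gradient_checkpointing": True,
    "offload": True,
}


def one_at_a_time_runs(baseline, alternatives):
    """Return the baseline plus one run per configuration, changing only that one."""
    runs = [("baseline", dict(baseline))]
    for key, value in alternatives.items():
        if key not in baseline:
            raise KeyError(f"unknown configuration: {key}")
        config = dict(baseline)  # copy so each run is independent
        config[key] = value
        runs.append((f"{key}={value}", config))
    return runs


if __name__ == "__main__":
    for name, config in one_at_a_time_runs(BASELINE, ALTERNATIVES):
        print(name, config)
```

Each generated run would then be executed with the actual training entry point while recording peak memory and step time; how those are measured is left open here.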