Additional memory optimization features · verl-project/verl#144

(6 comments) (3 reactions) (0 assignees) Python (3,940 forks)

Labels: call for contribution, enhancement, good first issue

Repository metrics

Stars: 21,533
PR merge metrics: average time to merge 3 days 6 hours; 125 PRs merged in the last 30 days

Description

Activation offloading (see implementation here)
Fusing optimizer step into backward pass (see implementation here)
Utilize full_shard reshard_after_forward (see here). I wasn't 100% sure whether this is already implemented in veRL.
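
To make the first item concrete, here is a minimal sketch of activation offloading using PyTorch's built-in `torch.autograd.graph.save_on_cpu` context manager. This is an assumption about one possible approach, not the linked implementation and not veRL code; the toy model and tensor sizes are made up for illustration. Note that `save_on_cpu` moves every tensor saved for backward to CPU, which a production implementation would likely make more selective.

```python
import torch
from torch import nn
from torch.autograd.graph import save_on_cpu

# Toy model; layer sizes are arbitrary and only for illustration.
model = nn.Sequential(nn.Linear(64, 256), nn.GELU(), nn.Linear(256, 8))
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
x = torch.randn(32, 64, device=device)

# Baseline: tensors saved for backward stay on the compute device.
model(x).sum().backward()
baseline_grads = [p.grad.clone() for p in model.parameters()]
model.zero_grad(set_to_none=True)

# Offloaded: tensors saved for backward are moved to CPU during the
# forward pass and copied back to the device when backward needs them.
with save_on_cpu(pin_memory=torch.cuda.is_available()):
    loss = model(x).sum()
loss.backward()
offloaded_grads = [p.grad for p in model.parameters()]

# Offloading changes where saved tensors live, not the math.
grads_match = all(
    torch.allclose(a, b) for a, b in zip(baseline_grads, offloaded_grads)
)
print("gradients match:", grads_match)
```

The gradients are the same either way; what changes is that saved activations occupy host memory between forward and backward, at the cost of extra host-device copies.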

These optimizations largely trade additional compute for lower peak memory usage, so they may only be useful when training larger models or in GPU-constrained settings.

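Contributor guide

Research direction: Investigate the reference implementations of activation offloading, fusing the optimizer step into the backward pass, and full_shard reshard_after_forward. Determine which of these are not yet implemented in veRL and plan their integration.
Tech stack: Python, PyTorch
Domain: backend, AI
Issue type: feature
Difficulty: 3
Estimated time: half a day
Activity status: active
Clarity: mostly clear
Prerequisites: Python, PyTorch
Beginner friendliness: 30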
