verl-project/verl
[Bug][CI] FSDP2 test in `model_rmpad` job seems unstable
Open
#1388 opened on May 4, 2025
Labels: bug, call for contribution, good first issue
Description
Motivation
https://github.com/volcengine/verl/actions/workflows/model.yml shows that:
- the FSDP2 test in the `model_rmpad` workflow fails sometimes;
- but can also pass sometimes.
Plan
- Find a setup that reproduces the error reliably (possibly using the test container)
- Locate the root cause
- Fix the bug
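
For the first step, one option is to rerun the flaky test in a loop and record which runs fail, to get a rough failure rate before bisecting. This is only a minimal sketch: `count_failures` is a hypothetical helper, not part of verl, and the command passed to it must be replaced with whatever the `model_rmpad` job actually runs.

```python
import subprocess
import sys


def count_failures(cmd, runs=20):
    """Run `cmd` `runs` times; return the indices of runs that exited non-zero."""
    failed = []
    for i in range(runs):
        # Capture output so a long loop does not flood the terminal.
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            failed.append(i)
    return failed


# Example usage (the test path is a placeholder, not a real file in the repo):
# failed = count_failures([sys.executable, "-m", "pytest", "path/to/flaky_test.py"], runs=20)
# print(f"{len(failed)}/20 runs failed: {failed}")
```

Running the same loop inside the CI test container and on a local machine may help tell whether the instability depends on the environment.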
Additional Info.
- Related PR: https://github.com/volcengine/verl/pull/1026
- cc: @lxg2015 @PeterSH6