facebookresearch/fairseq

Add `--validate-after-epochs` training flag

Open

#5496 opened on May 3, 2024

Labels: enhancement, help wanted, needs triage

Description

🚀 Feature Request

Add a --validate-after-epochs training flag as a companion to --validate-after-updates.

Note: I already have a PR for this ready that I can contribute if this gets approved.

Motivation

When your task is configured to run validation after each epoch, --validate-after-updates can be difficult to use, since you might not know how many updates are in an epoch. This would add a companion flag that allows you to delay validation until N epochs have passed, without having to know in advance how many batches are included in a single epoch.

There is already precedent for parallel epoch-based and update-based validation flags (e.g., --validate-interval vs --validate-interval-updates), so this wouldn't be an unusual addition.

Pitch

Add a --validate-after-epochs flag to configs.py

https://github.com/facebookresearch/fairseq/blob/bedb259bf34a9fc22073c13a1cee23192fa70ef3/fairseq/dataclass/configs.py#L521-L529

and to fairseq_cli/train.py

https://github.com/facebookresearch/fairseq/blob/bedb259bf34a9fc22073c13a1cee23192fa70ef3/fairseq_cli/train.py#L418-L430
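To make the intended behavior concrete, here is a minimal standalone sketch of the gating logic. It is not the code from the prepared PR or from fairseq itself: `ValidationGateConfig`, `past_validation_delay`, and the `validate_after_epochs` field are hypothetical names used for illustration, and the choice to count fully completed epochs is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ValidationGateConfig:
    # Mirrors the existing --validate-after-updates option (0 means no delay).
    validate_after_updates: int = 0
    # Hypothetical field for the proposed --validate-after-epochs flag.
    validate_after_epochs: int = 0


def past_validation_delay(cfg, num_updates, epochs_completed):
    """Return True once both the update-based and epoch-based delays have elapsed.

    `epochs_completed` is the number of fully finished epochs (an assumption
    about how the proposed flag would count epochs).
    """
    return (
        num_updates >= cfg.validate_after_updates
        and epochs_completed >= cfg.validate_after_epochs
    )


# Usage: skip validation for the first three epochs, whatever their length.
cfg = ValidationGateConfig(validate_after_epochs=3)
for epochs_completed in range(1, 6):
    print(epochs_completed, past_validation_delay(cfg, 1000 * epochs_completed, epochs_completed))
```

In this sketch the new check is simply AND-ed with the existing update-based one, so leaving both values at 0 keeps the current behavior.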

Alternatives

The workaround is to estimate how many batches are in an epoch, or to start a task, let it run for one update so that you can see the batches-per-epoch count, and then restart it with the correct value set.
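The arithmetic behind that workaround looks roughly like the sketch below. The helper names are hypothetical, and it assumes one optimizer update per `update_freq` batches with a partial final group still counting as an update. The batch count itself has to be read from a trial run, which is the inconvenience this request is about.

```python
import math


def updates_per_epoch(num_batches, update_freq=1):
    # Assumption: batches are grouped update_freq at a time, and a partial
    # final group still produces one update.
    return math.ceil(num_batches / update_freq)


def validate_after_updates_for(epochs, num_batches, update_freq=1):
    # Value to pass to --validate-after-updates to delay validation by `epochs` epochs.
    return epochs * updates_per_epoch(num_batches, update_freq)


# Usage: 1250 batches per epoch (read from a trial run), gradient accumulation of 4.
print(updates_per_epoch(1250, update_freq=4))
print(validate_after_updates_for(3, 1250, update_freq=4))
```

Any change to batch size, dataset, or accumulation settings changes the batch count, so the value has to be recomputed each time.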

Additional context

I already have a PR prepared for this (it's like a 5 line change), but my understanding is that things like this need to be approved via issues first.
