facebookresearch/fairseq

Add `--validate-after-epochs` training flag

Open

#5496 opened on May 3, 2024

Labels: enhancement, help wanted, needs triage


Description

🚀 Feature Request

Add a --validate-after-epochs training flag that is a companion flag to --validate-after-updates.

Note: I already have a PR for this ready that I can contribute if this gets approved.

Motivation

When your task is configured to run validation after each epoch, --validate-after-updates can be difficult to use, since you might not know how many updates are in an epoch. This would add a companion flag that allows you to delay validation until N epochs have passed, without having to know in advance how many batches are included in a single epoch.

There is already precedent to have parallel flags for epoch-based and update-based validation (e.g., --validate-interval vs --validate-interval-updates), so it seems like this wouldn't be an unusual addition.

Pitch

Add a --validate-after-epochs flag to configs.py

https://github.com/facebookresearch/fairseq/blob/bedb259bf34a9fc22073c13a1cee23192fa70ef3/fairseq/dataclass/configs.py#L521-L529

and to fairseq_cli/train.py

https://github.com/facebookresearch/fairseq/blob/bedb259bf34a9fc22073c13a1cee23192fa70ef3/fairseq_cli/train.py#L418-L430
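To make the proposal concrete, here is a minimal, self-contained sketch of how the new option could gate validation. It is not the actual fairseq code: `ValidationConfig` and `should_validate` are hypothetical stand-ins for the relevant config fields and the check in `fairseq_cli/train.py`, and it assumes epochs are numbered from 1 and that "N epochs have passed" means the end of epoch N.

```python
from dataclasses import dataclass, field


@dataclass
class ValidationConfig:
    # Hypothetical stand-in for the validation-related config fields.
    validate_interval: int = field(
        default=1, metadata={"help": "validate every N epochs"}
    )
    validate_after_updates: int = field(
        default=0, metadata={"help": "don't validate until this many updates"}
    )
    # Proposed companion option (assumed name and semantics).
    validate_after_epochs: int = field(
        default=0, metadata={"help": "don't validate until this many epochs"}
    )


def should_validate(cfg, epoch, num_updates, end_of_epoch):
    """Return True if end-of-epoch validation should run (simplified sketch)."""
    # Validation is due only at an epoch boundary that matches the interval.
    due = end_of_epoch and epoch % cfg.validate_interval == 0
    return (
        due
        and num_updates >= cfg.validate_after_updates  # existing delay, in updates
        and epoch >= cfg.validate_after_epochs  # proposed delay, in epochs
    )


cfg = ValidationConfig(validate_after_epochs=3)
# Assume 100 updates per epoch purely for illustration.
schedule = [should_validate(cfg, e, e * 100, end_of_epoch=True) for e in range(1, 6)]
print(schedule)
```

With the default of 0, the extra condition is always true, so existing configurations would behave as before.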

Alternatives

The workaround is either to estimate how many batches are in an epoch, or to start a task, let it run for one update so that you can see batches-per-epoch, then restart it with the correct value set.
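The estimation step in this workaround amounts to a small calculation, sketched below. The helper name `estimate_validate_after_updates` is hypothetical, and the sketch assumes that one optimizer update consumes `update_freq` batches and that `batches_per_epoch` is the per-worker batch count observed from a trial run.

```python
import math


def estimate_validate_after_updates(batches_per_epoch, update_freq, epochs):
    """Estimate the update count that corresponds to `epochs` full epochs."""
    # Assumption: each update accumulates gradients over `update_freq` batches,
    # and a trailing partial group of batches still counts as one update.
    updates_per_epoch = math.ceil(batches_per_epoch / update_freq)
    return updates_per_epoch * epochs


# Example: 1000 batches per epoch, gradient accumulation of 4, wait 3 epochs.
estimate = estimate_validate_after_updates(1000, 4, 3)
print(estimate)
```

The result is the value one would pass to --validate-after-updates, and it has to be recomputed whenever the dataset size, batch size, or accumulation setting changes, which is the inconvenience the proposed flag avoids.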

Additional context

I already have a PR prepared for this (it's like a 5 line change), but my understanding is that things like this need to be approved via issues first.
