facebookresearch/fairseq
Invalid suffix of raw dataset when preprocessing without language
Open
#1426 opened on November 25, 2019
bug, help wanted
Description
When preprocessing with --dataset-impl raw and neither a source nor a target language is specified, the datasets are stored under train.None-None because of this line:
https://github.com/pytorch/fairseq/blob/5349052aae4ec1350822c894fbb6be350dff61a0/preprocess.py#L218
Is this expected behavior or can we remove this suffix?
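
For illustration, here is a minimal sketch of how such a suffix can arise. It assumes the linked line builds the output name by formatting the two language arguments into the prefix. The function names below are hypothetical and are not taken from fairseq. The guarded variant is only one possible way to drop the suffix, not a confirmed fix.

```python
def suffixed_prefix(output_prefix, source_lang=None, target_lang=None):
    # Assumed pattern: str.format renders a None argument as the text "None",
    # so unset languages end up in the file name.
    return output_prefix + ".{}-{}".format(source_lang, target_lang)


def guarded_prefix(output_prefix, source_lang=None, target_lang=None):
    # Hypothetical alternative: append the language pair only when both are set.
    if source_lang is None or target_lang is None:
        return output_prefix
    return "{}.{}-{}".format(output_prefix, source_lang, target_lang)


print(suffixed_prefix("train"))             # train.None-None
print(guarded_prefix("train"))              # train
print(guarded_prefix("train", "de", "en"))  # train.de-en
```

Whether dropping the suffix is safe depends on how the raw dataset files are looked up at load time, which this sketch does not cover.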