key_padding_mask is not used in transformer decoder layer? · facebookresearch/fairseq#537

(1 comment) (0 reactions) (0 assignees) Python (6,224 forks)

bug, help wanted

Repository metrics

Stars: (29,107 stars)
PR merge metrics: (no PRs merged in the last 30 days)

Description

When reading the source code, I found that key_padding_mask is not used when computing self-attention in the transformer decoder layer. This causes no problem when the target is right-padded (the default), because the causal attn_mask already keeps real tokens from attending to the padding positions that follow them. But what about left padding on the target?
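The concern can be illustrated with a small PyTorch sketch. It is not fairseq's code: `PAD`, `causal_mask`, `count_leaks`, and `combined_mask` are hypothetical names used only for this example, and the masks are plain boolean tensors where `True` means "blocked". It counts how many (query, key) pairs let a real token attend to a padding key, first with the causal mask alone and then with the padding mask folded in.

```python
import torch

PAD = 0  # hypothetical padding index, chosen for this example only


def causal_mask(t):
    # (T, T) boolean mask; True where the key lies in the future of the query.
    return torch.triu(torch.ones(t, t, dtype=torch.bool), diagonal=1)


def count_leaks(tokens, blocked):
    # Number of (query, key) pairs where a real query may attend to a pad key.
    is_pad = tokens.eq(PAD)                       # (B, T)
    real_query = (~is_pad).unsqueeze(2)           # (B, T, 1)
    pad_key = is_pad.unsqueeze(1)                 # (B, 1, T)
    return int((~blocked & real_query & pad_key).sum())


def combined_mask(tokens):
    # Causal mask OR key padding mask, broadcast to (B, T, T).
    t = tokens.size(1)
    blocked = causal_mask(t).unsqueeze(0) | tokens.eq(PAD).unsqueeze(1)
    # Assumption of this sketch: keep the diagonal open so that rows belonging
    # to pad queries are never fully masked (an all -inf row is NaN after softmax).
    return blocked & ~torch.eye(t, dtype=torch.bool).unsqueeze(0)


right = torch.tensor([[5, 6, 7, PAD, PAD]])   # right-padded target
left = torch.tensor([[PAD, PAD, 5, 6, 7]])    # left-padded target

print(count_leaks(right, causal_mask(5)))     # 0: pads are all in the future
print(count_leaks(left, causal_mask(5)))      # 6: 3 real queries x 2 pad keys
print(count_leaks(left, combined_mask(left))) # 0: padding mask closes the gap
```

With right padding the causal mask is enough, which matches the observation above. With left padding the pad keys sit in the past of every real token, so they stay visible unless a key padding mask is applied as well.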

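Contributor guide

Investigation approach: Examine the source code of the transformer decoder layer, confirm whether the key padding mask is ignored in self-attention, and propose a fix that takes left-padded targets into account.
Tech stack: Python
Area: backend, machine learning
Issue type: Bug
Difficulty: 3
Estimated time: 1-3 hours
Activity: Active
Clarity: Needs investigation
Prerequisites: Python, PyTorch, Transformer architecture
Beginner friendliness: 45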
