enhancement, help wanted
Description
Similar to ParallelEncoder, a ParallelDecoder could enable multi-task learning. This should not be too hard to implement, but a few details need care:
- support separate values for the decoding parameters (beam_width, length_penalty, etc.) for each decoder,
- parts of SequenceToSequence assume a single output head (e.g. loss computation, reverse vocabulary lookup, and exported outputs for model serving); this logic should be moved into the decoder itself.
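To make the two points above concrete, here is a minimal, framework-free sketch of one possible shape for this. It is not the OpenNMT-tf API: `DecodingParams`, `DecoderHead`, `ParallelDecoder`, and every method name are hypothetical, and the loss is a plain mean negative log-likelihood over already-normalized probabilities. The idea is that each head carries its own decoding parameters, loss, and reverse vocabulary lookup, so the model only has to combine what the heads report.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Sequence


@dataclass
class DecodingParams:
    """Per-head decoding parameters (hypothetical container)."""
    beam_width: int = 1
    length_penalty: float = 0.0


@dataclass
class DecoderHead:
    """One output head owning its vocabulary, loss, and decoding parameters."""
    name: str
    vocab: List[str]
    params: DecodingParams = field(default_factory=DecodingParams)

    def loss(self, probs: Sequence[Sequence[float]], target_ids: Sequence[int]) -> float:
        """Mean negative log-likelihood of the targets under per-step distributions."""
        nll = [-math.log(step[t]) for step, t in zip(probs, target_ids)]
        return sum(nll) / len(nll)

    def ids_to_tokens(self, ids: Sequence[int]) -> List[str]:
        """Reverse vocabulary lookup, local to this head."""
        return [self.vocab[i] for i in ids]


class ParallelDecoder:
    """Groups several heads; the model no longer assumes a single output."""

    def __init__(self, heads: Sequence[DecoderHead]):
        self.heads: Dict[str, DecoderHead] = {head.name: head for head in heads}

    def total_loss(
        self,
        probs: Dict[str, Sequence[Sequence[float]]],
        targets: Dict[str, Sequence[int]],
        weights: Optional[Dict[str, float]] = None,
    ) -> float:
        """Weighted sum of per-head losses (weights default to 1.0)."""
        weights = weights or {}
        return sum(
            weights.get(name, 1.0) * head.loss(probs[name], targets[name])
            for name, head in self.heads.items()
        )
```

With this layout, a translation head could use `DecodingParams(beam_width=4, length_penalty=0.6)` while a tagging head keeps the defaults, and exported serving outputs could be built by iterating over `decoder.heads` instead of being hard-coded for one head.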