ggml-org/whisper.cpp

max-len as a range or 'suggestion' to reduce dangling tokens

Open

Opened on Feb 27, 2023

(1 comment) (2 reactions) (0 assignees) C++ (49,693 stars) (5,535 forks)
Labels: enhancement, good first issue

Description

Suggestion: let the max-len argument accept some tolerance (plus or minus a few words/tokens), or a nargs range, instead of a hard int, so that the output files don't leave a single word/token dangling on its own timestamp.

Ex: Given the following output, it would probably be preferable for the 2nd and 5th timestamps (there/job) to appear with the previous timestamp, possibly hard-wrapped to a second line in the cue, rather than alone.

```
[00:06:23.160 --> 00:06:27.000] The other district where Pat was over
[00:06:27.000 --> 00:06:27.000] there,
[00:06:27.000 --> 00:06:28.380] also they know her.
[00:06:28.380 --> 00:06:31.680] So I think they were doing a very good
[00:06:31.680 --> 00:06:32.400] job.
[00:06:32.400 --> 00:06:35.750] - So only those two candidates were the
[00:06:35.750 --> 00:06:36.160] only ones
```
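To make the request concrete, here is a rough sketch of how a "soft" limit could work as a post-processing pass over already-split segments. Nothing here is whisper.cpp API: `Segment`, `merge_dangling`, `slack`, and `max_dangling_words` are hypothetical names made up for illustration, and the sketch assumes the limit is counted in characters (bytes) of segment text.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical segment record (NOT a whisper.cpp type): start/end in milliseconds plus text.
struct Segment {
    int64_t     t0_ms;
    int64_t     t1_ms;
    std::string text;
};

// Count whitespace-separated words in a string.
static std::size_t count_words(const std::string & s) {
    std::size_t n = 0;
    bool in_word = false;
    for (unsigned char c : s) {
        if (std::isspace(c)) {
            in_word = false;
        } else if (!in_word) {
            in_word = true;
            ++n;
        }
    }
    return n;
}

// Fold a "dangling" segment (at most max_dangling_words words) into the previous
// segment, but only if the merged text still fits within max_len + slack characters.
// max_len stays the target length; slack is the tolerated overshoot (an assumption
// of this sketch, not an existing option).
std::vector<Segment> merge_dangling(const std::vector<Segment> & in,
                                    std::size_t max_len,
                                    std::size_t slack,
                                    std::size_t max_dangling_words = 1) {
    std::vector<Segment> out;
    for (const Segment & seg : in) {
        if (!out.empty() && count_words(seg.text) <= max_dangling_words) {
            Segment & prev = out.back();
            const std::size_t merged_len = prev.text.size() + 1 + seg.text.size();
            if (merged_len <= max_len + slack) {
                prev.text += ' ';
                prev.text += seg.text;
                prev.t1_ms = seg.t1_ms; // extend the previous cue to cover the merged word
                continue;
            }
        }
        out.push_back(seg);
    }
    return out;
}
```

Run on the seven cues above with `max_len = 40` and `slack = 8`, this would fold "there," and "job." into the preceding cues and leave "only ones" alone (two words), giving five cues. With `slack = 0` nothing is merged, which matches the current hard-limit behavior. Hard-wrapping the merged text onto a second line in the cue would be a separate step.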
