facebookresearch/fairseq

Expand MMS: Request for Lezgi Language Support in ASR and TTS

Open

#5247 opened on Jul 9, 2023

Labels: enhancement, help wanted, needs triage

Description

🚀 Feature Request

I propose adding Lezgi language support to the MMS models for ASR and TTS.

Motivation

Lezgi is missing from the current list of supported languages. It is spoken by more than 800,000 people, primarily in southern Dagestan and northern Azerbaijan. Including it would make your models more comprehensive and accessible to a wider audience. A significant amount of digital content is also available in Lezgi, so adding it would benefit data diversity and quality.

Pitch

The current MMS models for ASR and TTS support over 1,000 languages, which is remarkable. Adding Lezgi would further enrich this diverse set of languages and serve the linguistic needs of the Lezgi-speaking community, contributing to a more inclusive digital space. It would also aid digitization efforts and computational linguistics research involving the Lezgi language.

Several applications and websites focused on the Lezgi language could use your tool if Lezgi support were added, extending the reach and utility of your models.
