kyutai-labs Repositories

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.

Last commit May 16, 2026

(10,347 stars) (966 forks) (0 indexed issues) (0 open good first issues)

Get fresh easy issues in your inbox.