Guide
Fine-Tune a Speech-to-Text Model for Any Language - Including Yours
A step-by-step developer tutorial from Kostis at Mozilla Data Collective
AI & Data Engineer @ Mozilla Data Collective
Guide
Overcoming the complexity of AI
Mozilla Data Collective helps communities to offer unique, multilingual, multicultural, and multimodal datasets. From transcribed and translated videos of narrated Ekpeye folktales to complex question-answering text pairs for the Georgian language, the diversity of datasets on our platform is core to our mission. But with
Common Voice
Firstly, we’d like to thank you for your patience. After introducing Spontaneous Speech early in 2025, we released most locale datasets when the Mozilla Data Collective platform launched in alpha in September of this year. However, upon inspection, the English Spontaneous Speech dataset required some remedial work prior to
Docs
Interested in joining the movement and publishing your dataset on Mozilla Data Collective? This guide will walk you through the steps required, from account creation to submission!