News

MDC Release Notes - 30.03.26

379 new datasets with a Mozilla Common Voice update, improvements to the Python SDK (make sure you update to the latest!) and a preview of an upcoming feature. 👀

Mozilla Data Collective

30 Mar 2026 — 10 min read

Hello, Mozilla Data Collective! 👋

It's been an exciting and busy time over here as we've gotten a few ✨major ✨ updates coming your way next month. Last week, we landed a core piece of work into the code base that will unlock our first conditional access feature - individual access gating. This feature will allow uploaders to have more direct control over who is allowed to download their datasets. We've also been iterating on our Python SDK as we prepare to get uploads and submissions supported in a programmatic pipeline.

New Features & Changes

Improvements and fixes related to uploading datasets, especially large ones
We fixed a bug where draft datasheets couldn't be saved until all required fields were entered
SO much stuff I want to share now but have to wait until they're publicly available 😉

New Datasets

Anjuman e Katib

Balochistan Educational and Cultural Organization

EELLAK - GreekFOSS

Institute of African Digital Humanities

Institute of Finno-Ugric/Uralic Studies, University of Hamburg

LocaleNLP

The Mozilla Data Collective is now on the AI @ Mozilla Discord Server. Join us for announcements, community events, and more!

Join us on Discord

Mozilla Common Voice

The Mozilla Common Voice team has released the Spontaneous Speech 3.0 datasets and Scripted Speech 25.0 datasets on Mozilla Data Collective. You can find all of the Common Voice datasets available on the Common Voice organization page.