Press Release

New capabilities expand uploader control over access and compensation, while helping developers discover more representative datasets

LONDON, MAY 19, 2026 - Mozilla Data Collective, a data platform redefining how AI data is created, shared, and governed, is entering its next phase as a standalone, mission-led entity based in the United Kingdom. Alongside this move, it is introducing three new platform capabilities: Request to Access, Data Assistant, and an upcoming Payments and Compensation feature. Together, these capabilities advance Mozilla Data Collective’s vision for human agency and fair value exchange in AI data.

A New Chapter for Mozilla Data Collective

Mozilla Data Collective was introduced in November 2025 after the first version of the platform for community datasets went live, grounded in the belief that those who create datasets should control how they are accessed and used. 

Now operating as a mission-locked British company, Mozilla Data Collective is the first social enterprise incubated by Mozilla Foundation, which has committed up to $10 million to support its development, with $5 million already deployed.

Since its soft launch, the platform has grown into a global community of uploaders and developers, with more than 190 vetted organizations sharing over 600 curated datasets across more than 300 languages. These include Hazaragi literature from Afghanistan, oral histories in Mada from Cameroon, rescued government FactBooks, and Romansh newspapers from Switzerland. The datasets are already being used by thousands of public labs, journalists, researchers, and technology companies, including AI unicorns across the UK and Europe.

“There’s a false choice in AI right now that you have to choose between respecting data ownership and building high-quality technology,” said EM Lewis-Jong, Founder and CEO of Mozilla Data Collective. “Mozilla Data Collective is built to prove that’s not true. We’re giving communities real control over their data, including how it’s accessed, governed, and what they receive in return, supporting everything from traditional licensing to emerging models like data trusts. That’s what meaningful data sovereignty looks like in practice, and it makes it easier for developers to work with more representative datasets.”

New Platform Capabilities

Mozilla Data Collective is introducing three new capabilities that give uploaders more control over how their data is accessed, how value is exchanged, and how easily datasets can be discovered.

With Request to Access, uploaders can require downloaders to submit an access request before any dataset is made available. This allows uploaders to review who is requesting access and confirm alignment with their intended use, whether for research, education, or commercial applications. Downloads are only enabled once the uploader has approved the request.

The new Data Assistant simplifies dataset discovery by allowing developers to describe their needs in plain language, whether they are searching for existing datasets or looking for help sourcing new ones. In addition to surfacing relevant matches from Mozilla Data Collective’s growing curated collection, the assistant also allows developers to request datasets they may not be able to find elsewhere, particularly in underrepresented languages, regions, and modalities.

Mozilla Data Collective will also soon introduce Payments and Compensation, allowing uploaders to set their own pricing for dataset access. Downloaders will be able to pay for a license to use the data, and uploaders will receive 100 percent of the license fee directly. Mozilla Data Collective charges downloaders a separate 5 percent platform fee to cover infrastructure and support costs, while uploaders pay nothing to use the platform.
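As a rough illustration of the fee arithmetic described above, the sketch below works through one license purchase. It assumes the 5 percent platform fee is calculated on the uploader's license price and added on top of it, and that amounts are rounded to two decimal places; the announcement does not spell out either detail. The function and field names are hypothetical and do not reflect any Mozilla Data Collective API.

```python
from decimal import Decimal, ROUND_HALF_UP

# 5 percent platform fee, as stated in the announcement.
PLATFORM_FEE_RATE = Decimal("0.05")


def price_breakdown(license_fee: str) -> dict:
    """Split a purchase into what the uploader receives and what the downloader pays.

    Assumption: the platform fee is a percentage of the license fee,
    charged to the downloader in addition to that fee.
    """
    fee = Decimal(license_fee)
    # Assumption: round the platform fee to two decimal places.
    platform_fee = (fee * PLATFORM_FEE_RATE).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )
    return {
        "uploader_receives": fee,  # 100 percent of the license fee
        "platform_fee": platform_fee,  # paid by the downloader
        "downloader_pays": fee + platform_fee,
    }


if __name__ == "__main__":
    print(price_breakdown("200.00"))
```

Under these assumptions, a dataset licensed at 200.00 would cost the downloader 210.00 in total, with the full 200.00 going to the uploader.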

Together, these capabilities give uploaders more control over who can access their data, under what terms, and how value is exchanged.

These capabilities are rolling out in alpha, with Mozilla Data Collective welcoming feedback, requests, and suggestions from the community at support@mozilladatacollective.com.

About Mozilla Data Collective

Mozilla Data Collective is a mission-locked British social enterprise, backed and incubated by Mozilla Foundation, building the data platform for human agency and fair value exchange. Mozilla Data Collective enables communities, organizations, and individuals to share global cultural datasets on their own terms, while helping downloaders build more representative and culturally grounded technologies with data they cannot find anywhere else. Built by the team behind Mozilla’s Common Voice, the world’s largest open, public-participation speech dataset, Mozilla Data Collective already supports more than 190 organizations sharing over 600 datasets across more than 300 languages. Learn more at mozilladatacollective.com.

MEDIA CONTACT

Max Borges Agency for Mozilla Data Collective

mdc@maxborgesagency.com