About Mozilla Data Collective
What is Mozilla Data Collective’s mission?
We fight for a tech future that is multilingual, multicultural and multimodal. We think we all deserve a world where you don’t need to change how you speak or how you look in order to access technology, and we welcome technology’s promise as a connector and enabler: a tool to build, make and shape.
We think the right way to get there is by giving everyone, everywhere a data platform for human agency and fair value exchange. People should be able to choose where their datasets show up, and they should be able to define what it looks like to benefit, whether that’s swapping data for tool access or expertise, donating it to the public, or asking for fair compensation.
You can share openly, using existing licenses like Creative Commons, or you can build your own license. You can open up your datasets for everyone, or just for some types of downloaders. You can set custom constraints and ask for exchange, compensation or recognition. You can govern the dataset as an individual, a co-operative, a trust or something else. After all, it’s your dataset. The people who access your datasets are fully authenticated and held to legally binding contracts, and we have a number of dataset protection features.
What is the history of Mozilla Data Collective?
In 2025, our Founder and CEO, E.M. Lewis-Jong, was leading Common Voice (the world’s largest public participation speech dataset) at Mozilla Foundation, and was looking for a release platform that would give Common Voice communities more choice: choice of license, features that undergirded stronger control, and a radically anti-extractivist form of value exchange.
The team couldn’t find that platform, so in September we built Mozilla Data Collective. Common Voice was Mozilla Data Collective’s first community user, piloting the platform for its own datasets.
Along the road, we had met hundreds of fellow travellers with the same problem: trying to share their data on their own terms, in line with their own values. So in November, we opened up Mozilla Data Collective to close friends, partners and allies. As of April 2026, Mozilla Data Collective has 187 organisations vetted to share datasets on the platform. We work with organisations ranging from libraries, archives and museums to tech start-ups in health, education and language tech.
In April 2026, we spun out a UK entity dedicated to housing that work - Mozilla Data Collective. Common Voice continues to be stewarded by Mozilla Foundation.
What is the business model of Mozilla Data Collective?
Mozilla Data Collective is structured as a British company, incubated and backed by Mozilla Foundation, a non-profit that fights for alternative digital futures and makes good tech the norm. This gives it the flexibility of a company, with the mission lock of a non-profit. In order to support the Mozilla Data Collective community and organisations seeking to set their own value exchange options, including compensation, we needed to be able to transact payments and provide support services. We chose to establish this organisation in Europe, where data protection is the gold standard.
Our uploaders keep 100% of whatever they choose to charge (if they choose to charge; most of our datasets are open source). Our business model is to charge the downloader a modest 5% fee, which covers the costs of our storage, infrastructure and API maintenance. We will launch a premium subscription later in the year with more sophisticated tools to discover, curate and package datasets. It will be progressively priced depending on organisation size, with free options for many. That’s it! That’s the model.
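For readers who like to see the arithmetic, here is a minimal sketch of how that split could work out for a paid dataset. It is an illustration only, not the platform's actual billing code. It assumes the 5% fee is calculated on the uploader's listed price and added on top for the downloader, and that amounts are rounded to two decimal places. The names PLATFORM_FEE_RATE and price_breakdown are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

# The 5% downloader fee described above.
# Assumption: it is applied to the listed price and added on top.
PLATFORM_FEE_RATE = Decimal("0.05")


def price_breakdown(listed_price: str) -> dict:
    """Split a listed dataset price into what each party pays or receives.

    Hypothetical helper for illustration; rounding to two decimal places
    is an assumption, not a documented platform rule.
    """
    price = Decimal(listed_price)
    fee = (price * PLATFORM_FEE_RATE).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )
    return {
        "uploader_receives": price,      # uploaders keep 100% of their price
        "platform_fee": fee,             # charged to the downloader
        "downloader_pays": price + fee,  # listed price plus the fee
    }


if __name__ == "__main__":
    for name, amount in price_breakdown("200.00").items():
        print(f"{name}: {amount}")
```

Under these assumptions, a dataset listed at 200.00 would cost the downloader 210.00, with the full 200.00 going to the uploader. An open dataset with a listed price of 0 carries no fee at all.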
Being a mission-driven company is not just a necessity, it’s a conscious choice. We are a social enterprise, and proud to be so. In a world where grant funding can disappear in line with political realities, we want to be firmly self-sustaining and independent. We don’t want to cede the multilingual data space (in which we’ve been working since 2017) to dubious marketplace brokers who are gatekeeping data buyers in order to carve out hefty cuts for themselves, or to workforce vendors driving precarious gig work at knockdown prices.
We think that people deserve a universe of fair data exchange, of collective bargaining, where they’re in control. Our communities don’t give up their datasets, they invest them as a lever for change. Mozilla Data Collective is structured to give them that platform.
Why did Mozilla decide to invest philanthropic capital in the data space?
Mozilla Foundation's job is to spot the structural fights that will define whether technology serves everyone or just the few. We've been in the data space since 2017 with Common Voice, long before "AI" became a buzzword in every deck. And that’s why we know that for too long, the data economy has looked like a digital land grab: extractive, opaque, and frankly, lazy. We saw a market failure, where the people who actually create the value of AI were being treated as a resource to be mined rather than partners to be respected. We’re in the innovation business, not human fracking, so we wanted to resource an alternative. That became ever more urgent as we saw that whoever controls the data layer controls the AI future.
Mozilla Data Collective is our bet that you can build sustainable infrastructure for fair data value exchange, in a way that maximises innovation, no matter your geography or industry proximity. That was a problem worth solving with patient, mission-locked capital – that is, exactly the kind of bet that philanthropy should be making.
Mozilla Data Collective is a platform in the truest sense. It’s yours to stand on, and make of it what you will. Mozilla Data Collective works by allowing you to share your data, retain ownership of it, and control who uses it.
We imagine and create a better future where AI is built equitably and powered by the people. We do this by providing alternatives that challenge extractive data practices and place the power over how AI data is created and governed in the hands of the people.
Find Us Around the Web:
r/MozillaDataCollective - Reddit