How Open Licensing is Changing with AI: The NOODL License
Author: Alek Tarkowski
For the last twenty-five years, standardized open licenses were increasingly seen as a main tool for democratizing access to knowledge. Over less than a decade, a relatively narrow set of canonical choices emerged: the Creative Commons licensing stack (https://creativecommons.org/licenses/list.en), coupled with a few Open Data Commons licenses.
The open movement focused on promoting these standardized licensing tools as a means of ensuring access, with as little friction as possible, and with potentially great public value. Since the release of Creative Commons Zero in 2009, no major new licensing options have been designed. Further innovation in licensing did not occur, because it was not needed. And for at least a decade, open advocates had a sense that the work of developing sharing frameworks was finished.
Things changed several years ago. The reuse of 100 million openly licensed photographs for training early AI models was the first warning sign. Since then, the emergence of AI-related uses of open content has triggered a new wave of licensing innovation.
The Nwulite Obodo Open Data License (NOODL) is one such licensing experiment: a new license that combines open sharing with tiered use conditions. It is a license crafted by a group of African legal scholars to suit the needs of specific linguistic communities in Kenya. It is also an effort to address inequalities related to open sharing and content reuse that are increasingly visible and felt across Africa and the Global South.
The NOODL license should be of interest to all open practitioners, data governance experts and stewards of the commons, even though for now it is used to license just a single dataset: DhoNam, a speech corpus for Dholuo, an indigenous language from Kenya. It is important because it points to a shift in open licensing approaches. Instead of depending on a set of standardized licensing options, you can design your own license, community-centered and tailored to local needs. And in doing so, you can aim to ensure greater equity in a world where open sharing makes you increasingly prone to asymmetries of power.
Making open licensing equitable
The NOODL license, and other recent licensing experiments - such as the Responsible AI Licenses (RAIL) - can be a source of unease for proponents of open sharing. First, because they call into question the by now canonical assumption that standardized open sharing is the “one size fits all” approach for ensuring access to knowledge. And second, because in the name of equity and responsible use, they introduce various conditions and limitations upon open access and reuse. In doing so, they challenge the assumption that more openness is always better. Naturally, open licensing always included several optional limitations (specifically, the Non-Commercial and No Derivatives conditions), but many saw them as the results of problematic, unnecessary compromises.
The new wave of licensing, of which NOODL is part, goes further, by introducing much stricter limitations. In the case of NOODL, free reuse is available only for those people and entities located in developing countries. For those in the developed world (based on OECD criteria), use requires meeting additional obligations aimed at “benefit sharing”: various forms of reciprocity, including payments. Other licenses, such as RAIL, introduce strict limits on types of allowed use, for example banning military uses.
This is often met with the criticism that it is yet another enclosure of the commons, a shift from open principles towards proprietary approaches to intellectual property. This view might be correct if openness were the only principle that we, as supporters of the commons, should care about. But that should not be the case.
A different approach is proposed by the experts and communities that co-designed NOODL. It is an approach that balances openness with the need for equity and social justice. In the words of Dr. Melissa Omino, “It's not that they're saying they don't want to share or they don't want to be open. They're saying that the way open is existing right now is actually harming them.” And the reasons for that are power asymmetries and concentrations that are experienced not just by Dholuo speakers, but across the world.
There is a paradox to open sharing, as mechanisms that help challenge power are also prone to enabling its concentration. Dr. Chijioke Okorie, one of the NOODL legal experts, explains that open licensing appears fair, but enabling commercial reuse without any form of reciprocity reinforces “long-standing power imbalances between well-resourced actors in the Global North and under-resourced researchers on the continent”. The largest global companies are best positioned to benefit from open resources, and to use them to further consolidate power.
Sarah Pearson from Creative Commons recently noted that “We cannot respond by accepting these risks and harms as inherent and inevitable costs of public sharing knowledge”. When it comes to AI development, democratization must mean not just availability, but also broadening capacity to develop these systems. The aim of NOODL is not to share linguistic data with the world - but to create leverage, with which African communities can build their own AI tools, serving their own, local needs.
What does this mean for open sharing? The standardized open licensing stack is a huge achievement of the access to knowledge movement. It is the backbone of many sharing solutions, and remains suitable in many conditions - for example, for sharing data and resources by public institutions. But it should be seen as just one part of a bigger toolbox of commons-based tools. And by paying more attention to the idea of the commons, we can focus on the role of collective decision making, and the content of governance - which has been at the heart of the most successful free knowledge projects, like Wikipedia.
Recently, Trebor Scholz and Mark Esposito argued that we need a solidarity ecosystem for AI, which combines sharing with cooperative ownership and equity as a foundational principle. The NOODL license fits well within this alternative AI development stack. Both share the assumption that the knowledge layer should not just be open, but more importantly collectively governed, and thus community-centered.
This is an important reformulation of the purpose of open sharing, which may be controversial to some. It’s important to understand that NOODL is neither a proprietary nor a closed license. The tiered access model introduces limitations and reciprocal mechanisms to ensure that sharing is equitable, and beneficial to the data community.
The challenges ahead - and how to overcome them
Deployment of NOODL is not without challenges, which are mainly related to enforcement of additional licensing conditions. This is a major issue, and one shared by all other sharing mechanisms that aim to introduce conditionalities, limitations or forms of reciprocity. If a company violates the license terms, what recourse do communities have? This question is not unique to NOODL, but it is particularly acute for licenses centered on benefit-sharing with the community. For now, no obvious solutions have been identified. Instead, there is a sense of a “free for all” when it comes to use of openly shared content - as anything publicly available on the web gets scraped, often with disregard for any norms or conditions. This is “permissionless innovation” taken to the extreme. Any way forward will require stronger collective norms and enforcement - stewards of the DhoNam dataset will not be able to enforce any rules on their own.
Another major challenge relates to license incompatibility and the risk of fragmentation of the open ecosystem. Licenses tailored to the needs of specific communities lose the advantage of standardized sharing, which was so important for the scaling of open licensing. One could argue that NOODL, while focusing on the needs of the data community, paid less attention to the broader digital commons. Here, the solution lies in retaining some level of standardization of licensing options. Hopefully, reciprocal mechanisms proposed by NOODL can be adapted to other communities and contexts.
Both challenges – enforcement and integrity of the sharing ecosystem – point to the role of alternative sharing platforms, like Mozilla Data Collective. Open licensing frameworks were built ahead of the commons-based platforms that grew in the last two decades. Today, the commons is growing around platforms like Wikimedia, Mozilla Data Collective, HuggingFace, and various content repositories. The infrastructural capacity of these platforms is just as important as the legal power of licensing tools.
The authors of NOODL suggest the need to create an ecosystem that supports diverse licensing approaches - for example, a platform where the various licenses could be shared, explained, and explored by legal experts, data communities, and AI developers alike. Hopefully, the open sharing ecosystem will develop this capacity, which, while centralized, would enable broader experimentation with types of data governance that establish a global knowledge commons, while supporting local communities.