MIT Researchers Propose 'Learnrights' to Pay Creators for AI Training
The legal ambiguity surrounding artificial intelligence training data is creating a precarious situation for content creators. Three researchers from MIT Sloan, Cornell Law School, and George Washington University Law School have proposed a new intellectual property framework called "learnrights" to address the compensation gap.
Professor Thomas Malone of MIT Sloan first introduced the concept in 2023. In a more recent paper published in the Journal of Technology and Intellectual Property, Malone and co-authors Frank Pasquale and Andrew Ting outlined how learnrights could function legally, economically, and practically. The proposal would add a seventh exclusive right to the six already granted to creators under current copyright law.
According to the MIT Sloan article, the core mechanism is straightforward: copyright holders would gain the exclusive right to license their content for AI model training. Generative AI providers would need to obtain explicit legal licenses to train their models using copyrighted material, and creators would receive fair compensation for that use.
The researchers propose an opt-in program. Creators would register their copyrighted content, and AI companies could then obtain licenses directly from creators or through literary and artistic agents. Working with agents would benefit both sides—AI companies would negotiate with fewer entities, and agents would likely have a better sense of market value than individual creators.
Current U.S. copyright law lets creators charge fees to copy their work, but it doesn't prevent humans from learning from copyrighted material and producing different content. Such learning is generally considered "fair use." The problem is that copyright law didn't anticipate technologies like generative AI models, which can learn from massive amounts of content at a scale no human could ever hope to match.
Generative AI providers claim this training is fair use. But dozens of copyright owners have sued generative AI providers, alleging that the resulting output is a derivative work that infringes on their original copyright. Past cases involving Anthropic and Meta suggest fair use is flexible and contextual. None of these cases has completely resolved the legal issues, and no generative AI cases have reached the Supreme Court.
Reporting from Cornell Chronicle highlights the moral dimensions the researchers emphasize. From a utilitarian perspective, society benefits when creative work continues to be produced, and that requires maintaining incentives for humans to keep making it. From a rights-based standpoint, tech companies vigorously protect their own intellectual property while dismissing the value of the creative work that powers their models.
The researchers present three main arguments supporting compensation. First, if AI models produce high-quality content quickly and cheaply without compensating original creators, that will decrease creators' motivation to produce new content. This would reduce the volume of original work available to further improve AI models. "It would be unwise to risk such a decline in incentives for human expression," the researchers write.
Second, the researchers find it troubling that for-profit AI companies cry foul when others use their intellectual property—as was the case when U.S.-based AI firms accused China's DeepSeek of stealing from them—given that the same companies use copyrighted content without compensating its creators. The hypocrisy is palpable (though apparently not enough to stop the lawsuits).
Third, properly acknowledging how other works influenced one's own is the foundation of a thoughtful creative process. Uncredited and uncompensated use of others' work falls short of ethical standards and undermines what IP protection is supposed to mean.
Some argue that giving AI models free, broad access to content will lead to better-performing and more diverse AI systems. The authors contend that this isn't likely to work in the long term because it would erode the incentives for people to create new content in the future. They point to research suggesting that feeding models their own outputs over time can lead to "model collapse," reducing quality.
Under a learnright regime, companies building generative AI tools would license the right to learn from specific datasets—much as some already do with news archives or stock photo libraries. The authors say that market negotiations would naturally set fair rates, and that clearinghouses or collective licensing organizations could replicate successful models from the music industry.
The proposal arrives as lawmakers signal growing interest in regulating generative AI. A learnright offers a clear path for policymakers: a middle ground that neither bans training nor leaves creators uncompensated. The legal framework would not replace copyright but would add a new protection specifically addressing machine learning.
Whether AI companies will voluntarily adopt this framework before legislation forces them to remains uncertain. The technology sector has a long history of resisting new regulations until courts or Congress intervene. Creators hoping for compensation should probably prepare for a long wait rather than expecting immediate change.
Artūras Malašauskas is an AI Systems Integrator with 20+ years of production-grade web engineering experience. He has designed, shipped, and scaled enterprise Python/PHP systems for logistics, SaaS, and public-sector clients. For the past year, he has focused exclusively on AI integrations: deploying open-source LLMs, building generative media pipelines (image, audio, video), and engineering multi-agent workflows for real production environments. His standard: reproducibility, security, cost-efficient inference; no vaporware. He documents and evaluates emerging AI tooling, separating verified capabilities from marketing noise. Technical editor at: muza-ai.eu, ai-verslas.lt, ai-naujinos.lt