The code will look to strike a balance between copyright holders and generative AI firms so that both parties can benefit from the use of copyrighted material in training data.
Sebastian Klovig Skelton
Published: 17 Mar 2023 14:15
The UK government has committed to creating a code of practice for generative artificial intelligence (AI) companies to facilitate their access to copyrighted material, and following up with specific legislation if a satisfactory agreement cannot be reached between AI firms and those in creative sectors.
In July 2021, the government outlined its plans to create “pro-innovation” digital regulations, and has since taken this approach forward into various legislative proposals, including its Data Protection and Digital Information Bill and its plan to create a new framework for AI technologies.
In a review of how the proposed “pro-innovation” regulations can support emerging digital technologies, Patrick Vallance said the government should create a clear policy position on the relationship between intellectual property law and generative AI.
Since the start of 2023, a spate of legal challenges has been initiated against generative AI companies – including Stability AI (the company behind Stable Diffusion), Midjourney and the Microsoft-backed OpenAI – over alleged breaches of copyright law arising from their use of potentially protected material to train their models.
In his review, Vallance said that enabling generative AI companies in the UK to mine data, text and images would attract investment, support company formation and growth, and show international leadership.
“If the government’s aim is to promote an innovative AI industry in the UK, it should enable mining of available data, text, and images (the input) and utilise existing protections of copyright and IP law on the output of AI. There is an urgent need to prioritise practical solutions to the barriers faced by AI firms in accessing copyright and database materials,” the review said.
“To increase confidence and accessibility of protection to copyright holders of their content as permitted by law, we recommend that the government requires the IPO [Intellectual Property Office] to provide clearer guidance to AI firms as to their legal responsibilities, to coordinate intelligence on systematic copyright infringement by AI, and to encourage development of AI tools to help enforce IP rights.”
Responding to the review, the government said the IPO will be tasked with producing a code of practice by summer 2023, “which will provide guidance to support AI firms to access copyrighted work as an input to their models, whilst ensuring there are protections (e.g. labelling) on generated output to support right holders of copyrighted work”.
It added that the IPO will convene a group of AI firms and rights holders to “identify barriers faced by users of data mining techniques”, and that any AI firm which commits to the code “can expect to be able to have a reasonable licence offered by a rights holder in return”.
The IPO will also be tasked with coordinating intelligence on any systematic copyright infringement and encouraging the development of AI tools which assist with enforcement of the code.
The government further claimed that this would allow both the AI and creative sectors “to grow in partnership”, but said any potential code of practice may be followed up by legislation if it is not adopted, or if an agreement between the sectors cannot be reached.
On 16 March 2023, the US Copyright Office also published its own policy statement on generative AI and copyright, which noted that “public guidance is needed” because people are already trying to register copyrights for work containing AI-generated content.
However, it exclusively focuses on whether material produced by AI, where the “technology determines the expressive elements of its output”, can be protected by copyright, rather than the access of generative AI firms to others’ copyrighted material.
“In the case of works containing AI-generated material, the office will consider whether the AI contributions are the result of ‘mechanical reproduction’ or instead of an author’s ‘own original mental conception, to which [the author] gave visible form’,” it said. “The answer will depend on the circumstances, particularly how the AI tool operates and how it was used to create the final work. This is necessarily a case-by-case inquiry.”
It added that copyright applicants also have a duty to “disclose the inclusion of AI-generated content in a work submitted for registration”.
In their November 2022 book Chokepoint capitalism: How big tech and big content captured creative labor markets and how we’ll win them back, Rebecca Giblin and Cory Doctorow argue that while copyright is more prolific and profitable than ever, creators themselves are not necessarily receiving those profits.
“The sums creators get from media and tech companies aren’t determined by how durable or far-reaching copyright is – rather, they’re determined by the structure of the creative market,” wrote Doctorow in a blog post.
“The market is concentrated into monopolies. We have five big publishers, four big studios, three big labels, two big ad-tech companies, and one gargantuan ebook/audiobook company… Under these conditions, giving a creator more copyright is like giving a bullied schoolkid extra lunch money.”
He added that while the massive expansion of copyright over the past four decades has made the entertainment industry larger and more profitable, “the share of those profits going to creators has declined” both in real terms and proportionately.
“Some of the loudest calls for exclusive rights over ML [machine learning] training are coming not from workers, but from media and tech companies. We creative workers can’t afford to let corporations create this right,” he said.
“Turning every part of the creative process into ‘IP’ hasn’t made creators better off. All that it’s accomplished is to make it harder to create without taking terms from a giant corporation, whose terms inevitably include forcing you to trade all your IP away to them.”