The impact of AI on royalties

30/6/25

Since the launch of GPT-4o, which can generate images in the style of Studio Ghibli from simple text descriptions, social media has been buzzing. Around the world, users are reworking their photos in the Ghibli style, reflecting a genuine global craze.

But behind these AI-generated images lies a deeper, critical question for the publishing industry: What are these models trained on, and to whom are they accountable?

Training an AI on protected works, even just to reproduce a style, could constitute copyright infringement, with authors receiving no royalties in return.

This issue of AI in publishing is far from theoretical: legal proceedings are already underway. In late 2023, The New York Times filed a lawsuit against OpenAI and Microsoft for unauthorized use of its content. Others, like Le Monde and AFP, opted instead to sign licensing agreements with AI developers.

At stake is a legal and economic battle over control of textual data, the true raw material of artificial intelligence.

The European AI Regulation (AI Act), whose obligations for general-purpose AI models apply from August 2025, requires AI developers to be more transparent about their training data and to fully respect intellectual property rights.

In practice, however, large-scale enforcement remains unclear, and publishing professionals are asking key questions: What data is being used? What happens to a work once it’s absorbed into an AI model? And most importantly, how can we protect authors and ensure fair royalties in this new digital ecosystem?

💡 Want to learn more about AI applications in publishing? Explore our 2025 AI guide for publishers.

Do AI tools use data protected by copyright?

In a word: yes.

AI tools have been trained on data protected by copyright, as highlighted in this excerpt from OpenAI’s report presented to the UK House of Lords Communications and Digital Committee in 2024:

"Given that copyright today covers virtually all forms of human expression, it would be impossible to train today’s leading AI models without using royalty-protected material. Restricting training data to public domain books and illustrations created over a century ago might make for an interesting experiment—but it would not enable the development of AI systems that meet the needs of today’s citizens."

OpenAI Report

Copyright-protected data: a goldmine for AI tools

Data is a true treasure trove for artificial intelligence tools, and it plays a critical role in their performance. Without a steady stream of fresh or creative data, AI systems risk degrading, leading to biased or inaccurate outputs.

Is it legal?

Directive (EU) 2019/790 on Copyright and Related Rights in the Digital Single Market allows text and data mining (TDM) for scientific research (Article 3) and extends the exception to all other uses, including commercial ones such as AI training (Article 4), with no compensatory remuneration for rights holders, provided that an opt-out mechanism is respected.

💡 This directive was finalized before the rise of generative AI. At the time, no one could have foreseen the rapid emergence of technologies like ChatGPT (launched in late 2022) or the profound legal and economic disruptions they would bring.

Key Provisions of Directive (EU) 2019/790 of the European Parliament and of the Council of 17 April 2019

a) Article 15 (formerly Article 11), Neighbouring rights for press publishers

  • Establishes a neighbouring right for press publishers, granting them specific legal protection over their publications
  • Forces platforms such as Google News and Facebook to negotiate and pay publishers for the reuse of article excerpts

b) Article 17 (formerly Article 13), Increased platform liability

  • Holds platforms legally accountable for copyright-protected content uploaded by their users
  • They must obtain proper licenses or swiftly remove infringing content, at the risk of penalties

While debates often focus on the training data used in AI models, another, less visible but equally critical issue deserves the attention of publishing professionals: the content of the prompts users enter into these tools every day.

Prompt-related risks

It’s essential to exercise caution with the prompts you submit to AI tools.

What is a prompt?

A prompt is an instruction or question given to an artificial intelligence system to generate a response. It guides the AI in producing content—whether text, images, or code. The more precise the prompt, the more relevant the output.
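To make this concrete, here is a minimal sketch of a prompt being sent to a hosted model, assuming the OpenAI Python SDK (the `openai` package, version 1.x) and an example model name; adapt both to your own setup. The key point for what follows is that the entire prompt text, including anything confidential pasted into it, is transmitted to the provider.

```python
# Minimal sketch: sending a prompt to a hosted model with the OpenAI Python SDK.
# Assumes the `openai` package (v1.x) is installed and OPENAI_API_KEY is set;
# the model name below is only an example.
from openai import OpenAI

client = OpenAI()

prompt = (
    "Suggest three alternative titles for a historical novel set in 1920s Paris, "
    "aimed at a young-adult readership."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # example model name
    messages=[{"role": "user", "content": prompt}],
)

# The more precise the prompt, the more relevant the output. Note, however, that
# the full prompt text leaves your environment and is processed by the provider.
print(response.choices[0].message.content)
```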

Specific risks associated with prompts (a simple precaution is sketched after the list below):

  • Accidental disclosure of confidential information: for example, a team member might submit a prompt like, “How can we improve the strategic priorities of my publishing house in 2025?” If the request includes internal details, those could later be surfaced to other users. An employee at a competing publishing house might unknowingly receive responses derived from that sensitive data
  • Disclosure of manuscripts: in an attempt to edit a manuscript, a user might paste an entire unpublished work into the AI. Portions of this text could then be reused by the model in future outputs
  • Lack of control over storage: once a prompt is submitted, its storage location and duration are unclear. If the AI system experiences a security breach or data leak, that information could resurface
  • Malicious use of public AI tools: an external user might attempt to extract sensitive information that the AI previously learned through other prompts, by crafting clever queries or combining different requests
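One practical precaution, sketched below, is to mask obviously sensitive details before a prompt leaves your systems. The `redact` helper and its patterns are hypothetical and deliberately simplistic: they illustrate the idea, not a complete data-protection solution.

```python
import re

# Hypothetical, deliberately simple redaction helper: it masks a few obviously
# sensitive patterns before a prompt is sent to a public AI tool.
SENSITIVE_PATTERNS = [
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "[EMAIL]"),   # email addresses
    (re.compile(r"€\s?\d[\d\s.,]*\d"), "[AMOUNT]"),            # euro amounts
    (re.compile(r"\b\d{2}/\d{2}/\d{4}\b"), "[DATE]"),          # dates such as 01/09/2025
]

def redact(prompt: str) -> str:
    """Return the prompt with known sensitive patterns masked."""
    for pattern, placeholder in SENSITIVE_PATTERNS:
        prompt = pattern.sub(placeholder, prompt)
    return prompt

draft_prompt = (
    "Rewrite this line from our 2025 strategy: we will offer author@example.com "
    "an advance of €12 000, with a signing date of 01/09/2025."
)
print(redact(draft_prompt))
# -> the email address, the amount, and the date are masked before the prompt is sent anywhere
```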

How can I prevent my content from being used by AI?

Rights holders can nevertheless object to TDM under the directive by using the “opt-out”. The French decree of 23 June 2022 specifies that this objection can be expressed by simple means, for example via metadata or a website’s terms of use.

Put simply, if you do not use the opt-out, you are deemed not to object to the use of your content by AI models.

How can I display the opt-out to protect my content from AI use?

Here are some recommendations and templates that you can use and customize for inclusion in your website’s terms and conditions and technical metadata to prevent your content from being exploited by AI systems.
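As a technical starting point, here is a minimal sketch of how such an opt-out can be expressed in a site’s metadata, combining a robots.txt excerpt and a TDM Reservation Protocol (TDMRep) meta tag. The crawler names shown are those published by the main AI providers at the time of writing and change regularly, so verify them in each provider’s documentation; none of this replaces an explicit reservation of rights in your terms and conditions.

```
# robots.txt excerpt: asks known AI training crawlers not to collect the site's content.
# Crawler names are examples and evolve over time; check each provider's documentation.
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: ClaudeBot
Disallow: /
```

In addition, the TDM Reservation Protocol (a W3C community specification) lets you declare the opt-out in each page’s HTML head:

```html
<!-- TDMRep meta tag: signals, in machine-readable form, that text and data
     mining rights are reserved (the Article 4 opt-out) for this page. -->
<meta name="tdm-reservation" content="1">
```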

The limits of opt-out as a protection mechanism

The opt-out system presents three main limitations:

Lack of enforceability

It is difficult to verify whether an AI model actually respects the opt-out. There are no guarantees regarding the use of data that may already have been collected.

Presumption of authorization by default

The system reverses the burden of proof. Instead of requiring prior consent (opt-in), rights holders must proactively express their opposition to the use of their content.

Potential infringement of moral rights

Under French law, moral rights protect the integrity of a work and the identity of its author. If AI generates content that misrepresents an author's intentions or contradicts their values, it may constitute a violation of their moral rights—such as damage to their reputation or the distortion of their creation.

How can beneficiaries be remunerated in the age of AI tools?

Discussions are still ongoing to determine the value of the data used by AI models. According to preliminary studies, the potential remuneration mechanisms will depend on the degree of regulation adopted by European states.

The first remuneration mechanism to emerge organically has been the negotiation of contractual agreements, particularly in the form of content licenses. This is the path chosen by several major media players, including The Washington Post and Reddit. These bilateral partnerships offer a win–win solution: they allow publishers and institutions to monetize their content while providing AI developers with legal, direct, and high-quality access to the data needed to train their models.

Examples of strategic partnerships between publishers/news outlets and AI model providers for data access

Several major news organizations are now collaborating with AI giants. The Financial Times, Axel Springer, and News Corp have all entered into agreements with OpenAI.

In November 2024, HarperCollins reached a deal with a technology company, reportedly Microsoft, allowing the use of certain titles in training artificial intelligence (AI) models. The offer includes a fixed, non-negotiable payment of $2,500 per book for authors who agree to the terms, covering a three-year usage period.

The initiative sparked mixed reactions among authors, primarily due to concerns about the level of compensation offered.

Conversely, can AI-generated content be protected by copyright?

UK law, under Section 9(3) of the Copyright, Designs and Patents Act 1988, states that for computer-generated works (i.e. works with no human author), "the author shall be taken to be the person by whom the arrangements necessary for the creation of the work are undertaken."

In the context of AI-generated content, this might refer to the person who designed the system or input the prompt, but this attribution is debated, especially when the creative process is largely autonomous.

However, the protection is limited:

  • It lasts for 50 years from the end of the year in which the work was made, rather than the usual 70 years after the author's death.
  • The standard of originality is lower than in traditional copyright, but it remains unclear whether outputs from generative AI truly qualify.

In practice, fully AI-generated content, created without human artistic input, is unlikely to enjoy strong or enforceable copyright protection under UK law. Ownership and rights over such content remain a grey area and may require further legislative clarification.

In the United States, the Copyright Office revoked protection for the AI-generated illustrations in the comic book Zarya of the Dawn, on the grounds that only human-created works are eligible for copyright protection; the text written by the author and the arrangement of the work remained protected.

What should you do if AI-generated content resembles a protected work?

An AI tool might generate an image that closely resembles an existing poster or mimics the distinctive style of a specific artist. If a visual or textual work produced by AI strongly resembles a pre-existing creation, it may raise significant legal risks.

Here’s an example: we explicitly asked ChatGPT to generate a book cover in the style of Van Gogh. This type of request illustrates the broader issue of style imitation described above.


Lawsuits are already underway. Illustrators Sarah Andersen, Kelly McKernan, and Karla Ortiz have filed a lawsuit against Stability AI, alleging copyright infringement. They argue that their protected works were used without consent to train the generative AI system Stable Diffusion.

AI and royalties: a delicate balance yet to be achieved

The rapid rise of AI tools in publishing is disrupting long-established balances in the field of royalties. Generative models produce text and images from often protected content, without informing or compensating the original authors. Some publishers are engaging through well-defined partnerships, while others are turning to legal action. Legal uncertainty still prevails.

The European Union is attempting to address the issue with Directive (EU) 2019/790 and the AI Act, which aim to enforce greater transparency. Yet how these regulations will be applied in practice remains unclear. The opt-out mechanism, though useful, is imperfect: it relies on exclusion rather than consent, and lacks effective enforcement tools.

In this context, creative data is becoming a strategic asset. The challenge goes far beyond legal boundaries: it is also cultural, economic, and ethical. Authors, publishers, and institutions must rethink how they share and protect their content, at a time when AI is emerging as a new player in the creative ecosystem.

📘 Want to go further? Discover our practical tips and recommendations in our summary table of AI tools for publishing.

👋 Crealo, the royalty management software, helps you automate your royalty processes and generate tailored royalty statements. Get in touch, our team will be happy to support you in simplifying your royalty reporting.

Posted by
Apolline Perivier
