In a landmark decision that’s shaking up the tech and publishing industries, a U.S. federal judge has ruled that artificial intelligence (AI) models can be trained on copyrighted books, including pirated copies, at least under current copyright law. This controversial judgment has reignited debates over intellectual property, fair use, digital rights, and the responsibilities of tech companies building large language models (LLMs) such as the ones behind ChatGPT.

This ruling may well set the tone for future legal interpretations of how copyrighted materials are treated in the age of AI.


BACKGROUND: HOW DID WE GET HERE?

Over the past few years, AI models have exploded in popularity and usage. These models, including OpenAI’s GPT-4, Meta’s LLaMA, and Google’s Gemini, are trained on vast datasets that often include billions of pages scraped from the internet — some of which contain copyrighted works, academic journals, and even entire books uploaded without permission.

This became a major point of contention when a group of authors — including big names like Sarah Silverman, Paul Tremblay, and Mona Awad — sued OpenAI and Meta. Their argument? Their copyrighted books were illegally used as training data without consent, compensation, or credit.

In particular, the lawsuits targeted datasets such as Books3, a massive unauthorized collection of over 190,000 books, many of them still under copyright.


THE COURT’S DECISION

In June 2025, a federal judge in the U.S. District Court ruled in favor of the AI companies, stating that using copyrighted books — even ones pirated and hosted without permission — for training AI models qualifies as “fair use.”

The court acknowledged that the books were not distributed directly to users, nor were they republished or sold in any form. Instead, they were transformed through statistical learning and data processing into general language patterns.

According to the judge:

“The AI does not retain or reproduce the copyrighted material in a recognizable form. The usage is transformative in nature and serves a different purpose than the original works.”


AUTHORS REACT: BETRAYAL OR LEGAL OVERSIGHT?

Not surprisingly, the backlash from authors and publishers has been swift and intense.

The Authors Guild called the decision “a betrayal of creative labor.” Many fear that this ruling gives tech companies free rein to scrape and use their work without any compensation.

Well-known author Douglas Preston remarked:

“If a machine can digest my work without credit or payment and then compete with me by writing in my voice, how is that not theft?”


WHAT DOES THIS MEAN FOR AI DEVELOPERS?

On the flip side, AI researchers and developers see this as a huge win.

Training AI models requires massive datasets. If every work in those datasets had to be licensed, only the largest tech companies could afford to train them. In developers’ view, this decision clears the way for faster, more affordable innovation.

Tech legal expert Jennifer Stiles explained:

“It’s a balancing act between the rights of creators and the need for innovation. Courts are recognizing that AI doesn’t simply copy — it learns.”


THE FAIR USE ARGUMENT EXPLAINED

The court leaned heavily on the four factors of fair use under U.S. copyright law:

  1. Purpose and Character of the Use
    → The court found the AI’s use transformative, not duplicative.
  2. Nature of the Copyrighted Work
    → Books are creative works, which weighs against fair use, but the court found that the transformative use outweighed this.
  3. Amount and Substantiality
    → Entire books were used for training, but the AI didn’t reproduce them in its output.
  4. Effect on the Market
    → The court saw no clear evidence that the AI directly hurt book sales.

INTERNATIONAL IMPLICATIONS

Just because it’s legal in the U.S. doesn’t mean it’s legal everywhere.

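In Europe, the rules work differently. The EU’s text and data mining exception allows mining of lawfully accessible works for commercial purposes only where rights holders have not expressly reserved their rights, and the EU AI Act requires providers of general-purpose AI models to respect those opt-outs and to publish a summary of the content used for training.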

Other countries like Canada, Australia, and India may interpret this very differently, making global compliance a challenge.


WHAT HAPPENS NEXT?

This case will likely be appealed, and the issue could reach the U.S. Supreme Court.

In the meantime, here’s what to expect:

  • More lawsuits from authors and publishers
  • Voluntary licensing deals between AI companies and content creators
  • Government regulation that clearly defines boundaries for training data

CONCLUSION: A NEW ERA OF AI VS. COPYRIGHT?

This ruling could reshape how AI developers and rights holders approach content ownership.

It highlights a growing divide between creative industries and AI companies — one where courts are still figuring out how to balance innovation with protection.

The fight is far from over. But one thing is clear: the way we treat intellectual property is being rewritten by the power of machine learning.
