
OpenAI's Motion to Dismiss Copyright Claims Rejected by Judge (arstechnica.com)

(Saturday April 05, 2025 @11:34AM (EditorDavid) from the prompt-ruling dept.)


Is OpenAI's ChatGPT violating copyrights? The New York Times sued OpenAI in December 2023, but [1]Ars Technica summarizes OpenAI's response: the New York Times (or NYT) "should have known that ChatGPT was being trained on its articles... partly because of the newspaper's own reporting..."

> OpenAI pointed to a single November 2020 article, where the NYT reported that OpenAI was analyzing a trillion words on the Internet.

>

> But on Friday, [2]U.S. district judge Sidney Stein disagreed, denying OpenAI's motion to dismiss the NYT's copyright claims partly based on one NYT journalist's reporting. In his opinion, Stein confirmed that it's OpenAI's burden to prove that the NYT knew that ChatGPT would potentially violate its copyrights two years prior to its release in November 2022... And OpenAI's other argument — that it was "common knowledge" that ChatGPT was trained on NYT articles in 2020 based on other reporting — also failed for similar reasons...

>

> OpenAI may still be able to prove through discovery that the NYT knew that ChatGPT would have infringing outputs in 2020, Stein said. But at this early stage, dismissal is not appropriate, the judge concluded. The same logic follows in a related case from The Daily News, Stein ruled. Davida Brook, co-lead counsel for the NYT, suggested in a statement to Ars that the NYT counts Friday's ruling as a win. "We appreciate Judge Stein's careful consideration of these issues," Brook said. "As the opinion indicates, all of our copyright claims will continue against Microsoft and OpenAI for their widespread theft of millions of The Times's works, and we look forward to continuing to pursue them."

>

> The New York Times is also arguing that OpenAI contributes to ChatGPT users' infringement of its articles, and OpenAI lost its bid to dismiss that claim, too. The NYT argued that by training AI models on NYT works and training ChatGPT to deliver certain outputs, without the NYT's consent, OpenAI should be liable for users who manipulate ChatGPT to regurgitate content in order to skirt the NYT's paywalls... At this stage, Stein said that the NYT has "plausibly" alleged contributory infringement, showing, through more than 100 pages of examples of ChatGPT outputs and media reports demonstrating that ChatGPT could regurgitate portions of paywalled news articles, that OpenAI "possessed constructive, if not actual, knowledge of end-user infringement." Perhaps more troubling to OpenAI, the judge noted that "The Times even informed defendants 'that their tools infringed its copyrighted works,' supporting the inference that defendants possessed actual knowledge of infringement by end users."



[1] https://arstechnica.com/tech-policy/2025/04/judge-doesnt-buy-openai-argument-nyts-own-reporting-weakens-copyright-suit/

[2] https://cdn.arstechnica.net/wp-content/uploads/2025/04/NYT-v-OpenAI-Opinion-4-4-25.pdf



Is it copying their work though? (Score:3)

by Talon0ne ( 10115958 )

It seems to me like AI is just sort of 'ingesting' content, internalizing it, and building its world view based on it... just like a person would. No word-for-word copying is going on (otherwise the model would be many, many terabytes)... So IMHO this should be dismissed, plain and simple.

Re: (Score:2)

by Valgrus Thunderaxe ( 8769977 )

If I read the *same* material, freely available on the web, and use it to form some type of world-view or intellectually enrich myself, and then use that information to start a business, for example, is this somehow different? I'm not convinced this is the case.

Re: (Score:2)

by Retired Chemist ( 5039029 )

You are not making commercial use of their content. If you start a competing business using their content, you might well be sued.

Re: (Score:2)

by StormReaver ( 59959 )

OpenAI's defense at this point is geared around minimizing damages. They have already lost on infringement, and they know it. The Times has a very strong case, including willful infringement, so the only question is what the penalties will be. All that's happening now is going through the motions to the guilty verdict. The Times will have to screw up royally in order to lose.

Re: (Score:2)

by Talon0ne ( 10115958 )

So every news station that reports "The New York Times today said XYZ" should fall in the same bucket because they are doing the SAME THING... They are reading their content and parroting parts of it to their audience. How is this any different? If OpenAI cited the NYT, would that be better?

Re: (Score:2)

by Jeremi ( 14640 )

> So every news station that reports "The New York Times today said XYZ" should fall in the same bucket because they are doing the SAME THING...

That's an interesting analogy, but flawed because the news station is only referring to the New York Times article, not pretending that it's their own reporting.

A better analogy might be a person (or company) that buys a copy of the New York Times every day, then rewrites all the articles by hand into their own words, and publishes the rewritten articles as a "competing newspaper" at a lower price, because they didn't have to pay for any actual reporting or information gathering, only for rewriting. Dunno

Re: (Score:2)

by Valgrus Thunderaxe ( 8769977 )

What if I looked at a recipe published in the NYT and made a dish based on it at my hypothetical restaurant?

Or, I improved my property based on an article in Home and Garden, and then subsequently rented out the property?

All the arguments seem to be that someone else is making money that isn't them. That seems more like jealousy or a lack of imagination on the part of the first party.

Re: (Score:3)

by ChunderDownunder ( 709234 )

Set booby traps as a form of watermarking.

Companies will start deliberately seeding articles with fake news, grammatical oddities, made-up words, and other forms of digital subterfuge, much like dictionaries and mapmakers used to insert phantom content to detect copying.

Then when bots scrape your content, you can show the judge the fingerprinting you inserted.

You may inadvertently invent a whole new vocabulary but once you've draffered the April sneggleklergen, you're past the point of no return.
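
For illustration, here is a minimal sketch of the "booby trap" fingerprinting idea described in this comment, in Python. Everything in it (the seed_canary and find_canaries names, the token format) is hypothetical; it is a sketch of the general technique, not any real watermarking system.

    # Derive a unique nonsense token per article, publish it inside the text,
    # and later scan suspect output for registered tokens.
    import hashlib

    def seed_canary(article_text: str, article_id: str) -> tuple[str, str]:
        # Deterministic, unlikely-to-occur token derived from the article ID.
        token = "zx" + hashlib.sha256(article_id.encode()).hexdigest()[:10] + "qv"
        return article_text + " " + token, token

    def find_canaries(generated_text: str, registry: dict[str, str]) -> list[str]:
        # Return the IDs of articles whose canary tokens appear in the text.
        return [aid for aid, tok in registry.items() if tok in generated_text]

    # Usage: keep the registry private; a hit in scraped or model-generated
    # text suggests the seeded article was ingested.
    registry = {}
    seeded, token = seed_canary("Some original reporting...", "article-001")
    registry["article-001"] = token
    print(find_canaries("output containing " + token, registry))

Deriving the token from a hash of the article ID keeps it reproducible, so the publisher only has to keep the ID-to-token scheme secret rather than store every token.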

Re: (Score:2)

by Retired Chemist ( 5039029 )

That is what the lawsuit is about: what uses are permitted under copyright law and what are not. Since no one had heard of LLMs when the law was written, this is open to debate. And since, as far as I know, none of us are copyright lawyers, our opinions are basically of no value.

Lawyers do not understand LLMs (Score:2)

by ihadafivedigituid ( 8391795 )

> OpenAI should be liable for users who manipulate ChatGPT to regurgitate content in order to skirt the NYT's paywalls ...

LLMs are notoriously bad at verbatim retrieval, and the notion that someone would use ChatGPT or whatever to read the NYT is the stupidest thing I'm likely to read on an unusually stupid news day.

The New York Times is a drop, or at most a bucket, in the ocean of training material used to generate a vast soup of vectors. This is profoundly transformative: it's not a .zip file of NYT articles being published, it's the combined influence of a myriad of sources--including the New York Times. If this isn't fa

Re: (Score:2)

by StormReaver ( 59959 )

> Google won, and if there's any consistency at all then the LLM trainers will win too.

Google's Library Project and LLM trainers aren't even remotely similar, so it would be incredibly inconsistent for OpenAI to prevail. At the very least, the Library Project shows only a snippet of a book. It then points users to legitimate purchasing options rather than charging users for access to material for which Google has no legal rights. This does not infringe on the rights-holders' ability to monetize their rights. OpenAI, on the other hand, copies the entirety of such material, then directly charges for a
