News: 1765431966


India’s government wants to set prices for the content AI companies use to train models

(2025/12/11)


The government of India wants AI companies to pay for accessing content they use to train models, but only once they start producing revenue.

That idea emerged yesterday in a [1]working paper [PDF] prepared by a Committee on Generative Artificial Intelligence and Copyright formed by India’s Department of Promotion of Industry and Internal Trade.

The paper notes that developers of AI models mostly didn’t pay for copyrighted content, and recounts the global debate that followed about fair use exemptions to copyright law.

The Department concludes that free access to content – a “zero price license model” – is not appropriate because it “would undermine incentives for human creativity and could lead to long-term underproduction of human generated content.”

The Committee’s members also found “access to large volumes of data and high-quality data is crucial for AI development” but fear that efforts to license that content could lead to “long negotiations and high transaction costs [which] can hold back innovation, particularly for startups and MSMEs.”

It therefore proposed a hybrid model that has the following three elements:

AI developers receive a blanket license for the use of all lawfully accessed content for training purposes, without requiring individual negotiations.

Royalties become payable only upon commercialization of the AI tools, with rates set by a government-appointed committee. The rates would be subject to judicial review.

A centralized mechanism handles royalty collection and distribution, aiming to reduce transaction costs, provide legal certainty, and support equitable access for both large and small AI developers.

The paper even suggests a name for the royalty collection organization – The Copyright Royalties Collective for AI Training (CRCAT) – and recommends it be a nonprofit organized by associations of rightsholders. It also proposes the establishment of a “Works Database for AI training royalties” that would invite content creators to register their works in order to be eligible to receive royalties from CRCAT.

“By preserving the right of the copyright owners to receive royalties and administering it through a single umbrella organization made by the rightsholders and designated by the government, the model aims to provide an easy access to content for AI Developers for AI Training, simplify licensing procedures, reduce transaction costs, ensure fair compensation for rightsholders,” the paper states.
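To make the flow of money concrete, here is a minimal Python sketch of the hybrid model as described above. It is illustrative only: the paper leaves rate-setting to a government-appointed committee, and this article does not report a distribution formula. The 2 percent rate in the usage note, the pro rata split by number of registered works, and the names `Developer`, `royalty_due` and `distribute` are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Developer:
    name: str
    commercial_revenue: int  # revenue from commercialized AI tools, in rupees


def royalty_due(dev: Developer, rate_percent: int) -> int:
    """Royalty owed by one developer; nothing is due before commercialization."""
    if dev.commercial_revenue <= 0:
        return 0
    # rate_percent stands in for whatever rate the committee would set.
    return dev.commercial_revenue * rate_percent // 100


def distribute(pool: int, registered_works: dict[str, int]) -> dict[str, int]:
    """Split the pool pro rata by count of registered works (an assumed rule)."""
    total = sum(registered_works.values())
    if total == 0:
        return {}
    # Integer division keeps amounts in whole rupees; any remainder stays in the pool.
    return {holder: pool * count // total for holder, count in registered_works.items()}
```

Under these assumptions, a startup with no revenue pays nothing, a firm earning 1,000,000 rupees at a hypothetical 2 percent rate pays 20,000, and two rightsholders with three and one registered works would receive 15,000 and 5,000 respectively.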

[5]India has satisfied its supercomputing needs, but not its ambitions

[6]India's government targets Uber, Ola with plan to launch zero-commission rideshare platform

[7]Lawyer's 6-year-old son uses AI to build copyright infringement generator

[8]UK judge delivers a 'damp squib' in Getty AI training case, no clear precedent set

Precedents exist for the proposed arrangement. Several countries operate performing rights organizations that collect royalties from venues such as restaurants and retailers that play recorded music. Those royalties are pooled and disbursed to artists. Your correspondent is a member of an Australian scheme that charges royalties for reprints of news and other content and disburses them to creators who register their works. *

India’s government has declared the nation will become a world leader in all aspects of AI, an ambition that sees it take a mostly friendly attitude towards tech giants as they address the local market. Tech giants, however, continue to argue fiercely for the right to train their models without first paying for content – but are also doing deals that cover their ongoing operations.

India, however, poses a considerable challenge because the nation recognizes 22 scheduled languages, eight of which are spoken by over 50 million people, and has a huge and fragmented media and publishing ecosystem. This proposal may therefore go down well with Big Tech, if New Delhi makes royalty payments worth their while. ®

* I end up with a couple of hundred dollars a year but other journos I know – mostly those who work in finance media – have told me they can score thousands a year.



[1] https://www.dpiit.gov.in/static/uploads/2025/12/ff266bbeed10c48e3479c941484f3525.pdf

[5] https://www.theregister.com/2025/11/26/india_supercomputing_state_of/

[6] https://www.theregister.com/2025/12/04/bharat_taxi_india_challenges_uber/

[7] https://www.theregister.com/2025/12/03/ai_has_made_ip_violations/

[8] https://www.theregister.com/2025/11/04/uk_court_getty_stability_ai/



Royalties become payable only upon commercialization of the AI tools

heyrick

You grab it, you pay for it. Otherwise there's an obvious loophole where one company (that makes no revenue) does the data collection for another (that reckons it is worth billions and saves the good stuff for the paid tier).

Plus, this is very much an "after the horse has bolted" scenario.

Do we all have to register with all these schemes?

that one in the corner

> Your correspondent is a member of an Australian scheme that charges royalties for reprints of news and other content and disburses them to creators who register their works.

The LLM scrapers are indiscriminate and, as they don't track how much each of their billions of nadans was affected by each bit of input, can not possibly provide fair attribution for their final output*. So are all the humans going to be compensated for the scraping done *before* this scheme comes into place or is the compensation done by, ooh, percentage of the total material registered by this scheme that has your name on it?

NOTE: the Australian scheme mentioned is clearly aimed at explicit REPRINTS - that is, the item was found by some means (e.g. a search) and then explicitly taken as something to reprint, with the attribution clearly attached. Trying to apply that style of scheme to chat bot results has NOTHING whatsoever to do with LLM scraping of copyright works: it could be applied to the proactive bots that go out and pull another copy of the work, in which case it is not an "AI" issue but instead we are back to the old "Google is showing too much of my text in their search results" problem.

I'm all for compensating the creators, btw; it is just that schemes like this all seem to be working from the outside in and don't seem to match reality. But they'll make *some* payouts, especially to all the big players who are already rich enough to pay the lawyers, and because of that will allow the LLM operators to be able to say "no, no, we are paying, everything is fair now" whilst all the poor young artists are still left starving in their garrets, producing genuinely great stuff that won't even be recognised after their deaths because it has been hoovered up by the spiders before anyone else and the serial numbers filed off before you can blink.

PS

If the answer is "yes, of course you have to register with all these schemes to get your due compensation" how much is THAT going to cost all the aspiring new authors? Even if these databases are free to enter (so how are they funded? By taking too much of your compensation?) there are going to be lots of them - and nice people who'll be sure to get your work into all of them for a "reasonable" fee.

* even when paraphrasing, or just plain quoting, a recognisable chunk of text: if it spits out "Nobody expects the Spanish Inquisition" did it get that by reading Monty Python scripts or from ingesting El Reg comments over the years? What if the LLM is found to always mention vultures whenever it produces that quote? BTW, the use here of a trivially small quote is simply to avoid (1) having to make this comment stupidly long and (2) to try to get the reader to ponder how often even quite large chunks of material are found outside of their original piece without attribution "because we all know where it came from" even out of context; imagine how much of a work you can ingest just by reading a fan site** that assumes you know the context... So no responses like "that is too short to be worth paying for".

** now you've got the idea, replace "fan site" with "professional platform for discussing serious subject".

Anonymous Coward

"India’s government has declared the nation will become a world leader in all aspects of AI"

should read

"India’s government has declared the nation will become a world consumer in all aspects of AI"

having just been pulled into a meeting to solve a problem where, during screen sharing, all I could see was a so called 'SME' asking Co-Pilot what to do (and we're talking real basic stuff here). Not the first time I've seen this either.
