News: 0176929105


Wikimedia Drowning in AI Bot Traffic as Crawlers Consume 65% of Resources

(Friday April 04, 2025 @11:30PM (msmash) from the closer-look dept.)


Web crawlers collecting training data for AI models are [1]overwhelming Wikipedia's infrastructure, with bot traffic growing exponentially since early 2024, according to the Wikimedia Foundation. Data released April 1 shows that bandwidth for multimedia content has surged 50% since January, primarily from automated programs scraping Wikimedia Commons' 144 million openly licensed media files.

This unprecedented traffic is causing operational challenges for the non-profit. When Jimmy Carter [2]died in December 2024, his Wikipedia page received 2.8 million views in a day, while a 1.5-hour video of his 1980 presidential debate caused network traffic to double, resulting in slow page loads for some users.

Analysis shows 65% of the foundation's most resource-intensive traffic comes from bots, despite bots accounting for only 35% of total pageviews. The foundation's Site Reliability team now routinely blocks overwhelming crawler traffic to prevent service disruptions. "Our content is free, our infrastructure is not," the foundation said, announcing plans to establish sustainable boundaries for automated content consumption.



[1] https://diff.wikimedia.org/2025/04/01/how-crawlers-impact-the-operations-of-the-wikimedia-projects/

[2] https://news.slashdot.org/story/24/12/30/0251249/when-jimmy-carter-spoke-at-a-wireless-tradeshow



AI is the most gluttonous & antisocial thing (Score:2)

by rsilvergun ( 571051 )

I think the human race has ever come up with. It devours everything and it spits out garbage, with the idea being that we're all going to be forced to use it. I'm sure that the technology has its uses in scientific fields, but to the general consumer there is nothing but downsides. It's the end stage of the enshittification of the internet, maybe even the whole kit and caboodle of our civilization.

And we can do absolutely nothing to stop it because you're not allowed to question the unending growth of cor

Re: (Score:2)

by Mr. Dollar Ton ( 5495648 )

If you could spell "ppl" correctly, you'd know it means "people", not AI bots, you victim of the modern "AI" ejucation.

Dono (Score:3)

by dohzer ( 867770 )

Great. I'm about to get 400 extra popups and emails from Jimmy Wales asking for me to donate to Wikipedia now, aren't I?

Points to the end of the open internet (Score:2)

by Big Hairy Gorilla ( 9839972 )

To all you chaps who argue there is no difference between you reading a book and OpenAI scraping everything.

The speed and scale at which AI information harvesting is done is the difference between you reading a book and applying the knowledge, and AI renting that knowledge back to you for profit. Wikipedia subsidizes info harvesting, so they will have to close the door or go out of biz.

Re: (Score:2)

by Mr. Dollar Ton ( 5495648 )

Yes, yet another example of the "free market" and its capability of "self-regulating".

The invisible hand showing you the invisible finger before shoving it up your ass.

Paid for by your tax money.

Re: (Score:2)

by OngelooflijkHaribo ( 7706194 )

Yes, so they have automated something. At one point a man walked up with a screwdriver to screw in a car wheel; at this point, a machine does it far faster, but fundamentally it's the same procedure.

Man has managed to automate something yet again, that he may sit on his arse one extra hour per day and work less, for he enjoys sitting on his arse more than working in dangerous factories. I can't blame him.

Sell hard drives (Score:1)

by davidwr ( 791652 )

Sell hard drives full of the most-requested-by-bot traffic "at cost."

I say "at cost" to avoid the possible scandal/volunteer-boycott of "Wikimedia making money off of content."

Alternatively, set up a deal with a content-delivery-network where the content delivery network would charge a fee to the actual bot-masters to cover its costs, with Wikimedia gaining nothing but a reduced traffic load in exchange.

Commercial AI-bot-masters would very likely be willing to pay a reasonable fee to avoid the Wikimedia-imp

Nevermind Re:Sell hard drives (Score:1)

by davidwr ( 791652 )

It looks like the Foundation has this covered with their database-dump mechanism and mirrors run by outside volunteers.
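
To make the database-dump route concrete, here is a minimal sketch of reading pages from a locally downloaded dump instead of requesting them from the live site. It assumes the dump is a bz2-compressed MediaWiki XML export; the function names and the file name in the usage note are illustrative assumptions, not something the Foundation or this thread prescribes.

```python
import bz2
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace, e.g. '{http://...}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(stream):
    """Yield (title, text) for each <page> in a MediaWiki XML export stream."""
    title, text = None, ""
    # iterparse reads incrementally, so the whole dump is never parsed at once.
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            yield title, text
            title, text = None, ""
            elem.clear()  # drop the finished page's contents to limit memory use


def iter_dump(path):
    """Open a bz2-compressed dump at a (hypothetical) local path and stream its pages."""
    with bz2.open(path, "rb") as stream:
        yield from iter_pages(stream)
```

With a dump saved locally under an assumed name such as enwiki-latest-pages-articles.xml.bz2, looping over `iter_dump(path)` visits every page without sending a single request to Wikimedia's servers.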

Can someone explain to me... (Score:3)

by HotNeedleOfInquiry ( 598897 )

Why you would crawl the on-line copy of Wikipedia when you can download an image of it and crawl it locally?

Re: (Score:2)

by Mr. Dollar Ton ( 5495648 )

For the same reason that you would think brute force will help you build "AI".

More efficient? (Score:2)

by OngelooflijkHaribo ( 7706194 )

This makes me wonder how big the revolution will be once they find some way to perform similar training but with far less input. Some kind of new revolutionary model that can achieve the same while reading far less. It has to be doable, because right now artificial neural networks need vastly more input than humans do and still end up with far weaker reasoning skills, so maybe it's possible. Though to be fair, humans also need far more time to process their input to make it useful.

Would be inter

poor jimmy (Score:1)

by muntjac ( 805565 )

This has to be a precursor for Jimmy Wales to beg for money (even tho Wikipedia is funded easily already).

Download is only 105GB (Score:2)

by ihadafivedigituid ( 8391795 )

Why crawl? I have a local copy of Wikipedia running in [1]Kiwix [kiwix.org]. There must be torrents out there with the same export, though the stupid AI people would probably not seed.

[1] https://kiwix.org/

They could just ... (Score:2)

by PPH ( 736903 )

... learn a lesson from the time Wikipedia was being vandalized. Direct all detected AI bots to the article about [1]chickens [theregister.com].

[1] https://www.theregister.com/2006/11/09/wikipedia_chicken_controversy/

Angry White Men Getting Even More Angry (Score:2)

by thesjaakspoiler ( 4782965 )

Not only having to deal with people editing their pages, but now bots reading their beloved pages as well.
