Study Finds a Third of New Websites Are AI-Generated
- Reference: 0183000468
- News link: https://tech.slashdot.org/story/26/04/27/2123224/study-finds-a-third-of-new-websites-are-ai-generated
- Source link:
> Researchers working with data from the Internet Archive have [2]discovered that a third of websites created since 2022 are AI-generated. The team of researchers -- which includes people from Stanford, the Imperial College London, and the Internet Archive -- published their findings online in a paper titled "[3]The Impact of AI-Generated Text on the Internet." The research also found that all this AI-generated text is making the web more cheery and less verbose.
"The proliferation of AI-generated and AI-assisted text on the internet is feared to contribute to a degradation in semantic and stylistic diversity, factual accuracy, and other negative developments," the researchers write in the paper. "We find that by mid-2025, roughly 35% of newly published websites were classified as AI-generated or AI-assisted, up from zero before ChatGPT's launch in late 2022."
"I find the sheer speed of the AI takeover of the web quite staggering," Jonas Dolezal, an AI researcher at Stanford and co-author of the paper, told 404 Media. "After decades of humans shaping it, a significant portion of the internet has become defined by AI in just three years. We're witnessing, in my opinion, a major transformation of the digital landscape in a fraction of the time it took to build in the first place."
Maty Bohacek, a student researcher at Stanford and one of the co-authors of the paper, added: "As AI-generated content spreads, the challenge is finding a role for these models that doesn't just result in a sanitized, repetitive web," he said. "Rather than forcing models to be perfectly compliant and agreeable, allowing them to have a more distinct personality or 'friction' might help them act as a creative partner rather than a replacement for human voice."
[1] https://slashdot.org/~alternative_right
[2] https://www.404media.co/study-finds-a-third-of-new-websites-are-ai-generated/
[3] https://ai-on-the-internet.github.io/?ref=404media.co
Re: (Score:3)
"Just as bad when more than a 1/3 of the web was WordPress or whatever the CMS du jour is." You may be underestimating the sheer volume of AI slop being generated today.
Re: (Score:2)
Yeah, there's a lot of AI slop and it's a problem. But I am not going to pretend that most web content has ever been artisanal and well curated. Well, maybe back in the day when my GeoCities page for my dog was part of the most well regarded Webring. At least then Tom was still my friend.
Re: (Score:2)
I remember building a website in FrontPage 2000... and editing it in HTML (and hosting the website on a 200MHz Packard Bell with 32MB of RAM).
I learned coding the hard way... no classes or anything. By the time I got my graphing calculator (TI-83+), I flipped through the handbook and realized I already knew the code (BASICA, with math stuff tossed in), so I already knew its programming.
Small blogs by anyone don't add up to anything, neither do niche creators.
If someone well-known by everyone (in the viewi
Re: (Score:2)
That entirely depends on the relative value, even assuming that everyone who views it will assess it using similar mechanisms.
The academic blog scene is still a thing, for example, but it's predictably only relevant to people who know each other by name and communicate about their blogs face to face at conferences.
Re: (Score:2)
Um, that's not my argument. It's that when looking at "per site" statistics, the internet has always been a lot of low-effort content. For every specialist or niche creator website there's been a site for an abandoned hot dog stand in Toledo, an astrology-for-pets microblog, an abandoned Mastodon instance for squirrel breeders, or a fanfic journal with a tenuous grasp on how normal human relationships function.
Elevating good content has always been hard and the era of search engines obscured how much of the web
Re:Same as it ever was (Score:4, Informative)
WordPress is a framework for publishing web content. It's not really relevant here. You can publish slop using AI too. And the fact a website is WordPress-based does not mean the content is good or bad.
This is idiotic snobbery, and you should know better.
Re: (Score:2)
Of course using WordPress or even GeoCities doesn't mean the content is instantly bad, just easy to deploy. But the same goes for sites counted as AI-assisted, which would include a technical blog post by a bilingual person using an LLM to fix improper idiom use. Knowing that 48% was just one framework that enables the publishing of web content, a significant percentage of which is low-effort or even human slop, contextualizes the meaning of per-site statistics.
There's many wonderful WordPress sites, there were some g
Re: (Score:2)
So you're sayin... barriers to entry being lowered results in increased access to create slop?
There's some merit to that point.
So? Does someone have a problem with progress? (Score:2)
Somebody is writing things as if they expected something different to happen.
Yeah I am sure there are still people out there that hand code web sites.
Probably many more that use template-based tools that have been developed over years and they are familiar with them. Many of those.
For myself, if I started a new web site now I would use AI to do it because it is a better tool than any other. It is just a tool. You can get good or bad results from it like any other tool, depending on how skilled you
Re: (Score:2)
Did you even read the summary? I know we're on Slashdot though, so ... fair enough. It's not about hand-coding websites, it's about the *content*. Imagine that somebody wants to share something with the world via the internet. Now, this sharing is at 66% efficiency, with the added bonus that the content will be stolen, rehashed and added to the already crowded competition for user attention. It's a bit grim, imo.
Re: (Score:2)
Pirating content is one thing, generating random bullshit is something else.
Re: (Score:2)
Good? cut out the middle man. we can chat with a bot ALL the time and stop surfing the web. why bother to fake websites and fake loaded search results? just have the bot do everything directly! add in some randomized tone/attitude by topic for variety like multiple sources do... conspiracy crazy people and idiot big mouths can be replaced with hallucinations, possibly with a lower occurrence.
new chat. repeat. foobar.
My mistake (Score:2)
I thought people choosing to hide the truth (Fox News, Moms for Liberty [to restrict others' education]) would cause the idiocracy. It seems the idiocracy will come from corporations using AI to echo sane-washed propaganda to other AIs.
maybe a browser extension (Score:2)
Like uBlock Origin will add something to block the AI-generated websites. I ran across a few myself and noticed it too, after opening a link only to be presented with clickbait content.
Re: (Score:2)
uBlock Origin itself could do it. Just need someone to make a list that we can subscribe to that blocks all of the slop sites.
Up from zero? (Score:1)
We had GPT-2 and other ways to generate sloppy websites before 2022.
Re: (Score:2)
Not to mention the affiliate marketing SEO spammers had their own little AI-based respinners going back a little further, once the simple string-substitution respinners were getting clobbered in Google search results.
Summary suggestion isn't great (Score:2)
"Rather than forcing models to be perfectly compliant and agreeable, allowing them to have a more distinct personality or 'friction' might help them act as a creative partner rather than a replacement for human voice."
It would also make AI more difficult for humans to detect. Being able to spot AI is the reason that key parts of the internet are still usable and human trust in it hasn't been broken yet.
half my browsing is AI driven (Score:2)
If you don't feel like writing things, then I'm not going to read them myself. We will plug AI into AI as a big circular human centipede ouroboros.
I guess us humans will have to do things that don't involve mass media consumption.
R.I.P. late stage capitalism
Re: (Score:2)
Ambrose Bierce would have been proud to create your post. You cleverly expose both evil and weakness. The mentally/socially disabled who depend on or profit from *.ai consider you a major threat, crashing your KARMA vindictively.
Search results dominated by AI slop (Score:2)
The problem is not that there are a lot of AI slop pages being generated every day. The problem is that the search results you get back from Google are increasingly dominated by these AI slop pages. It's becoming difficult to avoid them and the misinformation they spew.
Re: (Score:2)
Go on Google News and start clicking links to stories. A growing fraction of them now read less like articles and more like a list of bullet points. I don't know for sure they're AI generated, but I suspect a lot are. This isn't random webpages. These are commercial sites that a few years ago would have been considered reputable news sources.
GIGO (Score:4, Insightful)
So AI will be training itself on stuff generated by itself, the ultimate self-licking ice cream cone. Reminds me of the game of putting a paragraph repeatedly through translation software back and forth between two languages until you got gibberish.
Re: (Score:2)
If they were all this funny I wouldn't complain.
[1]https://m.youtube.com/watch?v=... [youtube.com]
[1] https://m.youtube.com/watch?v=TLr1_vjdTgs&pp=ygUlVHdpc3RlZCB0cmFuc2xhdGlvbnMgcGxhZ3VlcyBvZiBlZ3lwdA%3D%3D&ra=m
This is a race to the bottom (Score:3)
The damage that it's going to do to the Internet, and to society, and to education, government, and all the other components of society, is staggering. An enormous amount of work done by dedicated people over decades will be swamped by the flood of AI slop, and I don't think we'll know what we've lost until it's gone.
Many readers of this site are likely familiar with various sci-fi stories that deal with nanobots which have begun reproducing without limit, eventually consuming all resources and reducing their planet to "gray goo". This is the information equivalent: it will expand to occupy everything that it possibly can, overwhelming everything generated by humans. And when that happens, it will impact our shared view of reality, which is based on a (mostly) common set of facts.
And when nothing is real, anything can be real. This will not escape the attention of would-be fascists and dictators.
"Less Verbose"? (Score:2)
I find the repetitive, flowery crap that claims to be a website today to be quite useless. Multiple sites on a subject have carbon copied content (or at least lead-ins). It is the quintessential enshittification of the web.
Curious what this brings us next...
So F*$king Slow (Score:2)
Is this why websites have all gotten so slow? They all need to use at least 100 libraries just to put up a blog page.
Re: (Score:1)
Not all.
I edit my business's webpage with vim. It's plain html and has only two graphics (stored in the same directory).
And I get people complimenting me on its design, which I find amazing.
Re: (Score:1)
Me too, well FreeBSD so I use vi. Besides my website is for me.
it's dead, jim (Score:2)
The www. The open web is finished. The www is for logged in services only now. The things you have to do, booking air tickets, banking, taxes.
This has got to be a joke (Score:2)
More than half of my search results are AI-generated trash. Half the time I just go to the AI answer and then its sources, because they're more likely to be on target. The web is quite fucked, and it's only getting worse.
What is the purpose of it all ? (Score:2)
Hosting a web site costs money, and using an AI to do something costs money. What is the reward for spending this money? I have a feeling that most of it is not what one would call good. Reasons that come to mind:
* Spreading a political/similar message or confusing someone else's message
* Generating clickbait to earn money from Google/...
* Persuading people to buy something which might be real or a scam
There must be more that I cannot think of now. What do you think?
Might be too late already! (Score:4, Insightful)
The Internet is being buried with AI-generated slop being created, indexed, summarized and regurgitated as even more AI slop, to be consumed by AI bots to generate even more AI slop, with anything really creative, innovative, informative and true being the needle in the proverbial haystack and effectively hidden.
Re: Might be too late already! (Score:3)
My website was king for 10 years. Now it's buried under 50 rudimentary AI clones churning out blog posts about how their software is better, whilst they blanket Reddit with new accounts promoting themselves and spreading misinformation about mine. Reddit seems to ban a lot but not all. Impossible to compete. Still have my core users but they're dwindling. Was a fun run.
Re: Might be too late already! (Score:1)
As someone who has been repeatedly banned by humans, can I cry you a river? When you blamed me for driving away users, did you see AI coming?