Perplexity vexed by Cloudflare's claims its bots are bad
- Reference: 1754425924
- News link: https://www.theregister.co.uk/2025/08/05/perplexity_vexed_by_cloudflares_claims/
- Source link:
"This controversy reveals that Cloudflare's systems are fundamentally inadequate for distinguishing between legitimate AI assistants and actual threats," Perplexity said in [1]a social media post published on Monday afternoon. "If you can't tell a helpful digital assistant from a malicious scraper, then you probably shouldn't be making decisions about what constitutes legitimate web traffic."
The dispute started earlier on Monday, when Cloudflare published a post claiming that [2]Perplexity has been disguising its web crawling bots by altering the user-agent identifier and by using unexpected IP address ranges to evade web application firewall blocking. It did not suggest the bots were malicious, merely that they were obfuscating their identity to avoid being blocked.
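The two signals Cloudflare describes, the user-agent string and the source IP range, can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the "ExampleBot" token, the address range (an RFC 5737 documentation network), and the `classify_request` helper are hypothetical and do not represent Cloudflare's detection logic or Perplexity's real crawler details.

```python
import ipaddress

# Hypothetical registry: a crawler token mapped to the IP ranges its operator
# publishes. 192.0.2.0/24 is a documentation network, not a real operator range.
DECLARED_BOTS = {
    "ExampleBot": [ipaddress.ip_network("192.0.2.0/24")],
}


def classify_request(user_agent: str, remote_ip: str) -> str:
    """Compare what a request claims to be with where it comes from."""
    addr = ipaddress.ip_address(remote_ip)
    for token, networks in DECLARED_BOTS.items():
        claims_bot = token.lower() in user_agent.lower()
        in_range = any(addr in net for net in networks)
        if claims_bot and in_range:
            return "declared"  # name and address agree
        if claims_bot:
            return "spoofed-name"  # uses the bot's name from an unlisted address
        if in_range:
            return "undeclared-from-known-range"  # known address, no bot name
    return "unknown"  # neither signal matches


if __name__ == "__main__":
    print(classify_request("Mozilla/5.0 (compatible; ExampleBot/1.0)", "192.0.2.10"))
    print(classify_request("Mozilla/5.0 (Windows NT 10.0)", "203.0.113.5"))
```

A request that sends a generic browser user-agent from an address outside the published ranges falls into the "unknown" bucket, which is why a check this simple would not catch the kind of behavior Cloudflare alleges.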
Cloudflare CEO Matthew Prince didn't use the term "malicious" either but came closer.
"Some supposedly 'reputable' AI companies act more like North Korean hackers," Prince [6]said on social media in reference to his company's post. "Time to name, shame, and hard block them."
Cloudflare, which provides network infrastructure services like hosting and security, argues that web publishers should be able to control the mechanisms used to access their content.
The issue is that automated page visits, whether from Perplexity or another AI service, don't generate ad impressions and revenue for publishers (assuming ad fraud systems function properly).
The rise of AI crawlers that answer search queries by summarizing content culled from websites without compensation has led to [8]a decline in search engine traffic referrals to websites and has thrown the web's dominant business model into question.
Perplexity's position is that because a human user entered a search query, the Perplexity bot fetching that information from a publisher's website should be treated as a human visitor rather than an automaton.
To support that claim, Perplexity argues that, when Google's search engine crawls a page to include in its web index, that's different from when Google Search fetches a webpage and presents [13]a preview of the content.
"When companies like Cloudflare mischaracterize user-driven AI assistants as malicious bots, they're arguing that any automated tool serving users should be suspect – a position that would criminalize email clients and web browsers, or any other service a would-be gatekeeper decided they don’t like," Perplexity said.
That's a disingenuous comparison, however. A search engine doesn't intend for a thumbnail image or text snippet to be a complete substitute for visiting the web page. If Perplexity's response answers a search query and obviates the need to visit the source webpage to obtain that answer, that's clearly a different scenario.
This is not the first time Perplexity has come under fire for bot behavior. In early June last year, Forbes accused the company of [15]ripping off news content, an allegation subsequently [16]investigated by AWS for terms of service compliance. Shortly thereafter, blogger and podcaster Robb Knight [17]accused Perplexity of lying about its user agent, an allegation supported by [18]a report from Wired.
Kingsley Uyi Idehen, founder and CEO of OpenLink Software, an AI-oriented middleware business, pushed back against Perplexity's claims.
Citing the company's contention that its bot is acting to fulfill a human user's query, he [19]said in response, "This lack of clarity around who is acting and on whose behalf isn’t a technical footnote – it’s a foundational gap at the center of growing concerns about LLM-based tools and AI agents."
Identity, authenticity, and accountability continue to be important for online interaction, Idehen argues. So if Perplexity is obfuscating its bots, as Cloudflare has claimed, that's a problem.
Craig DeWitt, founder of SkyFire, a payment network for AI-based services, told The Register in an interview that both Perplexity and Cloudflare are right in their own way, though he disagreed with Prince comparing Perplexity to North Korea.
"What Perplexity is right about is that the nature of the internet has changed with these AI interfaces," DeWitt said, citing [20]similar observations from Vercel CEO Guillermo Rauch that the internet must adapt to AI.
"The problem now is like it's no longer human directly to website, it's human through an intermediary to website," DeWitt said. "And the problem for the websites – this is talking on Cloudflare's side, now – is that the monetization model for a lot of these, in terms of servicing ads, goes away."
Websites, he said, do not want to deliver their content without attribution through AI service interfaces. But, he said, collaboration is the key because if websites start blocking everything, that hurts everyone.
Cloudflare and Perplexity did not respond to requests for comment. ®
[1] https://x.com/perplexity_ai/status/1952531537385456019
[2] https://www.theregister.com/2025/08/04/perplexity_ai_crawlers_accused_data_raids/
[6] https://x.com/eastdakota/status/1952379571527193017
[8] https://www.theregister.com/2025/06/22/ai_search_starves_publishers/
[13] https://developers.google.com/search/blog/2019/09/more-controls-on-search
[15] https://www.forbes.com/sites/sarahemerson/2024/06/07/buzzy-ai-search-engine-perplexity-is-directly-ripping-off-content-from-news-outlets/
[16] https://apnews.com/article/amazon-perplexity-online-content-scraping-investigation-eb1d6bc1caf3dfe0c97db6f4fc691bc9
[17] https://rknight.me/blog/perplexity-ai-is-lying-about-its-user-agent/
[18] https://www.wired.com/story/perplexity-is-a-bullshit-machine/
[19] https://x.com/kidehen/status/1952720256331391027
[20] https://x.com/rauchg/status/1952483023431291340
"If you can't tell a helpful digital assistant from a malicious scraper"
Then you should ban them all and Perplexity can go fuck itself.
Or Perplexity proves how it is not bad by showing its code.
Not gonna happen.
Re: "If you can't tell a helpful digital assistant from a malicious scraper"
100% agree.
But, he said, collaboration is the key because if websites start blocking everything, that hurts everyone.
Too bad. Block all AI if that's what the website owner wants, they owe the AI scrapers NOTHING contrary to what the AI companies want to believe in order to enable their own greedy goals.
Re: "If you can't tell a helpful digital assistant from a malicious scraper"
It doesn't hurt everyone anyway, it hurts AI providers. And they're not even trying to collaborate, they just call anyone who doesn't give them everything they want a luddite.
Pot......Meet Kettle.........
(1) Perplexity scrapes web site
(2) .....after web site data aggregator has scraped EVERYONE else's data.
Hard to choose really!!!
Are we about to find out....
....that it's Perplexity who sub'd out AI scraping to a bunch of set top boxes and TV sticks in the developing world to do their AI crawling - while using false user agents and pretending to be Firefox 68 on Windows CE coming from 100 different IP ranges from twelve different countries at a rate of hundreds of hits per second, per site?
https://www.mythic-beasts.com/blog/2025/04/01/abusive-ai-web-crawlers-get-off-my-lawn/
Which is why an entire new class of bot-stopping software has been developed in just the last four months?
Honestly, given Perplexity's previous behaviour, I'd not be shocked.
Steven R
Re: Are we about to find out....
Or free VPNs like Hola which offer everyone's residential IP to everyone else [1]and can be co-opted by botnets or maybe even allow paid-for access to IPs on their VPN to commercial customers.
[1] https://www.theregister.com/2015/05/29/hola_vpn_used_8chan_takedown_botnet_or_not/?page=2
I have an idea. Let Cloudflare pay to have Mandiant (or similar) examine Perplexity's change logs from the date of Cloudflare's original announcement of this 'feature'. I'm 99% positive that Perplexity would balk at this idea.