05.08.2025 01:04

Cloudflare Exposes Perplexity’s Shady Indexing Tactics After Customer Complaints

Following customer complaints, Cloudflare investigated Perplexity and found that the AI search engine is flouting long-established web crawling norms.

Even when websites explicitly prohibit crawling via robots.txt and block Perplexity’s declared bots, the company keeps scraping their content with crawlers disguised as regular Chrome browsers.
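To make the robots.txt mechanism concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `example.com` URL, and the `is_allowed` helper are illustrative assumptions, not taken from Cloudflare's report. It shows why the protocol relies on good faith: a rule addressed to a named bot stops applying as soon as the client announces a different user agent.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example site (illustrative only):
# the named bot is disallowed everywhere, everyone else is allowed.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/article"
    # A compliant crawler asks this question before every fetch.
    print(is_allowed("PerplexityBot", url))
    # The same check with a browser-like user agent falls through to "*".
    print(is_allowed("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)", url))
```

The check runs entirely on the crawler's side, so robots.txt cannot enforce anything by itself. Enforcement has to happen at the server or the network edge.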

The technique is a two-step process: first, Perplexity sends its declared crawler, PerplexityBot. If that is blocked, it switches to Plan B and presents a generic browser user agent starting with "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)" to pass as an ordinary browser. To further evade detection, Perplexity rotates IP addresses across different networks and even different autonomous systems, making its activity harder to trace.
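The two-step pattern described above can be illustrated with a toy log heuristic. This is not Cloudflare's detection method, which the article does not describe. The `Request` record, the 403 status as the "blocked" signal, the 60-second window, and the treatment of the quoted user agent as a prefix are all assumptions made for the example. Real browsers send the same prefix, so a match is only a hint, never proof.

```python
from dataclasses import dataclass

DECLARED_BOT_TOKEN = "PerplexityBot"
# User-agent text quoted in the article, treated here as a prefix (assumption).
GENERIC_UA_PREFIX = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)"


@dataclass(frozen=True)
class Request:
    """Hypothetical access-log record; field names are illustrative."""
    timestamp: float  # seconds since an arbitrary epoch
    ip: str
    user_agent: str
    path: str
    status: int


def suspicious_followups(log, window_seconds=60.0):
    """Flag generic-browser requests for a path made shortly after the
    declared bot was refused (HTTP 403) on that same path."""
    blocked_at = {}  # path -> time of the latest refused declared-bot request
    flagged = []
    for req in sorted(log, key=lambda r: r.timestamp):
        if DECLARED_BOT_TOKEN in req.user_agent:
            if req.status == 403:
                blocked_at[req.path] = req.timestamp
            continue
        last_block = blocked_at.get(req.path)
        if (
            last_block is not None
            and req.user_agent.startswith(GENERIC_UA_PREFIX)
            and req.timestamp - last_block <= window_seconds
        ):
            flagged.append(req)
    return flagged
```

The heuristic deliberately ignores the source IP, because the article says the addresses rotate across networks. Correlating on path and timing is one simple alternative, though it would produce false positives on a busy site.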

This blatant violation of the Robots Exclusion Protocol, outlined in RFC 9309 and considered a cornerstone of web ethics, marks a bold departure from industry norms.

Historically, companies caught breaching these standards have faced public backlash, issued apologies, and ceased the practice. Yet Perplexity appears determined to embrace the role of the enfant terrible among AI and search platforms.


In response, Cloudflare has removed Perplexity from its verified bots list and added rules that block the disguised crawlers. These rules are available to all customers, including those on free plans. However, as Cloudflare notes in its announcement on August 4, 2025, at 11:59 PM CEST, this exposure will likely prompt Perplexity to adapt its tactics, kicking off a new round of cat-and-mouse between the two. The internet’s ethical boundaries are once again being tested.

