Cloudflare has conducted an investigation into Perplexity following customer complaints, uncovering that the AI search engine is flouting long-established internet indexing standards.
Even when websites explicitly prohibit crawling via robots.txt and block Perplexity’s official bots, the company continues to scrape their content using crawlers disguised as regular Chrome browsers.
The technique has two steps. First, Perplexity sends its official PerplexityBot. If that crawler is blocked, it switches to Plan B and adopts the user agent "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)" to mimic a standard browser. To further evade detection, Perplexity rotates its IP addresses across different networks and even different autonomous systems, making its activity harder to trace.
This blatant violation of the Robots Exclusion Protocol, outlined in RFC 9309 and considered a cornerstone of web ethics, marks a bold departure from industry norms.
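To see why a disguised user agent matters, consider how robots.txt rules are matched. The sketch below is illustrative only: it uses Python's standard `urllib.robotparser`, and the robots.txt content, the `is_allowed` helper, and the `example.com` URL are assumptions made up for this example, not anything taken from Cloudflare's report.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a site that bans PerplexityBot
# but lets every other crawler in.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines; no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/article"
    # The declared crawler matches the PerplexityBot group and is refused.
    print(is_allowed("PerplexityBot", url))
    # A generic browser string matches only the wildcard group.
    print(is_allowed("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)", url))
```

Because the rules are keyed on the name a client announces, the same request sent under a generic browser string falls through to the wildcard group. The protocol relies on crawlers identifying themselves honestly, which is why the behavior described above defeats the site owner's stated intent.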
Historically, companies caught breaching these standards have faced public backlash, issued apologies, and ceased the practice. Yet Perplexity appears determined to embrace the role of the enfant terrible among AI and search platforms.
In response, Cloudflare has removed Perplexity from its verified bot list and added blocking rules, which are now available to all customers, including those on free plans. However, as Cloudflare notes in its announcement of August 4, 2025, at 11:59 PM CEST, this exposure will likely prompt Perplexity to adapt its tactics, kicking off a new round of cat-and-mouse between the two. The internet’s ethical boundaries are once again being tested.

