chore: block AI scrapers in robots.txt

Block known AI training bots (GPTBot, ClaudeBot, CCBot, etc.) from crawling the site.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-19 14:59:41 +01:00
parent 1ed4cb4663
commit 6bddf61c04
1 changed file with 21 additions and 4 deletions
--- a/robots.txt
+++ b/robots.txt
@@ -1,5 +1,22 @@
-User-agent: *
-Disallow:
+# There is no search benefit to AI models scraping sites: all they do is steal content for
+# their own profit, attribution-free, and then serve our content without ever sending
+# users to us.
+# Reference: https://neil-clarke.com/block-the-bots-that-feed-ai-models-by-scraping-your-website/
+# See: https://github.com/MattWilcox/native-base/blob/45f6e7a837104f5ad83a5c7e280fb9a4eb126219/robots.txt

-# Add additional rules as needed
-# Example: Disallow: /private/
+User-agent: CCBot
+User-agent: ChatGPT-User
+User-agent: GPTBot
+User-agent: Google-Extended
+User-agent: Omgilibot
+User-agent: Omgili
+User-agent: FacebookBot
+User-agent: Applebot-Extended
+User-agent: anthropic-ai
+User-agent: ClaudeBot
+User-agent: Diffbot
+User-agent: Bytespider
+User-agent: ImagesiftBot
+User-agent: PerplexityBot
+User-agent: cohere-ai
+Disallow: /
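
One way to sanity-check a grouped rule like the one added above is to feed it to Python's standard-library robots.txt parser. The following is a minimal sketch, not part of the commit: `ROBOTS_TXT` is an abbreviated copy of the new rules, and `is_allowed` and the `example.com` URLs are hypothetical names chosen for illustration.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated copy of the group added in this commit (assumption: three of
# the listed agents are enough to illustrate how the shared rule applies).
ROBOTS_TXT = """\
User-agent: CCBot
User-agent: GPTBot
User-agent: ClaudeBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str,
               url: str = "https://example.com/") -> bool:
    """Return True if `user_agent` may fetch `url` under `robots_txt`.

    Hypothetical helper: it parses the text in memory instead of
    downloading a live robots.txt file.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Two listed agents and one agent that the file does not mention.
    for agent in ("GPTBot", "ClaudeBot", "Googlebot"):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, agent) else "blocked"
        print(agent, verdict)
```

Consecutive `User-agent` lines form a single group, so the one `Disallow: /` applies to every listed agent, while agents that are not listed fall through to the default of being allowed. Note that robots.txt is advisory: this check only shows how a compliant parser reads the file, not whether a given crawler honors it.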