Shaarli Daily

All links from one day on a single page.

January 25, 2025

RSS Radio France pour tous

The site for restoring Radio France's RSS feeds

GitHub - ai-robots-txt/ai.robots.txt: A list of AI agents and robots to block.

A list of AI agents and robots to block.

For my part, I added this to the conf-enabled/security.conf of my Apache server:

# https://raw.githubusercontent.com/ai-robots-txt/ai.robots.txt/refs/heads/main/.htaccess
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^.*(AI2Bot|Ai2Bot-Dolma|Amazonbot|anthropic-ai|Applebot|Applebot-Extended|Bytespider|CCBot|ChatGPT-User|Claude-Web|ClaudeBot|cohere-ai|cohere-training-data-crawler|Crawlspace|Diffbot|DuckAssistBot|FacebookBot|FriendlyCrawler|Google-Extended|GoogleOther|GoogleOther-Image|GoogleOther-Video|GPTBot|iaskspider/2.0|ICC-Crawler|ImagesiftBot|img2dataset|ISSCyberRiskCrawler|Kangaroo\ Bot|Meta-ExternalAgent|Meta-ExternalFetcher|OAI-SearchBot|omgili|omgilibot|PanguBot|PerplexityBot|PetalBot|Scrapy|SemrushBot|Sidetrade\ indexer\ bot|Timpibot|VelenPublicWebCrawler|Webzio-Extended|YouBot).*$ [NC]
RewriteRule .* - [F,L]
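
To check which clients a rule like the one above would reject before deploying it, the matching logic can be approximated offline. The following is a minimal sketch, not part of the upstream project: it uses only a subset of the crawler names from the RewriteCond, the helper name `is_blocked` is hypothetical, and the sample User-Agent strings are illustrative rather than captured from real traffic. It assumes that an unanchored, case-insensitive search is equivalent to `^.*( ... ).*$` with the `[NC]` flag.

```python
import re

# Subset of the crawler names from the RewriteCond above (not the full list).
BLOCKED_AGENTS = [
    "AI2Bot", "Amazonbot", "anthropic-ai", "Bytespider", "CCBot",
    "ChatGPT-User", "ClaudeBot", "GPTBot", "Kangaroo Bot",
    "PerplexityBot", "Scrapy", "SemrushBot",
]

# The leading `^.*` and trailing `.*$` in the Apache pattern only mean
# "anywhere in the header", so an unanchored search does the same job.
# re.IGNORECASE plays the role of the [NC] flag.
BLOCKED_RE = re.compile(
    "|".join(re.escape(name) for name in BLOCKED_AGENTS),
    re.IGNORECASE,
)


def is_blocked(user_agent: str) -> bool:
    """Return True if the User-Agent would hit the 403 rule (hypothetical helper)."""
    return BLOCKED_RE.search(user_agent) is not None


if __name__ == "__main__":
    # Illustrative User-Agent strings, not taken from real logs.
    samples = [
        "Mozilla/5.0 (compatible; GPTBot/1.0)",
        "scrapy/2.11",
        "Mozilla/5.0 (X11; Linux x86_64; rv:128.0) Gecko/20100101 Firefox/128.0",
    ]
    for ua in samples:
        verdict = "blocked" if is_blocked(ua) else "allowed"
        print(f"{verdict}: {ua}")
```

Because the match is a substring search, any client whose User-Agent merely contains one of these names is rejected, which is also how the Apache rule behaves.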
Migadu Email

email provider

cobalt

cobalt lets you save what you love without ads, tracking, paywalls or other nonsense. just paste the link and you're ready to rock!