A lightweight, efficient Cloudflare Worker that blocks unwanted bot traffic before it reaches your origin server. Stops scrapers, AI bots, and malicious traffic from polluting your analytics and consuming server resources.
If you're seeing:
- 📊 Bot traffic skewing your Google Analytics
- 🇨🇳 Suspicious traffic from specific countries (China, etc.)
- 🤖 AI scrapers ignoring your robots.txt
- 📈 Bandwidth waste from automated scrapers
- 🔥 Server load from aggressive crawlers
This worker gives you multi-layer protection at the edge, blocking bad traffic before it costs you money.
- ✅ Geographic Blocking - Block entire countries by ISO code
- ✅ ASN Blocking - Block specific networks/hosting providers
- ✅ AI Scraper Protection - Block 14+ known AI training crawlers
- ✅ Rate Limiting - Prevent aggressive scraping of JS/CSS files
- ✅ Detailed Logging - JSON logs for monitoring and analysis
- ✅ Zero Cost - Runs on Cloudflare's free tier (100k requests/day)
- ✅ No Performance Impact - Executes in <1ms at the edge
- Cloudflare account (free tier works)
- Domain using Cloudflare DNS
- Node.js installed (for Wrangler CLI)
```
npm install -g wrangler
git clone https://github.com/AbdusM/cloudflare-bot-blocker.git
cd cloudflare-bot-blocker
```

Edit `worker.js` to customize your blocking rules:
```js
// Block specific countries
const BLOCKED_COUNTRIES = [
  "CN", // China
  "RU", // Russia
  // Add more as needed
]

// Block specific networks
const BLOCKED_ASNS = [
  13220,  // Tencent
  132203, // Tencent additional
  // Add ASNs from your analytics
]
```

Deploy:

```
wrangler deploy
```

Then, in the Cloudflare dashboard:

- Go to Workers & Pages → your worker
- Add route: `yourdomain.com/*`
- Done!
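Under the hood, the country and ASN rules boil down to a small decision function. The sketch below is illustrative, not the repo's exact code; `shouldBlock` is an assumed name, and `request.cf` is the object Cloudflare populates at the edge with the caller's country code and ASN:

```js
// Illustrative sketch of the worker's decision flow (not the repo's exact code).
// Cloudflare exposes the visitor's ISO country code and ASN on request.cf.
const BLOCKED_COUNTRIES = ["CN", "RU"]
const BLOCKED_ASNS = [13220, 132203]

function shouldBlock(cf) {
  if (BLOCKED_COUNTRIES.includes(cf.country)) return "blocked_country"
  if (BLOCKED_ASNS.includes(cf.asn)) return "blocked_asn"
  return null // allow the request through
}

// Inside the worker's fetch handler, roughly:
//   const reason = shouldBlock(request.cf ?? {})
//   if (reason) return new Response("Forbidden", { status: 403 })
//   return fetch(request) // pass through to the origin
```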
Block entire countries using ISO 3166-1 alpha-2 codes:
```js
const BLOCKED_COUNTRIES = [
  "CN", // China
  "RU", // Russia
  "KP", // North Korea
  // etc.
]
```

ASNs identify specific networks/hosting providers. Find problematic ASNs in:
- Cloudflare Analytics → Traffic tab
- Your server logs
- Google Analytics (if visible)
```js
const BLOCKED_ASNS = [
  13220,  // Tencent (major bot source)
  132203, // Tencent additional
  16509,  // Amazon AWS (if you want to block cloud scrapers)
  // etc.
]
```

Common Bot ASNs:
- 13220, 132203 - Tencent
- 45090 - Tencent Cloud
- 4134 - ChinaNet
- 4837 - China Unicom
The worker blocks these AI scrapers by default:
| Bot | Company | Purpose |
|---|---|---|
| CCBot | Common Crawl | AI training datasets |
| GPTBot | OpenAI | ChatGPT training |
| ChatGPT-User | OpenAI | ChatGPT browsing |
| anthropic-ai | Anthropic | Claude training |
| ClaudeBot | Anthropic | Claude crawling |
| Google-Extended | Google | Bard/Gemini training |
| FacebookBot | Meta | AI training |
| Bytespider | ByteDance | TikTok AI |
Note: This does NOT block legitimate search engines (Google, Bing, etc.)
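Bot detection of this kind is typically a case-insensitive substring match against the request's User-Agent header. A sketch under that assumption (`isAiScraper` is an illustrative name, not the repo's exact code):

```js
// Case-insensitive substring match of the User-Agent header against the
// AI_SCRAPERS list; the helper name here is illustrative.
const AI_SCRAPERS = ["CCBot", "GPTBot", "ChatGPT-User", "anthropic-ai", "ClaudeBot"]

function isAiScraper(userAgent) {
  const ua = (userAgent || "").toLowerCase()
  return AI_SCRAPERS.some(bot => ua.includes(bot.toLowerCase()))
}
```

For example, `isAiScraper("Mozilla/5.0 (compatible; GPTBot/1.0)")` returns `true`, while a normal browser User-Agent returns `false`.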
To allow specific AI bots, remove them from the AI_SCRAPERS array:
```js
const AI_SCRAPERS = [
  "CCBot",
  // "GPTBot", // Allow OpenAI (commented out)
  "ChatGPT-User",
  // etc.
]
```

Protect against aggressive scraping:

```js
const JS_RATE_LIMIT = 100   // Max requests per minute
const JS_RATE_WINDOW = 60000 // Time window (1 minute)
```

Defaults:
- 100 requests per minute per IP for .js files
- Adjust based on your traffic patterns
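A sliding-window limiter over these two constants can be sketched as follows. Note this is an assumption about the approach, not the repo's exact code, and in-memory state in a Worker is per-isolate: counts reset when an isolate is recycled, so durable limits would need KV or Durable Objects.

```js
// Per-isolate sliding-window rate limiter (illustrative sketch).
const JS_RATE_LIMIT = 100    // max requests per window, per IP
const JS_RATE_WINDOW = 60000 // window length in ms (1 minute)

const hits = new Map() // ip -> timestamps of recent requests

function isRateLimited(ip, now = Date.now()) {
  // Keep only timestamps still inside the window, then record this request.
  const recent = (hits.get(ip) || []).filter(t => now - t < JS_RATE_WINDOW)
  recent.push(now)
  hits.set(ip, recent)
  return recent.length > JS_RATE_LIMIT
}
```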
```
wrangler tail your-worker-name

# Geographic blocks
wrangler tail your-worker-name --format=json | grep "blocked_country"

# ASN blocks
wrangler tail your-worker-name --format=json | grep "blocked_asn"

# AI scrapers
wrangler tail your-worker-name --format=json | grep "ai_scraper"

# Rate limits
wrangler tail your-worker-name --format=json | grep "RATE_LIMITED"
```

All blocks are logged as JSON:
```json
{
  "action": "BLOCKED",
  "reason": "blocked_country",
  "country": "CN",
  "ip": "1.2.3.4",
  "path": "/some-page",
  "timestamp": "2025-11-29T12:00:00.000Z"
}
```

- Execution time: <1ms per request
- Memory usage: ~2MB
- Cost: $0 on free tier (up to 100k req/day)
- No impact: Runs before origin, reduces server load
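For reference, the JSON log entries shown earlier can be built by a small helper like this (a sketch; `blockLogEntry` is an assumed name). The worker would `console.log(JSON.stringify(...))` the result, which is what `wrangler tail` displays:

```js
// Illustrative helper for producing the structured block log shown above.
function blockLogEntry(reason, country, ip, path) {
  return {
    action: "BLOCKED",
    reason,    // e.g. "blocked_country", "blocked_asn", "ai_scraper"
    country,   // ISO country code from request.cf.country
    ip,        // from the CF-Connecting-IP header
    path,      // new URL(request.url).pathname
    timestamp: new Date().toISOString(),
  }
}
```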
After deploying this worker:
- ✅ Blocked 2,000+ bot sessions per week
- ✅ Reduced analytics pollution by 60%
- ✅ Decreased server bandwidth by 40%
- ✅ Improved GA4 data quality
- ✅ Zero false positives (legitimate users unaffected)
Want to only protect certain pages? Add path filtering:
```js
const PROTECTED_PATHS = ["/admin", "/api"]

// In the worker
const needsProtection = PROTECTED_PATHS.some(p => path.startsWith(p))
if (!needsProtection) {
  return fetch(request) // Skip protection for other paths
}
```

Apply different limits per path:

```js
// Strict for API, relaxed for static files
const limit = path.startsWith("/api") ? 20 : 100
```

Always allow trusted IPs:

```js
const WHITELISTED_IPS = ["1.2.3.4", "5.6.7.8"]
if (WHITELISTED_IPS.includes(ip)) {
  return fetch(request) // Always allow
}
```

If the worker doesn't seem to be running:

- Check the route is configured: `yourdomain.com/*`
- Verify the worker is deployed: `wrangler deployments list`
- Check logs: `wrangler tail`
If a legitimate user is being blocked:

- Check if their country is in `BLOCKED_COUNTRIES`
- Verify their ASN isn't in `BLOCKED_ASNS`
- Review rate limits (they may be too strict)
- Add their IP to the whitelist

If a bot is still getting through:

- Check its user agent; it may need to be added to `AI_SCRAPERS`
- Find its ASN and add it to `BLOCKED_ASNS`
- Consider stricter rate limits
Contributions welcome! Please:
- Fork the repo
- Create a feature branch
- Add your changes
- Submit a pull request
Ideas for contributions:
- Additional AI scraper user agents
- Common bot ASNs list
- Advanced filtering examples
- Performance improvements
MIT License - see LICENSE file for details.
- 🐛 Issues: GitHub Issues
- 💬 Discussions: GitHub Discussions
Built to solve real-world bot traffic problems. Battle-tested in production environments.
⭐ If this helped you, please star the repo!