Add VitePress documentation with GitHub Pages deployment

- Create docs/ directory with VitePress configuration
- Add documentation for all web servers (Nginx, Apache, Traefik, HAProxy)
- Add bad bot detection and API reference documentation
- Add GitHub Actions workflow for automatic deployment to GitHub Pages
- Configure VitePress with sidebar, navigation, and search
2025-12-29 16:15:12 +00:00 · 2025-12-09 08:07:06 +01:00 · ea474cbcf2
commit ea474cbcf2
parent 6bcca53eae
13 changed files with 3829 additions and 0 deletions
--- a/.github/workflows/docs.yml
+++ b/.github/workflows/docs.yml
@@ -0,0 +1,62 @@
+name: Deploy Documentation
+
+on:
+  push:
+    branches:
+      - main
+    paths:
+      - 'docs/**'
+      - '.github/workflows/docs.yml'
+  workflow_dispatch:
+
+permissions:
+  contents: read
+  pages: write
+  id-token: write
+
+concurrency:
+  group: pages
+  cancel-in-progress: false
+
+jobs:
+  build:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+
+      - name: Setup Node
+        uses: actions/setup-node@v4
+        with:
+          node-version: 20
+          cache: npm
+          cache-dependency-path: docs/package-lock.json
+
+      - name: Setup Pages
+        uses: actions/configure-pages@v4
+
+      - name: Install dependencies
+        run: npm ci
+        working-directory: docs
+
+      - name: Build with VitePress
+        run: npm run docs:build
+        working-directory: docs
+
+      - name: Upload artifact
+        uses: actions/upload-pages-artifact@v3
+        with:
+          path: docs/.vitepress/dist
+
+  deploy:
+    environment:
+      name: github-pages
+      url: ${{ steps.deployment.outputs.page_url }}
+    needs: build
+    runs-on: ubuntu-latest
+    steps:
+      - name: Deploy to GitHub Pages
+        id: deployment
+        uses: actions/deploy-pages@v4
--- a/docs/.gitignore
+++ b/docs/.gitignore
@@ -0,0 +1,3 @@
+node_modules
+.vitepress/cache
+.vitepress/dist
--- a/docs/.vitepress/config.ts
+++ b/docs/.vitepress/config.ts
@@ -0,0 +1,74 @@
+import { defineConfig } from 'vitepress'
+
+export default defineConfig({
+    title: 'Patterns',
+    description: 'OWASP CRS and Bad Bot Detection for Web Servers',
+    base: '/patterns/',
+
+    head: [
+        ['link', { rel: 'icon', href: '/patterns/favicon.ico' }]
+    ],
+
+    themeConfig: {
+        logo: '/logo.svg',
+
+        nav: [
+            { text: 'Home', link: '/' },
+            { text: 'Getting Started', link: '/getting-started' },
+            {
+                text: 'Web Servers',
+                items: [
+                    { text: 'Nginx', link: '/nginx' },
+                    { text: 'Apache', link: '/apache' },
+                    { text: 'Traefik', link: '/traefik' },
+                    { text: 'HAProxy', link: '/haproxy' }
+                ]
+            },
+            { text: 'Bad Bots', link: '/badbots' },
+            { text: 'API', link: '/api' }
+        ],
+
+        sidebar: [
+            {
+                text: 'Introduction',
+                items: [
+                    { text: 'Getting Started', link: '/getting-started' }
+                ]
+            },
+            {
+                text: 'Web Server Integration',
+                items: [
+                    { text: 'Nginx', link: '/nginx' },
+                    { text: 'Apache (ModSecurity)', link: '/apache' },
+                    { text: 'Traefik', link: '/traefik' },
+                    { text: 'HAProxy', link: '/haproxy' }
+                ]
+            },
+            {
+                text: 'Features',
+                items: [
+                    { text: 'Bad Bot Detection', link: '/badbots' },
+                    { text: 'API Reference', link: '/api' }
+                ]
+            }
+        ],
+
+        socialLinks: [
+            { icon: 'github', link: 'https://github.com/fabriziosalmi/patterns' }
+        ],
+
+        footer: {
+            message: 'Released under the MIT License.',
+            copyright: 'Copyright © 2024-present Fabrizio Salmi'
+        },
+
+        search: {
+            provider: 'local'
+        },
+
+        editLink: {
+            pattern: 'https://github.com/fabriziosalmi/patterns/edit/main/docs/:path',
+            text: 'Edit this page on GitHub'
+        }
+    }
+})
--- a/docs/apache.md
+++ b/docs/apache.md
@@ -0,0 +1,162 @@
+# Apache Integration
+
+This guide explains how to integrate the WAF patterns with Apache using ModSecurity.
+
+## Prerequisites
+
+- Apache 2.4+
+- ModSecurity module installed
+
+### Install ModSecurity
+
+::: code-group
+
+```bash [Debian/Ubuntu]
+sudo apt install libapache2-mod-security2
+sudo a2enmod security2
+```
+
+```bash [RHEL/CentOS]
+sudo yum install mod_security
+```
+
+:::
+
+## Quick Start
+
+1. Download `apache_waf.zip` from [Releases](https://github.com/fabriziosalmi/patterns/releases)
+2. Extract to your Apache configuration directory
+3. Include the files in your Apache configuration
+
+## Configuration Files
+
+The Apache WAF package includes ModSecurity rules organized by attack type:
+
+| File | Protection Type |
+|------|-----------------|
+| `sqli.conf` | SQL Injection |
+| `xss.conf` | Cross-Site Scripting |
+| `rce.conf` | Remote Code Execution |
+| `lfi.conf` | Local File Inclusion |
+| `rfi.conf` | Remote File Inclusion |
+| `bots.conf` | Bad Bot Detection |
+
+## Integration
+
+### Step 1: Enable ModSecurity
+
+Create or edit `/etc/apache2/mods-enabled/security2.conf`:
+
+```apache
+<IfModule security2_module>
+    SecRuleEngine On
+    SecRequestBodyAccess On
+    SecResponseBodyAccess Off
+    SecDebugLogLevel 0
+</IfModule>
+```
+
+### Step 2: Include WAF Rules
+
+Add to your Apache configuration or virtual host:
+
+```apache
+<VirtualHost *:80>
+    ServerName example.com
+    
+    # Include all WAF patterns
+    Include /path/to/waf_patterns/apache/*.conf
+    
+    # ... other configurations ...
+</VirtualHost>
+```
+
+Or include specific rule sets:
+
+```apache
+Include /path/to/waf_patterns/apache/sqli.conf
+Include /path/to/waf_patterns/apache/xss.conf
+Include /path/to/waf_patterns/apache/bots.conf
+```
+
+### Step 3: Restart Apache
+
+```bash
+sudo apachectl configtest && sudo systemctl restart apache2
+```
+
+## Rule Format
+
+The rules follow ModSecurity syntax. `@rx` matches case-sensitively by default, so this example adds `(?i)` to catch any casing:
+
+```apache
+SecRule REQUEST_URI "@rx (?i)union.*select" \
+    "id:100001,\
+    phase:2,\
+    deny,\
+    status:403,\
+    msg:'SQL Injection Attempt',\
+    severity:CRITICAL"
+```
+
+## Customization
+
+### Adjust Severity Levels
+
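+To put every rule in monitoring mode at once, set `SecRuleEngine DetectionOnly` instead of `On`. To monitor a single rule without blocking, replace `deny` with `log,pass`: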
+
+```apache
+SecRule REQUEST_URI "@rx pattern" \
+    "id:100001,\
+    phase:2,\
+    log,\
+    pass,\
+    msg:'Potential attack detected'"
+```
+
+### Whitelist Paths
+
+Add exceptions for specific paths:
+
+```apache
+SecRule REQUEST_URI "@beginsWith /api/webhook" \
+    "id:1,\
+    phase:1,\
+    allow,\
+    nolog"
+```
+
+## Logging
+
+ModSecurity logs are typically found at:
+- `/var/log/apache2/modsec_audit.log`
+- `/var/log/httpd/modsec_audit.log`
+
+Enable detailed logging:
+
+```apache
+SecAuditEngine RelevantOnly
+SecAuditLog /var/log/apache2/modsec_audit.log
+SecAuditLogParts ABCDEFHZ
+```
+
+## Testing
+
+```bash
+# Test SQL injection detection (spaces are encoded as %20 so the request line stays valid)
+curl -I "http://example.com/?id=1'%20UNION%20SELECT%20*%20FROM%20users--"
+
+# Check Apache error log
+sudo tail -f /var/log/apache2/error.log
+```
+
+## Troubleshooting
+
+### ModSecurity not loading
+Ensure the module is enabled: `sudo a2enmod security2`
+
+### Rules not triggering
+Check that `SecRuleEngine` is set to `On` and rules are being included.
+
+### Performance issues
+Include only the rule sets you need, and use `SecRuleRemoveById` to disable individual rules that are too costly or that trigger false positives.
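The blocking rule and the monitoring-mode rule shown in `docs/apache.md` differ only in their action list. The following is a minimal sketch of that difference, assuming a hypothetical helper named `render_secrule`; it is not taken from the project's `json2apache.py`, whose internals this commit does not show.

```python
def render_secrule(rule_id: int, pattern: str, msg: str, monitor: bool = False) -> str:
    """Render one SecRule in the multi-line layout used in docs/apache.md.

    Hypothetical helper: assumes pattern and msg contain no quote characters.
    """
    # Blocking mode denies with a 403; monitoring mode only logs and passes.
    actions = ["log", "pass"] if monitor else ["deny", "status:403"]
    parts = [f"id:{rule_id}", "phase:2", *actions, f"msg:'{msg}'"]
    # Join the actions with ",\" line continuations, indented by four spaces.
    action_block = ",\\\n    ".join(parts)
    return f'SecRule REQUEST_URI "@rx {pattern}" \\\n    "{action_block}"'


blocking = render_secrule(100001, "(?i)union.*select", "SQL Injection Attempt")
monitoring = render_secrule(
    100001, "(?i)union.*select", "Potential attack detected", monitor=True
)
print(blocking)
print(monitoring)
```

Keeping the action list as the only variable makes it easy to roll a rule out in monitoring mode first and switch it to blocking once the logs look clean.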
--- a/docs/api.md
+++ b/docs/api.md
@@ -0,0 +1,223 @@
+# API Reference
+
+This page documents the Python scripts that power the Patterns project.
+
+## Core Scripts
+
+### owasp2json.py
+
+Fetches and parses OWASP Core Rule Set patterns from GitHub.
+
+```bash
+python owasp2json.py
+```
+
+**Output**: `owasp_rules.json`
+
+**Configuration**:
+- Uses environment variable `OWASP_REPO` to specify source repository
+- Default: `coreruleset/coreruleset`
+
+**Features**:
+- Fetches latest CRS rules from GitHub
+- Parses `.conf` files for regex patterns
+- Extracts rule metadata (ID, severity, category)
+- Outputs structured JSON for conversion scripts
+
+---
+
+### json2nginx.py
+
+Converts OWASP JSON rules to Nginx WAF configuration.
+
+```bash
+python json2nginx.py
+```
+
+**Input**: `owasp_rules.json`  
+**Output**: `waf_patterns/nginx/`
+
+**Generated Files**:
+| File | Purpose |
+|------|---------|
+| `waf_maps.conf` | Map directives (http block) |
+| `waf_rules.conf` | If statements (server block) |
+| `README.md` | Integration instructions |
+
+**Environment Variables**:
+- `INPUT_FILE` - Path to OWASP JSON (default: `owasp_rules.json`)
+- `OUTPUT_DIR` - Output directory (default: `waf_patterns/nginx`)
+
+---
+
+### json2apache.py
+
+Converts OWASP JSON rules to Apache ModSecurity format.
+
+```bash
+python json2apache.py
+```
+
+**Input**: `owasp_rules.json`  
+**Output**: `waf_patterns/apache/`
+
+**Generated Files**:
+- Category-specific `.conf` files (sqli.conf, xss.conf, etc.)
+- Each file contains ModSecurity `SecRule` directives
+
+---
+
+### json2traefik.py
+
+Converts OWASP JSON rules to Traefik middleware configuration.
+
+```bash
+python json2traefik.py
+```
+
+**Input**: `owasp_rules.json`  
+**Output**: `waf_patterns/traefik/`
+
+**Generated Files**:
+- `middleware.toml` - Traefik middleware configuration
+- `README.md` - Integration instructions
+
+---
+
+### json2haproxy.py
+
+Converts OWASP JSON rules to HAProxy ACL format.
+
+```bash
+python json2haproxy.py
+```
+
+**Input**: `owasp_rules.json`  
+**Output**: `waf_patterns/haproxy/`
+
+**Generated Files**:
+- `waf.acl` - Main WAF ACL rules
+- `README.md` - Integration instructions
+
+---
+
+### badbots.py
+
+Generates bad bot blocking configurations from public bot lists.
+
+```bash
+python badbots.py
+```
+
+**Output**: Bot configurations in each `waf_patterns/*/` directory
+
+**Features**:
+- Fetches from multiple public bot lists
+- Includes fallback sources for reliability
+- Generates platform-specific configs
+
+---
+
+## Import Scripts
+
+These scripts help import existing WAF configurations.
+
+### import_nginx_waf.py
+
+Import Nginx WAF patterns from external sources.
+
+```bash
+python import_nginx_waf.py --source /path/to/external/rules
+```
+
+### import_apache_waf.py
+
+Import Apache ModSecurity rules.
+
+```bash
+python import_apache_waf.py --source /path/to/modsec/rules
+```
+
+### import_traefik_waf.py
+
+Import Traefik middleware configurations.
+
+```bash
+python import_traefik_waf.py --source /path/to/traefik/config
+```
+
+### import_haproxy_waf.py
+
+Import HAProxy ACL rules.
+
+```bash
+python import_haproxy_waf.py --source /path/to/haproxy/acl
+```
+
+---
+
+## Data Structures
+
+### owasp_rules.json Format
+
+```json
+[
+  {
+    "id": "942100",
+    "pattern": "(?i:union.*select)",
+    "category": "sqli",
+    "severity": "critical",
+    "location": "request-uri",
+    "description": "SQL Injection Attack Detected"
+  }
+]
+```
+
+**Fields**:
+| Field | Type | Description |
+|-------|------|-------------|
+| `id` | string | OWASP CRS rule ID |
+| `pattern` | string | Regex pattern |
+| `category` | string | Attack category (sqli, xss, rce, etc.) |
+| `severity` | string | critical, high, medium, low |
+| `location` | string | Where to match (request-uri, headers, etc.) |
+| `description` | string | Human-readable description |
+
+---
+
+## Extending the Project
+
+### Adding a New Platform
+
+1. Create `json2<platform>.py` based on existing converters
+2. Add output directory in `waf_patterns/<platform>/`
+3. Update GitHub Actions workflow
+4. Add documentation in `docs/`
+
+### Custom Pattern Sources
+
+Modify `owasp2json.py` to add new pattern sources:
+
+```python
+SOURCES = [
+    "coreruleset/coreruleset",
+    "your-org/your-rules",
+]
+```
+
+---
+
+## Dependencies
+
+Listed in `requirements.txt`:
+
+```
+requests>=2.28.0
+beautifulsoup4>=4.11.0
+```
+
+Install with:
+
+```bash
+pip install -r requirements.txt
+```
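The `owasp_rules.json` format documented in `docs/api.md` is a flat list of rule objects, so a converter would typically validate each entry and group the rules by category before writing per-category files. A minimal sketch under that assumption follows; `load_rules` and `group_by_category` are hypothetical names, and the second sample rule is invented for illustration.

```python
import json
from collections import defaultdict

# Fields listed in the "owasp_rules.json Format" table.
REQUIRED_FIELDS = ("id", "pattern", "category", "severity", "location", "description")


def load_rules(text: str) -> list[dict]:
    """Parse the rules JSON and reject entries that lack a documented field."""
    rules = json.loads(text)
    for rule in rules:
        missing = [field for field in REQUIRED_FIELDS if field not in rule]
        if missing:
            raise ValueError(f"rule {rule.get('id', '?')} is missing {missing}")
    return rules


def group_by_category(rules: list[dict]) -> dict[str, list[dict]]:
    """Group rules by their 'category' field, keeping the input order."""
    grouped = defaultdict(list)
    for rule in rules:
        grouped[rule["category"]].append(rule)
    return dict(grouped)


sample = """
[
  {"id": "942100", "pattern": "(?i:union.*select)", "category": "sqli",
   "severity": "critical", "location": "request-uri",
   "description": "SQL Injection Attack Detected"},
  {"id": "900001", "pattern": "(?i:<script)", "category": "xss",
   "severity": "high", "location": "request-uri",
   "description": "Hypothetical XSS rule for this example"}
]
"""
rules = load_rules(sample)
grouped = group_by_category(rules)
print({category: [r["id"] for r in items] for category, items in grouped.items()})
```

Failing early on a missing field keeps a malformed upstream rule from producing a silently incomplete configuration file.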
--- a/docs/badbots.md
+++ b/docs/badbots.md
@@ -0,0 +1,191 @@
+# Bad Bot Detection
+
+This guide explains how to use the bad bot detection feature to block malicious crawlers and scrapers.
+
+## Overview
+
+The `badbots.py` script generates configuration files to block known malicious bots based on their User-Agent strings. It fetches bot lists from multiple public sources and generates blocking rules for each supported web server.
+
+## How It Works
+
+1. Fetches bot lists from public sources:
+   - [ai.robots.txt](https://github.com/ai-robots-txt/ai.robots.txt)
+   - Various community-maintained bot lists
+2. Generates blocking configurations for each platform
+3. Updates configurations daily via GitHub Actions
+
+## Generated Files
+
+| Platform | File | Format |
+|----------|------|--------|
+| Nginx | `bots.conf` | Map directive |
+| Apache | `bots.conf` | ModSecurity rules |
+| Traefik | `bots.toml` | Middleware config |
+| HAProxy | `bots.acl` | ACL patterns |
+
+## Nginx Bot Blocker
+
+The Nginx configuration uses a map directive:
+
+```nginx
+# In http block
+map $http_user_agent $bad_bot {
+    default 0;
+    "~*AhrefsBot" 1;
+    "~*SemrushBot" 1;
+    "~*MJ12bot" 1;
+    "~*DotBot" 1;
+    # ... more bots
+}
+
+# In server block
+if ($bad_bot) {
+    return 403;
+}
+```
+
+### Integration
+
+```nginx
+http {
+    include /path/to/waf_patterns/nginx/bots.conf;
+    
+    server {
+        if ($bad_bot) {
+            return 403;
+        }
+    }
+}
+```
+
+## Apache Bot Blocker
+
+Uses ModSecurity rules:
+
+```apache
+SecRule REQUEST_HEADERS:User-Agent "@rx AhrefsBot" \
+    "id:200001,phase:1,deny,status:403,msg:'Bad Bot Blocked'"
+```
+
+## HAProxy Bot Blocker
+
+Uses ACL rules:
+
+```haproxy
+acl bad_bot hdr_reg(User-Agent) -i -f /etc/haproxy/bots.acl
+http-request deny if bad_bot
+```
+
+## Blocked Bot Categories
+
+The following categories of bots are blocked by default:
+
+### SEO/Marketing Crawlers
+- AhrefsBot
+- SemrushBot
+- MJ12bot
+- DotBot
+- BLEXBot
+
+### AI/ML Crawlers
+- GPTBot
+- ChatGPT-User
+- CCBot
+- Google-Extended
+- Anthropic-AI
+- ClaudeBot
+
+### Scrapers
+- DataForSeoBot
+- PetalBot
+- Bytespider
+
+### Malicious Bots
+- Known vulnerability scanners
+- Spam bots
+- Content scrapers
+
+## Customization
+
+### Add Custom Bots
+
+Edit the generated file or add your own patterns:
+
+```nginx
+# Nginx: Add to bots.conf
+"~*MyCustomBot" 1;
+```
+
+```apache
+# Apache: Add rule
+SecRule REQUEST_HEADERS:User-Agent "@rx MyCustomBot" \
+    "id:200999,deny"
+```
+
+### Whitelist Bots
+
+For Nginx, allow specific bots:
+
+```nginx
+map $http_user_agent $bad_bot {
+    default 0;
+    "~*Googlebot" 0;     # Allow Google
+    "~*AhrefsBot" 1;     # Block Ahrefs
+}
+```
+
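+### Allow All Bots for Specific Paths
+
+A server-level `if` runs before Nginx selects a `location`, so the block cannot be undone inside a `location`. Fold the path exemption into a second map instead:
+
+```nginx
+# In http block: never flag requests under /public-api
+map "$uri:$bad_bot" $block_bot {
+    default 0;
+    "~^/public-api" 0;   # exempt path, checked first
+    "~:1$" 1;            # flagged bot on any other path
+}
+
+# In server block, instead of the plain $bad_bot check
+if ($block_bot) {
+    return 403;
+}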
+```
+
+## Generate Manually
+
+Run the script to regenerate bot lists:
+
+```bash
+python badbots.py
+```
+
+The script supports fallback lists if primary sources are unavailable.
+
+## Monitoring
+
+### Log Blocked Bots
+
+Enable logging to track blocked requests:
+
+```nginx
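+# In server block: access_log is not allowed inside a server-level "if",
+# so select the blocked requests with its if= parameter instead
+access_log /var/log/nginx/access.log;
+access_log /var/log/nginx/blocked_bots.log combined if=$bad_bot;
+
+if ($bad_bot) {
+    return 403;
+}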
+```
+
+### Analyze Bot Traffic
+
+```bash
+# Count blocked requests per User-Agent (assumes the default "combined" log format)
+awk -F'"' '$3 ~ /^ 403 / {print $6}' /var/log/nginx/access.log | \
+  sort | uniq -c | sort -rn | head -20
+```
+
+## Best Practices
+
+1. **Regular Updates**: The bot lists are updated daily. Pull the latest changes or download from releases.
+
+2. **Monitor False Positives**: Some legitimate services may use blocked User-Agents. Monitor your logs.
+
+3. **Combine with Rate Limiting**: Use bot blocking with rate limiting for comprehensive protection.
+
+4. **Test Before Deploying**: Verify that legitimate traffic (search engines, monitoring) is not blocked.
+
+::: warning
+Blocking search engine bots (Googlebot, Bingbot) can negatively impact SEO. The default lists do **not** block major search engines.
+:::
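The "Analyze Bot Traffic" step in `docs/badbots.md` can also be done in Python when the shell pipeline becomes hard to extend. The sketch below assumes the default Nginx `combined` log format; `count_blocked_agents` and the sample log lines are hypothetical and only illustrate the idea.

```python
import re
from collections import Counter

# combined format: ip - user [time] "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def count_blocked_agents(lines, status="403"):
    """Count User-Agent strings for log lines with the given status code."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("status") == status:
            counts[match.group("agent")] += 1
    return counts


# Invented sample lines; real input would come from the access log file.
sample_log = [
    '203.0.113.5 - - [09/Dec/2025:08:00:00 +0100] "GET / HTTP/1.1" 403 153 "-" "AhrefsBot/7.0"',
    '203.0.113.5 - - [09/Dec/2025:08:00:01 +0100] "GET /a HTTP/1.1" 403 153 "-" "AhrefsBot/7.0"',
    '198.51.100.7 - - [09/Dec/2025:08:00:02 +0100] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '192.0.2.9 - - [09/Dec/2025:08:00:03 +0100] "GET / HTTP/1.1" 403 153 "-" "SemrushBot"',
]
blocked = count_blocked_agents(sample_log)
for agent, hits in blocked.most_common(20):
    print(f"{hits:6d} {agent}")
```

Matching on the parsed status field avoids counting lines that merely contain the text `403` somewhere else, such as in a byte count or a URL.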
--- a/docs/getting-started.md
+++ b/docs/getting-started.md
@@ -0,0 +1,78 @@
+# Getting Started
+
+This guide will help you get up and running with Patterns WAF configurations for your web server.
+
+## Prerequisites
+
+- **Python 3.11+** (if building from source)
+- **pip** (Python package installer)
+- **git** (for cloning the repository)
+
+## Installation Options
+
+### Option 1: Download Pre-Generated Configurations
+
+The easiest way to get started is to download pre-built configurations:
+
+1. Go to the [Releases](https://github.com/fabriziosalmi/patterns/releases) page
+2. Download the ZIP file for your web server:
+   - `nginx_waf.zip` - Nginx configurations
+   - `apache_waf.zip` - Apache ModSecurity rules
+   - `traefik_waf.zip` - Traefik middleware
+   - `haproxy_waf.zip` - HAProxy ACL files
+3. Extract and integrate into your server configuration
+
+### Option 2: Build from Source
+
+If you prefer to generate the configurations yourself:
+
+```bash
+# Clone the repository
+git clone https://github.com/fabriziosalmi/patterns.git
+cd patterns
+
+# Install dependencies
+pip install -r requirements.txt
+
+# Fetch latest OWASP rules
+python owasp2json.py
+
+# Generate configurations for your platform
+python json2nginx.py    # For Nginx
+python json2apache.py   # For Apache
+python json2traefik.py  # For Traefik
+python json2haproxy.py  # For HAProxy
+
+# Generate bad bot blockers
+python badbots.py
+```
+
+## Configuration Files
+
+After running the scripts, you'll find the generated files in the `waf_patterns/` directory:
+
+```
+waf_patterns/
+├── nginx/          # Nginx WAF configs
+├── apache/         # Apache ModSecurity rules
+├── traefik/        # Traefik middleware configs
+└── haproxy/        # HAProxy ACL files
+```
+
+## Next Steps
+
+Choose your web server to learn how to integrate the WAF configurations:
+
+- [Nginx Integration](/nginx)
+- [Apache Integration](/apache)
+- [Traefik Integration](/traefik)
+- [HAProxy Integration](/haproxy)
+
+## Automatic Updates
+
+The repository includes a GitHub Actions workflow that:
+- Fetches the latest OWASP CRS rules **daily**
+- Regenerates all WAF configurations
+- Creates a new release with updated files
+
+To get the latest rules, simply download from the [Releases](https://github.com/fabriziosalmi/patterns/releases) page or pull the latest changes if you cloned the repository.
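The "Build from Source" steps in `docs/getting-started.md` always run in the same order: fetch the rules, convert them per platform, then generate the bot blockers. A minimal sketch of a wrapper for that sequence follows; `build_pipeline` and `run_pipeline` are hypothetical helpers, and only the script names come from the guide.

```python
import subprocess
import sys

# Converter script per platform, as named in the guide.
CONVERTERS = {
    "nginx": "json2nginx.py",
    "apache": "json2apache.py",
    "traefik": "json2traefik.py",
    "haproxy": "json2haproxy.py",
}


def build_pipeline(platforms):
    """Return the commands to run, in order, for the requested platforms."""
    unknown = sorted(set(platforms) - set(CONVERTERS))
    if unknown:
        raise ValueError(f"unsupported platforms: {unknown}")
    steps = [[sys.executable, "owasp2json.py"]]  # fetch the rules first
    steps += [[sys.executable, CONVERTERS[name]] for name in platforms]
    steps.append([sys.executable, "badbots.py"])  # bot blockers last
    return steps


def run_pipeline(platforms, repo_dir="."):
    """Run each step from the repository root and stop at the first failure."""
    for argv in build_pipeline(platforms):
        subprocess.run(argv, cwd=repo_dir, check=True)


steps = build_pipeline(["nginx", "haproxy"])
print([step[1] for step in steps])
```

Calling `run_pipeline(["nginx", "haproxy"])` from a checkout would then execute the scripts in that order; `check=True` makes a failed fetch abort the run before any converter sees stale input.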
--- a/docs/haproxy.md
+++ b/docs/haproxy.md
@@ -0,0 +1,192 @@
+# HAProxy Integration
+
+This guide explains how to integrate the WAF patterns with HAProxy using ACL rules.
+
+## Quick Start
+
+1. Download `haproxy_waf.zip` from [Releases](https://github.com/fabriziosalmi/patterns/releases)
+2. Extract the files
+3. Include the ACL files in your HAProxy configuration
+
+## Configuration Files
+
+The HAProxy WAF package includes:
+
+| File | Purpose |
+|------|---------|
+| `waf.acl` | Main WAF ACL rules |
+| `bots.acl` | Bad bot detection ACLs |
+
+## Integration
+
+### Step 1: Include ACL Files
+
+In your `haproxy.cfg`, include the WAF ACL files:
+
+```haproxy
+frontend http-in
+    bind *:80
+    
+    # Inline WAF ACL rules (url_reg matches the path and the query string)
+    acl waf_block_sqli url_reg -i union.*select
+    acl waf_block_sqli url_reg -i insert.*into
+    acl waf_block_xss url_reg -i <script>
+    
+    # Or include from external file
+    # acl waf_patterns path_reg -i -f /etc/haproxy/waf.acl
+    
+    # Block matching requests
+    http-request deny if waf_block_sqli
+    http-request deny if waf_block_xss
+    
+    default_backend servers
+```
+
+### Step 2: Include Bot Blockers
+
+```haproxy
+frontend http-in
+    bind *:80
+    
+    # Bad bot detection
+    acl bad_bot hdr_reg(User-Agent) -i -f /etc/haproxy/bots.acl
+    http-request deny if bad_bot
+    
+    default_backend servers
+```
+
+### Step 3: Reload HAProxy
+
+```bash
+haproxy -c -f /etc/haproxy/haproxy.cfg && sudo systemctl reload haproxy
+```
+
+## ACL Rule Format
+
+HAProxy ACLs use pattern matching on various request attributes:
+
+```haproxy
+# Match path
+acl sqli_path path_reg -i union.*select
+
+# Match query string
+acl sqli_query url_param(id) -m reg -i union.*select
+
+# Match headers
+acl bad_referer hdr_reg(Referer) -i malicious-site\.com
+
+# Combined conditions
+http-request deny if sqli_path OR sqli_query
+```
+
+## Complete Example
+
+```haproxy
+global
+    log /dev/log local0
+    maxconn 4096
+
+defaults
+    mode http
+    log global
+    option httplog
+    timeout connect 5s
+    timeout client 50s
+    timeout server 50s
+
+frontend http-in
+    bind *:80
+    
+    # WAF rules: url_reg inspects the path and the query string; single quotes keep the regex literal
+    acl waf_sqli url_reg -i '(union.*select|insert.*into|delete.*from)'
+    acl waf_xss url_reg -i '(<script|javascript:|on\w+\s*=)'
+    acl waf_lfi url_reg -i '(\.\.\/|\.\.\\)'
+    acl waf_rce url_reg -i '(;|\||`|\$\()'
+    
+    # Bot blocking
+    acl bad_bot hdr_reg(User-Agent) -i (AhrefsBot|SemrushBot|MJ12bot)
+    
+    # Deny malicious requests
+    http-request deny deny_status 403 if waf_sqli
+    http-request deny deny_status 403 if waf_xss
+    http-request deny deny_status 403 if waf_lfi
+    http-request deny deny_status 403 if waf_rce
+    http-request deny deny_status 403 if bad_bot
+    
+    default_backend servers
+
+backend servers
+    balance roundrobin
+    server server1 127.0.0.1:8080 check
+```
+
+## Customization
+
+### Custom Error Pages
+
+Return a custom error page for blocked requests:
+
+```haproxy
+# HAProxy has no line continuation, so the rule stays on one line (body options need HAProxy 2.2+)
+http-request deny deny_status 403 content-type text/html string "Access Denied" if waf_sqli
+```
+
+### Logging Blocked Requests
+
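+Flag blocked requests in the access log with a transaction variable. Place the `set-var` rule before the matching `deny` rule, because `deny` stops rule evaluation: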
+
+```haproxy
+frontend http-in
+    # Log blocked requests
+    http-request set-var(txn.blocked) str(1) if waf_sqli
+    http-request capture var(txn.blocked) len 1
+    
+    # Custom log format
+    log-format "%ci:%cp [%t] %ft %b/%s %Tq/%Tw/%Tc/%Tr/%Tt %ST %B %CC %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq blocked=%[var(txn.blocked)]"
+```
+
+### Whitelist Paths
+
+Skip WAF for specific paths:
+
+```haproxy
+acl is_api path_beg /api/webhook
+http-request deny if waf_sqli !is_api
+```
+
+## Rate Limiting
+
+Combine WAF with rate limiting:
+
+```haproxy
+# Stick table for rate limiting
+stick-table type ip size 100k expire 30s store http_req_rate(10s)
+http-request track-sc0 src
+acl too_many_requests sc_http_req_rate(0) gt 100
+
+http-request deny if too_many_requests
+```
+
+## Testing
+
+```bash
+# Test SQL injection detection (spaces are encoded as %20 so the request line stays valid)
+curl -I "http://example.com/?id=1'%20UNION%20SELECT%20*%20FROM%20users--"
+
+# Test bot blocking
+curl -A "AhrefsBot" -I "http://example.com/"
+
+# Check HAProxy stats
+echo "show stat" | socat stdio /var/run/haproxy.sock
+```
+
+## Troubleshooting
+
+### ACLs not matching
+Use `haproxy -c -f haproxy.cfg` to validate syntax. Enable debug logging to see ACL evaluation.
+
+### Performance impact
+ACL evaluation is fast, but complex regex patterns can add latency. Test with realistic traffic.
+
+### Configuration too large
+Long inline regex alternations make `haproxy.cfg` hard to maintain. Move patterns into files loaded with `-f` (one pattern per line) and split large lists into multiple files.
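The `bots.acl` file referenced in `docs/haproxy.md` is loaded with `-f`, which reads one pattern per line and skips empty lines and lines starting with `#`. The sketch below shows one way such a file could be produced from a list of bot names; `escape_regex` and `render_pattern_file` are hypothetical helpers and not the project's `badbots.py`.

```python
# Characters that have a special meaning in a regular expression.
REGEX_SPECIALS = set(".^$*+?()[]{}|\\")


def escape_regex(text: str) -> str:
    """Backslash-escape regex metacharacters so the name matches literally."""
    return "".join("\\" + ch if ch in REGEX_SPECIALS else ch for ch in text)


def render_pattern_file(bot_names) -> str:
    """Build the content of a pattern file for `hdr_reg(User-Agent) -i -f`."""
    seen = set()
    lines = ["# Bad bot User-Agent patterns, one regex per line"]
    for name in bot_names:
        name = name.strip()
        key = name.lower()  # the ACL uses -i, so duplicates differ only by case
        if not name or key in seen:
            continue
        seen.add(key)
        lines.append(escape_regex(name))
    return "\n".join(lines) + "\n"


content = render_pattern_file(["AhrefsBot", "ahrefsbot", "ai.bot", "  ", "MJ12bot"])
print(content, end="")
```

Escaping matters because the file is read as regular expressions: an unescaped `.` in a bot name would match any character and could block unrelated clients.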
--- a/docs/index.md
+++ b/docs/index.md
@@ -0,0 +1,63 @@
+---
+layout: home
+
+hero:
+  name: Patterns
+  text: OWASP WAF Rules for Web Servers
+  tagline: Automated OWASP CRS patterns and Bad Bot detection for Nginx, Apache, Traefik, and HAProxy
+  image:
+    src: /shield.svg
+    alt: Patterns
+  actions:
+    - theme: brand
+      text: Get Started
+      link: /getting-started
+    - theme: alt
+      text: View on GitHub
+      link: https://github.com/fabriziosalmi/patterns
+
+features:
+  - icon: 🛡️
+    title: OWASP CRS Protection
+    details: Leverages OWASP Core Rule Set for web application firewall defense against SQLi, XSS, RCE, and LFI attacks.
+  - icon: 🤖
+    title: Bad Bot Blocking
+    details: Blocks known malicious bots and scrapers using regularly updated public bot lists.
+  - icon: ⚙️
+    title: Multi-Server Support
+    details: Generates WAF configs for Nginx, Apache, Traefik, and HAProxy with consistent protection across platforms.
+  - icon: 🔄
+    title: Daily Updates
+    details: GitHub Actions automatically fetch new OWASP rules daily and push updated configurations.
+  - icon: 📦
+    title: Pre-Generated Configs
+    details: Download ready-to-use WAF configurations from GitHub Releases without building from source.
+  - icon: 🧩
+    title: Extensible Design
+    details: Modular architecture makes it easy to extend support to other web servers or load balancers.
+---
+
+## Quick Start
+
+Download the latest configurations from [GitHub Releases](https://github.com/fabriziosalmi/patterns/releases) or build from source:
+
+```bash
+git clone https://github.com/fabriziosalmi/patterns.git
+cd patterns
+pip install -r requirements.txt
+python owasp2json.py
+python json2nginx.py  # or json2apache.py, json2traefik.py, json2haproxy.py
+```
+
+## Supported Platforms
+
+| Platform | Config Format | Documentation |
+|----------|---------------|---------------|
+| **Nginx** | `.conf` files | [Read more →](/nginx) |
+| **Apache** | ModSecurity rules | [Read more →](/apache) |
+| **Traefik** | Middleware TOML | [Read more →](/traefik) |
+| **HAProxy** | ACL files | [Read more →](/haproxy) |
+
+::: tip Using Caddy?
+Check out the [caddy-waf](https://github.com/fabriziosalmi/caddy-waf) project for Caddy-specific WAF support.
+:::
--- a/docs/nginx.md
+++ b/docs/nginx.md
@@ -0,0 +1,131 @@
+# Nginx Integration
+
+This guide explains how to integrate the WAF patterns into your Nginx configuration.
+
+## Quick Start
+
+1. Download `nginx_waf.zip` from [Releases](https://github.com/fabriziosalmi/patterns/releases)
+2. Extract to your Nginx configuration directory
+3. Include the configuration files as shown below
+
+## Configuration Files
+
+The Nginx WAF package includes:
+
+| File | Purpose | Include Location |
+|------|---------|------------------|
+| `waf_maps.conf` | Map directives for pattern matching | `http` block |
+| `waf_rules.conf` | If statements for blocking | `server` block |
+| `bots.conf` | Bad bot detection maps | `http` block |
+
+## Integration
+
+### Step 1: Include Maps in HTTP Block
+
+The map directives **must** be included in the `http` context:
+
+```nginx
+http {
+    # Include WAF maps (pattern definitions)
+    include /path/to/waf_patterns/nginx/waf_maps.conf;
+    
+    # Include bot detection maps
+    include /path/to/waf_patterns/nginx/bots.conf;
+    
+    # ... other http configurations ...
+}
+```
+
+### Step 2: Include Rules in Server Block
+
+The blocking rules go inside your `server` or `location` block:
+
+```nginx
+server {
+    listen 80;
+    server_name example.com;
+    
+    # Include WAF rules
+    include /path/to/waf_patterns/nginx/waf_rules.conf;
+    
+    # ... other server configurations ...
+}
+```
+
+### Step 3: Reload Nginx
+
+Test and reload the configuration:
+
+```bash
+sudo nginx -t && sudo systemctl reload nginx
+```
+
+## How It Works
+
+The WAF uses Nginx's `map` directive for efficient pattern matching:
+
+```nginx
+map $request_uri $waf_block_sqli {
+    default 0;
+    "~*union.*select" 1;
+    "~*insert.*into" 1;
+}
+
+if ($waf_block_sqli) {
+    return 403;
+}
+```
+
+## Customization
+
+### Enable Logging
+
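+To log blocked requests, either uncomment the logging lines in `waf_rules.conf` (Nginx accepts `access_log` inside `if` only when the file is included from a `location` block) or add a conditional `access_log` in the `server` block:
+
+```nginx
+# In server block: keep the main log and write blocked requests to a second file
+access_log /var/log/nginx/access.log;
+access_log /var/log/nginx/waf_blocked.log combined if=$waf_block_sqli;
+
+if ($waf_block_sqli) {
+    return 403;
+}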
+```
+
+### Whitelist Specific Paths
+
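+Rules included at `server` level run before Nginx selects a `location`, so they apply to every path. To exempt a path, include the rules only in the locations that need them:
+
+```nginx
+location /api/webhook {
+    # No WAF include here, so this path is not filtered
+    # ... your configuration ...
+}
+
+location / {
+    # WAF rules for all other paths
+    include /path/to/waf_patterns/nginx/waf_rules.conf;
+    # ... your configuration ...
+}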
+```
+
+::: warning Important
+Individual category files like `attack.conf` or `xss.conf` should **not** be included directly. They contain both `map` and `if` directives which cannot be used in the same context. Always use `waf_maps.conf` + `waf_rules.conf`.
+:::
+
+## Testing
+
+Test your WAF configuration with common attack patterns:
+
+```bash
+# Should be blocked (SQL injection); spaces are encoded as %20 so the request line stays valid
+curl -I "http://example.com/?id=1'%20OR%20'1'='1"
+
+# Should be blocked (XSS)
+curl -I "http://example.com/?q=<script>alert(1)</script>"
+```
+
+## Troubleshooting
+
+### Configuration errors
+Always run `nginx -t` before reloading to catch syntax errors.
+
+### False positives
+If legitimate requests are being blocked, look for `403` responses in the access log (for example `/var/log/nginx/access.log`; a `return 403` is not written to `error.log`) and consider adding path-specific exceptions.
+
+### Performance
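+The map-based approach is efficient: Nginx evaluates a `map` variable only when it is used and caches the result for the rest of the request, so no extra caching needs to be enabled. On high-traffic sites, the cost depends mainly on the number and complexity of the regular expressions.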
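The `map` lookup described under "How It Works" in `docs/nginx.md` follows a fixed precedence: plain strings first, then regular expressions in the order they are written, then `default`. The sketch below is a simplified Python model of that lookup, assuming no `hostnames` masks and treating Python's `re` as a stand-in for PCRE; `nginx_map` is a hypothetical name.

```python
import re


def nginx_map(value: str, entries: list[tuple[str, str]], default: str = "0") -> str:
    """Simplified model of how an Nginx map block picks a result."""
    # 1. Plain strings win over regexes and are compared ignoring case.
    for pattern, result in entries:
        if not pattern.startswith("~") and pattern.lower() == value.lower():
            return result
    # 2. Regexes are tried in the order they appear; the first match wins.
    for pattern, result in entries:
        if pattern.startswith("~*"):  # case-insensitive regex
            if re.search(pattern[2:], value, re.IGNORECASE):
                return result
        elif pattern.startswith("~"):  # case-sensitive regex
            if re.search(pattern[1:], value):
                return result
    # 3. Nothing matched.
    return default


sqli = [("~*union.*select", "1"), ("~*insert.*into", "1")]
attack = nginx_map("/?id=1'%20UNION%20SELECT%20*%20FROM%20users--", sqli)
benign = nginx_map("/index.html", sqli)

# Order matters for allow-list entries: the first matching regex decides.
allow_first = nginx_map("Googlebot/2.1", [("~*Googlebot", "0"), ("~*bot", "1")])
block_first = nginx_map("Googlebot/2.1", [("~*bot", "1"), ("~*Googlebot", "0")])
print(attack, benign, allow_first, block_first)
```

The last two calls show why an allow-list entry has to appear before any broader blocking pattern in the same map.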
--- a/docs/package-lock.json
+++ b/docs/package-lock.json
--- a/docs/package.json
+++ b/docs/package.json
@@ -0,0 +1,14 @@
+{
+    "name": "patterns-docs",
+    "version": "1.0.0",
+    "private": true,
+    "type": "module",
+    "scripts": {
+        "docs:dev": "vitepress dev",
+        "docs:build": "vitepress build",
+        "docs:preview": "vitepress preview"
+    },
+    "devDependencies": {
+        "vitepress": "^1.5.0"
+    }
+}
--- a/docs/traefik.md
+++ b/docs/traefik.md
@@ -0,0 +1,168 @@
+# Traefik Integration
+
+This guide explains how to integrate the WAF patterns with Traefik using middleware plugins.
+
+## Quick Start
+
+1. Download `traefik_waf.zip` from [Releases](https://github.com/fabriziosalmi/patterns/releases)
+2. Extract the files
+3. Configure the middleware in your Traefik configuration
+
+## Configuration Files
+
+The Traefik WAF package includes:
+
+| File | Purpose |
+|------|---------|
+| `middleware.toml` | WAF middleware configuration |
+| `bots.toml` | Bad bot detection rules |
+
+## Integration with File Provider
+
+### Step 1: Enable File Provider
+
+In your `traefik.toml` or `traefik.yml`:
+
+::: code-group
+
+```toml [traefik.toml]
+[providers]
+  [providers.file]
+    directory = "/etc/traefik/dynamic"
+    watch = true
+```
+
+```yaml [traefik.yml]
+providers:
+  file:
+    directory: /etc/traefik/dynamic
+    watch: true
+```
+
+:::
+
+### Step 2: Copy Middleware Files
+
+Copy the WAF configuration files to your dynamic configuration directory:
+
+```bash
+cp waf_patterns/traefik/*.toml /etc/traefik/dynamic/
+```
+
+### Step 3: Apply Middleware to Routes
+
+Reference the middleware in your router configuration:
+
+::: code-group
+
+```toml [dynamic/routes.toml]
+[http.routers.my-router]
+  rule = "Host(`example.com`)"
+  service = "my-service"
+  middlewares = ["waf-protection", "bot-blocker"]
+
+[http.middlewares.waf-protection.plugin.waf]
+  # WAF configuration loaded from middleware.toml
+
+[http.middlewares.bot-blocker.plugin.botblocker]
+  # Bot blocking loaded from bots.toml
+```
+
+```yaml [dynamic/routes.yml]
+http:
+  routers:
+    my-router:
+      rule: "Host(`example.com`)"
+      service: my-service
+      middlewares:
+        - waf-protection
+        - bot-blocker
+```
+
+:::
+
+## Integration with Docker Labels
+
+For Docker-based deployments:
+
+```yaml
+services:
+  my-app:
+    image: my-app:latest
+    labels:
+      - "traefik.enable=true"
+      - "traefik.http.routers.my-app.rule=Host(`example.com`)"
+      - "traefik.http.routers.my-app.middlewares=waf@file"
+```
+
+## Middleware Configuration
+
+The `middleware.toml` contains regex-based blocking rules:
+
+```toml
+[http.middlewares.waf.plugin.rewriteHeaders]
+  # SQL Injection patterns
+  [[http.middlewares.waf.plugin.rewriteHeaders.replacements]]
+    regex = "(?i)union.*select"
+    replacement = "BLOCKED"
+```
+
+## Using with Traefik Plugins
+
+For enhanced WAF capabilities, consider using community plugins:
+
+```yaml
+experimental:
+  plugins:
+    waf:
+      moduleName: "github.com/example/traefik-waf-plugin"
+      version: "v1.0.0"
+```
+
+## Customization
+
+### Add Custom Patterns
+
+Edit `middleware.toml` to add your own patterns:
+
+```toml
+[[http.middlewares.waf.plugin.rewriteHeaders.replacements]]
+  regex = "your-custom-pattern"
+  replacement = "BLOCKED"
+```
+
+### Logging
+
+Enable access logs to monitor blocked requests:
+
+```toml
+[accessLog]
+  filePath = "/var/log/traefik/access.log"
+  format = "json"
+  
+  [accessLog.fields]
+    [accessLog.fields.headers]
+      defaultMode = "keep"
+```
+
+## Testing
+
+```bash
+# Test WAF detection
+curl -H "Host: example.com" \
+  "http://localhost/?id=1'%20OR%20'1'='1"
+
+# Check Traefik logs
+docker logs traefik 2>&1 | grep -i blocked
+```
+
+## Troubleshooting
+
+### Middleware not loading
+Check that the file provider is correctly configured and watching the right directory.
+
+### Routes not applying middleware
+Ensure the middleware name matches exactly between router and middleware definition.
+
+### Performance considerations
+Traefik's regex-based middleware can impact performance at high traffic. Monitor latency after enabling WAF rules.
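The "Routes not applying middleware" item in `docs/traefik.md` comes down to a name check between routers and middleware definitions. The sketch below runs that check on an already parsed dynamic configuration; `find_undefined_middlewares` is a hypothetical helper, and the sample dictionary mirrors the `routes.toml` example with one definition deliberately left out.

```python
def find_undefined_middlewares(config: dict) -> dict[str, list[str]]:
    """Map each router to the middleware references it uses but nobody defines."""
    http = config.get("http", {})
    defined = set(http.get("middlewares", {}))
    problems = {}
    for router_name, router in http.get("routers", {}).items():
        missing = []
        for ref in router.get("middlewares", []):
            name, _, provider = ref.partition("@")
            # References to other providers cannot be verified from this file.
            if provider not in ("", "file"):
                continue
            if name not in defined:
                missing.append(ref)
        if missing:
            problems[router_name] = missing
    return problems


# Such a dict could be obtained with tomllib.load() on Python 3.11+.
config = {
    "http": {
        "routers": {
            "my-router": {
                "rule": "Host(`example.com`)",
                "service": "my-service",
                "middlewares": ["waf-protection", "bot-blocker"],
            }
        },
        "middlewares": {"waf-protection": {"plugin": {"waf": {}}}},
    }
}
problems = find_undefined_middlewares(config)
print(problems)
```

Running a check like this before copying files into the dynamic configuration directory catches a misspelled middleware name without waiting for Traefik to report it.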