mirror of
https://github.com/fabriziosalmi/patterns.git
synced 2025-12-17 09:45:34 +00:00
Add VitePress documentation with GitHub Pages deployment
- Create docs/ directory with VitePress configuration
- Add documentation for all web servers (Nginx, Apache, Traefik, HAProxy)
- Add bad bot detection and API reference documentation
- Add GitHub Actions workflow for automatic deployment to GitHub Pages
- Configure VitePress with sidebar, navigation, and search
parent 6bcca53eae
commit ea474cbcf2
62
.github/workflows/docs.yml
vendored
Normal file
@ -0,0 +1,62 @@
name: Deploy Documentation

on:
  push:
    branches:
      - main
    paths:
      - 'docs/**'
      - '.github/workflows/docs.yml'
  workflow_dispatch:

permissions:
  contents: read
  pages: write
  id-token: write

concurrency:
  group: pages
  cancel-in-progress: false

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm
          cache-dependency-path: docs/package-lock.json

      - name: Setup Pages
        uses: actions/configure-pages@v4

      - name: Install dependencies
        run: npm ci
        working-directory: docs

      - name: Build with VitePress
        run: npm run docs:build
        working-directory: docs

      - name: Upload artifact
        uses: actions/upload-pages-artifact@v3
        with:
          path: docs/.vitepress/dist

  deploy:
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    needs: build
    runs-on: ubuntu-latest
    steps:
      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v4
3
docs/.gitignore
vendored
Normal file
@ -0,0 +1,3 @@
node_modules
.vitepress/cache
.vitepress/dist
74
docs/.vitepress/config.ts
Normal file
@ -0,0 +1,74 @@
import { defineConfig } from 'vitepress'

export default defineConfig({
  title: 'Patterns',
  description: 'OWASP CRS and Bad Bot Detection for Web Servers',
  base: '/patterns/',

  head: [
    ['link', { rel: 'icon', href: '/patterns/favicon.ico' }]
  ],

  themeConfig: {
    logo: '/logo.svg',

    nav: [
      { text: 'Home', link: '/' },
      { text: 'Getting Started', link: '/getting-started' },
      {
        text: 'Web Servers',
        items: [
          { text: 'Nginx', link: '/nginx' },
          { text: 'Apache', link: '/apache' },
          { text: 'Traefik', link: '/traefik' },
          { text: 'HAProxy', link: '/haproxy' }
        ]
      },
      { text: 'Bad Bots', link: '/badbots' },
      { text: 'API', link: '/api' }
    ],

    sidebar: [
      {
        text: 'Introduction',
        items: [
          { text: 'Getting Started', link: '/getting-started' }
        ]
      },
      {
        text: 'Web Server Integration',
        items: [
          { text: 'Nginx', link: '/nginx' },
          { text: 'Apache (ModSecurity)', link: '/apache' },
          { text: 'Traefik', link: '/traefik' },
          { text: 'HAProxy', link: '/haproxy' }
        ]
      },
      {
        text: 'Features',
        items: [
          { text: 'Bad Bot Detection', link: '/badbots' },
          { text: 'API Reference', link: '/api' }
        ]
      }
    ],

    socialLinks: [
      { icon: 'github', link: 'https://github.com/fabriziosalmi/patterns' }
    ],

    footer: {
      message: 'Released under the MIT License.',
      copyright: 'Copyright © 2024-present Fabrizio Salmi'
    },

    search: {
      provider: 'local'
    },

    editLink: {
      pattern: 'https://github.com/fabriziosalmi/patterns/edit/main/docs/:path',
      text: 'Edit this page on GitHub'
    }
  }
})
162
docs/apache.md
Normal file
@ -0,0 +1,162 @@
# Apache Integration

This guide explains how to integrate the WAF patterns with Apache using ModSecurity.

## Prerequisites

- Apache 2.4+
- ModSecurity module installed

### Install ModSecurity

::: code-group

```bash [Debian/Ubuntu]
sudo apt install libapache2-mod-security2
sudo a2enmod security2
```

```bash [RHEL/CentOS]
sudo yum install mod_security
```

:::

## Quick Start

1. Download `apache_waf.zip` from [Releases](https://github.com/fabriziosalmi/patterns/releases)
2. Extract to your Apache configuration directory
3. Include the files in your Apache configuration

## Configuration Files

The Apache WAF package includes ModSecurity rules organized by attack type:

| File | Protection Type |
|------|-----------------|
| `sqli.conf` | SQL Injection |
| `xss.conf` | Cross-Site Scripting |
| `rce.conf` | Remote Code Execution |
| `lfi.conf` | Local File Inclusion |
| `rfi.conf` | Remote File Inclusion |
| `bots.conf` | Bad Bot Detection |

## Integration

### Step 1: Enable ModSecurity

Create or edit `/etc/apache2/mods-enabled/security2.conf`:

```apache
<IfModule security2_module>
    SecRuleEngine On
    SecRequestBodyAccess On
    SecResponseBodyAccess Off
    SecDebugLogLevel 0
</IfModule>
```

### Step 2: Include WAF Rules

Add to your Apache configuration or virtual host:

```apache
<VirtualHost *:80>
    ServerName example.com

    # Include all WAF patterns
    Include /path/to/waf_patterns/apache/*.conf

    # ... other configurations ...
</VirtualHost>
```

Or include specific rule sets:

```apache
Include /path/to/waf_patterns/apache/sqli.conf
Include /path/to/waf_patterns/apache/xss.conf
Include /path/to/waf_patterns/apache/bots.conf
```

### Step 3: Restart Apache

```bash
sudo apachectl configtest && sudo systemctl restart apache2
```

## Rule Format

The rules follow ModSecurity syntax:

```apache
SecRule REQUEST_URI "@rx union.*select" \
    "id:100001,\
    phase:2,\
    deny,\
    status:403,\
    msg:'SQL Injection Attempt',\
    severity:CRITICAL"
```
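
To make the layout of such a rule concrete, here is a minimal Python sketch that renders one blocking `SecRule` in the same multi-line form. The function name `render_secrule` and its parameters are hypothetical and are not taken from `json2apache.py`; the fixed action list simply mirrors the example above.

```python
def render_secrule(rule_id: int, pattern: str, msg: str,
                   variable: str = "REQUEST_URI") -> str:
    """Render one blocking SecRule in the multi-line layout shown above."""
    if "'" in msg:
        # Keep the sketch simple: refuse messages that would need escaping.
        raise ValueError("msg must not contain a single quote")
    # Inside a double-quoted Apache argument, \" stands for a literal quote.
    safe_pattern = pattern.replace('"', '\\"')
    actions = [
        f"id:{rule_id}",
        "phase:2",
        "deny",
        "status:403",
        f"msg:'{msg}'",
        "severity:CRITICAL",
    ]
    # Each action ends with ",\" and the next one is indented on a new line.
    joined = ",\\\n    ".join(actions)
    return f'SecRule {variable} "@rx {safe_pattern}" \\\n    "{joined}"'


if __name__ == "__main__":
    print(render_secrule(100001, "union.*select", "SQL Injection Attempt"))
```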

## Customization

### Adjust Severity Levels

To run a rule in monitoring mode, replace `deny` with `log,pass` so that matches are recorded but the request is still served:

```apache
SecRule REQUEST_URI "@rx pattern" \
    "id:100001,\
    phase:2,\
    log,\
    pass,\
    msg:'Potential attack detected'"
```

### Whitelist Paths

Add exceptions for specific paths. Place the exception before the WAF `Include` lines so that it is evaluated first:

```apache
SecRule REQUEST_URI "@beginsWith /api/webhook" \
    "id:1,\
    phase:1,\
    allow,\
    nolog"
```

## Logging

ModSecurity logs are typically found at:
- `/var/log/apache2/modsec_audit.log` (Debian/Ubuntu)
- `/var/log/httpd/modsec_audit.log` (RHEL/CentOS)

Enable detailed logging:

```apache
SecAuditEngine RelevantOnly
SecAuditLog /var/log/apache2/modsec_audit.log
SecAuditLogParts ABCDEFHZ
```

## Testing

```bash
# Test SQL injection detection
curl -I "http://example.com/?id=1' UNION SELECT * FROM users--"

# Check Apache error log
sudo tail -f /var/log/apache2/error.log
```

## Troubleshooting

### ModSecurity not loading
Ensure the module is enabled: `sudo a2enmod security2`

### Rules not triggering
Check that `SecRuleEngine` is set to `On` and rules are being included.

### Performance issues
Consider using `SecRuleRemoveById` to disable noisy rules that cause false positives.
223
docs/api.md
Normal file
@ -0,0 +1,223 @@
# API Reference

This page documents the Python scripts that power the Patterns project.

## Core Scripts

### owasp2json.py

Fetches and parses OWASP Core Rule Set patterns from GitHub.

```bash
python owasp2json.py
```

**Output**: `owasp_rules.json`

**Configuration**:
- Uses environment variable `OWASP_REPO` to specify source repository
- Default: `coreruleset/coreruleset`

**Features**:
- Fetches latest CRS rules from GitHub
- Parses `.conf` files for regex patterns
- Extracts rule metadata (ID, severity, category)
- Outputs structured JSON for conversion scripts
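
The parsing step can be pictured with a small sketch. It is not the code of `owasp2json.py`: the names `source_repo`, `parse_rule`, and `SECRULE_RE` are hypothetical, and the regular expression only handles a simplified single-line `SecRule ... "@rx ..." "id:..."` form, whereas real CRS rules usually span several lines and use more operators. Only the `OWASP_REPO` variable and its default come from the description above.

```python
import os
import re

# Default mirrors the documented fallback; OWASP_REPO overrides it.
DEFAULT_REPO = "coreruleset/coreruleset"

# Simplified matcher (an assumption of this sketch) for single-line rules:
#   SecRule VARIABLES "@rx PATTERN" "id:NNN,..."
SECRULE_RE = re.compile(
    r'SecRule\s+(?P<variables>\S+)\s+"@rx\s+(?P<pattern>(?:[^"\\]|\\.)*)"\s+'
    r'"(?P<actions>(?:[^"\\]|\\.)*)"'
)
ID_RE = re.compile(r"\bid:(\d+)")


def source_repo() -> str:
    """Return the repository to fetch rules from."""
    return os.environ.get("OWASP_REPO", DEFAULT_REPO)


def parse_rule(line: str):
    """Extract id, regex pattern, and variables from one rule, or None."""
    match = SECRULE_RE.search(line)
    if match is None:
        return None
    id_match = ID_RE.search(match.group("actions"))
    if id_match is None:
        return None
    return {
        "id": id_match.group(1),
        "pattern": match.group("pattern"),
        "variables": match.group("variables"),
    }
```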

---

### json2nginx.py

Converts OWASP JSON rules to Nginx WAF configuration.

```bash
python json2nginx.py
```

**Input**: `owasp_rules.json`
**Output**: `waf_patterns/nginx/`

**Generated Files**:
| File | Purpose |
|------|---------|
| `waf_maps.conf` | Map directives (http block) |
| `waf_rules.conf` | If statements (server block) |
| `README.md` | Integration instructions |

**Environment Variables**:
- `INPUT_FILE` - Path to OWASP JSON (default: `owasp_rules.json`)
- `OUTPUT_DIR` - Output directory (default: `waf_patterns/nginx`)
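
As a rough illustration of the conversion, the sketch below groups rule patterns by category and renders one `map` block per category. The function `render_maps` is hypothetical and not taken from `json2nginx.py`; the variable naming `$waf_block_<category>` follows the Nginx guide, and the string escaping is an assumption that should be checked against the real generator.

```python
import os
from collections import defaultdict

# Documented environment variables with their documented defaults.
INPUT_FILE = os.environ.get("INPUT_FILE", "owasp_rules.json")
OUTPUT_DIR = os.environ.get("OUTPUT_DIR", "waf_patterns/nginx")


def render_maps(rules):
    """Group rule patterns by category and render one map block per category."""
    by_category = defaultdict(list)
    for rule in rules:
        by_category[rule["category"]].append(rule["pattern"])

    blocks = []
    for category in sorted(by_category):
        lines = [f"map $request_uri $waf_block_{category} {{", "    default 0;"]
        for pattern in by_category[category]:
            # Assumed escaping for an nginx double-quoted string.
            escaped = pattern.replace("\\", "\\\\").replace('"', '\\"')
            lines.append(f'    "~*{escaped}" 1;')
        lines.append("}")
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)
```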

---

### json2apache.py

Converts OWASP JSON rules to Apache ModSecurity format.

```bash
python json2apache.py
```

**Input**: `owasp_rules.json`
**Output**: `waf_patterns/apache/`

**Generated Files**:
- Category-specific `.conf` files (sqli.conf, xss.conf, etc.)
- Each file contains ModSecurity `SecRule` directives

---

### json2traefik.py

Converts OWASP JSON rules to Traefik middleware configuration.

```bash
python json2traefik.py
```

**Input**: `owasp_rules.json`
**Output**: `waf_patterns/traefik/`

**Generated Files**:
- `middleware.toml` - Traefik middleware configuration
- `README.md` - Integration instructions

---

### json2haproxy.py

Converts OWASP JSON rules to HAProxy ACL format.

```bash
python json2haproxy.py
```

**Input**: `owasp_rules.json`
**Output**: `waf_patterns/haproxy/`

**Generated Files**:
- `waf.acl` - Main WAF ACL rules
- `README.md` - Integration instructions

---

### badbots.py

Generates bad bot blocking configurations from public bot lists.

```bash
python badbots.py
```

**Output**: Bot configurations in each `waf_patterns/*/` directory

**Features**:
- Fetches from multiple public bot lists
- Includes fallback sources for reliability
- Generates platform-specific configs

---

## Import Scripts

These scripts help import existing WAF configurations.

### import_nginx_waf.py

Import Nginx WAF patterns from external sources.

```bash
python import_nginx_waf.py --source /path/to/external/rules
```

### import_apache_waf.py

Import Apache ModSecurity rules.

```bash
python import_apache_waf.py --source /path/to/modsec/rules
```

### import_traefik_waf.py

Import Traefik middleware configurations.

```bash
python import_traefik_waf.py --source /path/to/traefik/config
```

### import_haproxy_waf.py

Import HAProxy ACL rules.

```bash
python import_haproxy_waf.py --source /path/to/haproxy/acl
```

---

## Data Structures

### owasp_rules.json Format

```json
[
  {
    "id": "942100",
    "pattern": "(?i:union.*select)",
    "category": "sqli",
    "severity": "critical",
    "location": "request-uri",
    "description": "SQL Injection Attack Detected"
  }
]
```

**Fields**:
| Field | Type | Description |
|-------|------|-------------|
| `id` | string | OWASP CRS rule ID |
| `pattern` | string | Regex pattern |
| `category` | string | Attack category (sqli, xss, rce, etc.) |
| `severity` | string | critical, high, medium, low |
| `location` | string | Where to match (request-uri, headers, etc.) |
| `description` | string | Human-readable description |
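
A converter that consumes this file may want to check each record before using it. The sketch below is one way to do that under the field list above; the `Rule` dataclass and `load_rules` function are hypothetical helpers, not part of the project's scripts.

```python
import json
from dataclasses import dataclass

ALLOWED_SEVERITIES = {"critical", "high", "medium", "low"}
REQUIRED_FIELDS = ("id", "pattern", "category", "severity", "location", "description")


@dataclass(frozen=True)
class Rule:
    id: str
    pattern: str
    category: str
    severity: str
    location: str
    description: str


def load_rules(text: str) -> list[Rule]:
    """Parse the JSON array and reject records that do not match the table."""
    rules = []
    for index, raw in enumerate(json.loads(text)):
        missing = [name for name in REQUIRED_FIELDS if name not in raw]
        if missing:
            raise ValueError(f"rule #{index} is missing fields: {missing}")
        if raw["severity"] not in ALLOWED_SEVERITIES:
            raise ValueError(f"rule #{index} has unknown severity {raw['severity']!r}")
        # Every documented field is a string, so normalise to str.
        rules.append(Rule(**{name: str(raw[name]) for name in REQUIRED_FIELDS}))
    return rules
```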

---

## Extending the Project

### Adding a New Platform

1. Create `json2<platform>.py` based on existing converters
2. Add output directory in `waf_patterns/<platform>/`
3. Update GitHub Actions workflow
4. Add documentation in `docs/`

### Custom Pattern Sources

Modify `owasp2json.py` to add new pattern sources:

```python
SOURCES = [
    "coreruleset/coreruleset",
    "your-org/your-rules",
]
```

---

## Dependencies

Listed in `requirements.txt`:

```
requests>=2.28.0
beautifulsoup4>=4.11.0
```

Install with:

```bash
pip install -r requirements.txt
```
191
docs/badbots.md
Normal file
@ -0,0 +1,191 @@
# Bad Bot Detection

This guide explains how to use the bad bot detection feature to block malicious crawlers and scrapers.

## Overview

The `badbots.py` script generates configuration files to block known malicious bots based on their User-Agent strings. It fetches bot lists from multiple public sources and generates blocking rules for each supported web server.

## How It Works

1. Fetches bot lists from public sources:
   - [ai.robots.txt](https://github.com/ai-robots-txt/ai.robots.txt)
   - Various community-maintained bot lists
2. Generates blocking configurations for each platform
3. Updates configurations daily via GitHub Actions
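
The fetch step with fallback sources can be sketched as follows. This is not the code of `badbots.py`: `fetch_bot_list` is a hypothetical helper, the download function is passed in so the sketch runs without network access, and the one-name-per-line list format is an assumption. The real script may also merge several sources rather than stop at the first one that works.

```python
from typing import Callable, Iterable


def fetch_bot_list(sources: Iterable[str], fetch: Callable[[str], str]) -> list[str]:
    """Return bot names from the first source that can be fetched."""
    errors = []
    for url in sources:
        try:
            text = fetch(url)
        except Exception as exc:  # any failure moves on to the next source
            errors.append(f"{url}: {exc}")
            continue
        # One bot name per line; skip blanks and comments, drop duplicates.
        names = []
        for line in text.splitlines():
            name = line.strip()
            if name and not name.startswith("#") and name not in names:
                names.append(name)
        if names:
            return names
    raise RuntimeError("all bot list sources failed: " + "; ".join(errors))
```

In practice `fetch` could be a thin wrapper around `requests.get(url, timeout=...)` that returns the response text.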

## Generated Files

| Platform | File | Format |
|----------|------|--------|
| Nginx | `bots.conf` | Map directive |
| Apache | `bots.conf` | ModSecurity rules |
| Traefik | `bots.toml` | Middleware config |
| HAProxy | `bots.acl` | ACL patterns |

## Nginx Bot Blocker

The Nginx configuration uses a map directive:

```nginx
# In http block
map $http_user_agent $bad_bot {
    default 0;
    "~*AhrefsBot" 1;
    "~*SemrushBot" 1;
    "~*MJ12bot" 1;
    "~*DotBot" 1;
    # ... more bots
}

# In server block
if ($bad_bot) {
    return 403;
}
```

### Integration

```nginx
http {
    include /path/to/waf_patterns/nginx/bots.conf;

    server {
        if ($bad_bot) {
            return 403;
        }
    }
}
```

## Apache Bot Blocker

Uses ModSecurity rules:

```apache
SecRule REQUEST_HEADERS:User-Agent "@rx AhrefsBot" \
    "id:200001,phase:1,deny,status:403,msg:'Bad Bot Blocked'"
```

## HAProxy Bot Blocker

Uses ACL rules:

```haproxy
acl bad_bot hdr_reg(User-Agent) -i -f /etc/haproxy/bots.acl
http-request deny if bad_bot
```

## Blocked Bot Categories

The following categories of bots are blocked by default:

### SEO/Marketing Crawlers
- AhrefsBot
- SemrushBot
- MJ12bot
- DotBot
- BLEXBot

### AI/ML Crawlers
- GPTBot
- ChatGPT-User
- CCBot
- Google-Extended
- Anthropic-AI

### Scrapers
- DataForSeoBot
- PetalBot
- Bytespider
- ClaudeBot

### Malicious Bots
- Known vulnerability scanners
- Spam bots
- Content scrapers

## Customization

### Add Custom Bots

Edit the generated file or add your own patterns:

```nginx
# Nginx: Add to bots.conf
"~*MyCustomBot" 1;
```

```apache
# Apache: Add rule
SecRule REQUEST_HEADERS:User-Agent "@rx MyCustomBot" \
    "id:200999,deny"
```

### Whitelist Bots

For Nginx, allow specific bots. Regular expressions in a `map` are checked in order of appearance, so list allowed bots before blocked ones:

```nginx
map $http_user_agent $bad_bot {
    default 0;
    "~*Googlebot" 0;  # Allow Google
    "~*AhrefsBot" 1;  # Block Ahrefs
}
```

### Allow All Bots for Specific Paths

A server-level `if ($bad_bot)` runs before a `location` is selected, so it cannot be overridden per path. Apply the check inside the locations you want to protect and omit it where bots are allowed:

```nginx
location /public-api {
    # No $bad_bot check here: bots are allowed
}

location / {
    if ($bad_bot) {
        return 403;
    }
}
```

## Generate Manually

Run the script to regenerate bot lists:

```bash
python badbots.py
```

The script falls back to alternative lists if the primary sources are unavailable.

## Monitoring

### Log Blocked Bots

Enable logging to track blocked requests. Nginx accepts `access_log` inside an `if` only when that `if` sits in a `location` block:

```nginx
location / {
    if ($bad_bot) {
        access_log /var/log/nginx/blocked_bots.log;
        return 403;
    }
}
```

### Analyze Bot Traffic

```bash
# Count blocked (403) requests per User-Agent, assuming the "combined" log format
awk -F'"' '$3 ~ /^ 403 / {print $6}' /var/log/nginx/access.log | \
    sort | uniq -c | sort -rn | head -20
```
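
For larger logs or further processing, the same count can be done in Python. This is a sketch that assumes Nginx's default `combined` log format; the helper name `count_blocked_agents` is hypothetical, and a custom `log_format` would need a different regular expression.

```python
import re
from collections import Counter

# Assumes nginx's default "combined" log format:
# addr - user [time] "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def count_blocked_agents(lines, status="403"):
    """Count User-Agent strings of requests answered with the given status."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("status") == status:
            counts[match.group("agent")] += 1
    return counts


if __name__ == "__main__":
    sample = [
        '1.2.3.4 - - [17/Dec/2025:09:45:34 +0000] "GET / HTTP/1.1" 403 153 "-" "AhrefsBot/7.0"',
        '1.2.3.5 - - [17/Dec/2025:09:45:35 +0000] "GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0"',
    ]
    for agent, hits in count_blocked_agents(sample).most_common(20):
        print(hits, agent)
```

To analyze a real log, pass an open file object instead of the sample list, for example `count_blocked_agents(open("/var/log/nginx/access.log"))`.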

## Best Practices

1. **Regular Updates**: The bot lists are updated daily. Pull the latest changes or download from releases.

2. **Monitor False Positives**: Some legitimate services may use blocked User-Agents. Monitor your logs.

3. **Combine with Rate Limiting**: Use bot blocking with rate limiting for comprehensive protection.

4. **Test Before Deploying**: Verify that legitimate traffic (search engines, monitoring) is not blocked.

::: warning
Blocking search engine bots (Googlebot, Bingbot) can negatively impact SEO. The default lists do **not** block major search engines.
:::
78
docs/getting-started.md
Normal file
@ -0,0 +1,78 @@
# Getting Started

This guide will help you get up and running with Patterns WAF configurations for your web server.

## Prerequisites

- **Python 3.11+** (if building from source)
- **pip** (Python package installer)
- **git** (for cloning the repository)

## Installation Options

### Option 1: Download Pre-Generated Configurations

The easiest way to get started is to download pre-built configurations:

1. Go to the [Releases](https://github.com/fabriziosalmi/patterns/releases) page
2. Download the ZIP file for your web server:
   - `nginx_waf.zip` - Nginx configurations
   - `apache_waf.zip` - Apache ModSecurity rules
   - `traefik_waf.zip` - Traefik middleware
   - `haproxy_waf.zip` - HAProxy ACL files
3. Extract and integrate into your server configuration

### Option 2: Build from Source

If you prefer to generate the configurations yourself:

```bash
# Clone the repository
git clone https://github.com/fabriziosalmi/patterns.git
cd patterns

# Install dependencies
pip install -r requirements.txt

# Fetch latest OWASP rules
python owasp2json.py

# Generate configurations for your platform
python json2nginx.py    # For Nginx
python json2apache.py   # For Apache
python json2traefik.py  # For Traefik
python json2haproxy.py  # For HAProxy

# Generate bad bot blockers
python badbots.py
```
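
If you rebuild regularly, the script sequence above can be wrapped in a small driver. The sketch below is only a convenience wrapper and is not part of the repository: `build_commands` and `run_all` are hypothetical names, and the script order simply follows the commands listed above. It is meant to be run from the repository root.

```python
import subprocess
import sys

ALL_PLATFORMS = ("nginx", "apache", "traefik", "haproxy")


def build_commands(platforms=ALL_PLATFORMS):
    """Return the commands in the documented order for the chosen platforms."""
    scripts = ["owasp2json.py"]                           # fetch OWASP rules first
    scripts += [f"json2{name}.py" for name in platforms]  # then the converters
    scripts.append("badbots.py")                          # finally the bot blockers
    return [[sys.executable, script] for script in scripts]


def run_all(platforms=ALL_PLATFORMS, dry_run=False):
    """Run each step in order and stop at the first failure."""
    for command in build_commands(platforms):
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Dry run by default: print the commands without executing them.
    run_all(dry_run=True)
```

Calling `run_all(["nginx"])` would execute only the Nginx converter between the fetch and bad bot steps.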

## Configuration Files

After running the scripts, you'll find the generated files in the `waf_patterns/` directory:

```
waf_patterns/
├── nginx/      # Nginx WAF configs
├── apache/     # Apache ModSecurity rules
├── traefik/    # Traefik middleware configs
└── haproxy/    # HAProxy ACL files
```

## Next Steps

Choose your web server to learn how to integrate the WAF configurations:

- [Nginx Integration](/nginx)
- [Apache Integration](/apache)
- [Traefik Integration](/traefik)
- [HAProxy Integration](/haproxy)

## Automatic Updates

The repository includes a GitHub Actions workflow that:
- Fetches the latest OWASP CRS rules **daily**
- Regenerates all WAF configurations
- Creates a new release with updated files

To get the latest rules, simply download from the [Releases](https://github.com/fabriziosalmi/patterns/releases) page or pull the latest changes if you cloned the repository.
192
docs/haproxy.md
Normal file
@ -0,0 +1,192 @@
# HAProxy Integration

This guide explains how to integrate the WAF patterns with HAProxy using ACL rules.

## Quick Start

1. Download `haproxy_waf.zip` from [Releases](https://github.com/fabriziosalmi/patterns/releases)
2. Extract the files
3. Include the ACL files in your HAProxy configuration

## Configuration Files

The HAProxy WAF package includes:

| File | Purpose |
|------|---------|
| `waf.acl` | Main WAF ACL rules |
| `bots.acl` | Bad bot detection ACLs |

## Integration

### Step 1: Include ACL Files

In your `haproxy.cfg`, include the WAF ACL files:

```haproxy
frontend http-in
    bind *:80

    # Include WAF ACL rules
    acl waf_block_sqli path_reg -i union.*select
    acl waf_block_sqli path_reg -i insert.*into
    acl waf_block_xss path_reg -i <script>

    # Or include from external file
    # acl waf_patterns path_reg -i -f /etc/haproxy/waf.acl

    # Block matching requests
    http-request deny if waf_block_sqli
    http-request deny if waf_block_xss

    default_backend servers
```

### Step 2: Include Bot Blockers

```haproxy
frontend http-in
    bind *:80

    # Bad bot detection
    acl bad_bot hdr_reg(User-Agent) -i -f /etc/haproxy/bots.acl
    http-request deny if bad_bot

    default_backend servers
```

### Step 3: Reload HAProxy

```bash
haproxy -c -f /etc/haproxy/haproxy.cfg && sudo systemctl reload haproxy
```

## ACL Rule Format

HAProxy ACLs use pattern matching on various request attributes:

```haproxy
# Match path
acl sqli_path path_reg -i union.*select

# Match query string
acl sqli_query url_param(id) -m reg -i union.*select

# Match headers
acl bad_referer hdr_reg(Referer) -i malicious-site\.com

# Combined conditions
http-request deny if sqli_path OR sqli_query
```
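
The difference between matching the path and matching a query parameter is easy to miss, so here is a small Python sketch that mimics which part of the request each ACL above inspects. It is an approximation only: Python's `re` stands in for HAProxy's regex engine, the helper names are hypothetical, and it assumes that HAProxy compares against the raw, still percent-encoded value unless a converter such as `url_dec` is applied.

```python
import re
from urllib.parse import urlsplit

# Mirrors "-i union.*select": a case-insensitive regular expression.
PATTERN = re.compile(r"union.*select", re.IGNORECASE)


def raw_param(query: str, name: str):
    """Return the first raw (undecoded) value of a query parameter, or None."""
    for pair in query.split("&"):
        key, _, value = pair.partition("=")
        if key == name:
            return value
    return None


def match_report(request_target: str) -> dict:
    """Show which of the two ACL styles would see the pattern."""
    parts = urlsplit(request_target)
    value = raw_param(parts.query, "id")
    return {
        # path_reg only looks at the path, never at the query string.
        "path_reg": bool(PATTERN.search(parts.path)),
        # url_param(id) only looks at the value of the "id" parameter.
        "url_param(id)": value is not None and bool(PATTERN.search(value)),
    }


if __name__ == "__main__":
    print(match_report("/search?id=1%27%20UNION%20SELECT%20*%20FROM%20users--"))
```

A payload carried in the query string is seen by the `url_param` ACL but not by `path_reg`, which is why the combined `sqli_path OR sqli_query` condition above covers both cases.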

## Complete Example

```haproxy
global
    log /dev/log local0
    maxconn 4096

defaults
    mode http
    log global
    option httplog
    timeout connect 5s
    timeout client 50s
    timeout server 50s

frontend http-in
    bind *:80

    # WAF Rules
    acl waf_sqli path_reg -i (union.*select|insert.*into|delete.*from)
    acl waf_xss path_reg -i (<script|javascript:|on\w+\s*=)
    acl waf_lfi path_reg -i (\.\.\/|\.\.\\)
    acl waf_rce path_reg -i (;|\||`|\$\()

    # Bot blocking
    acl bad_bot hdr_reg(User-Agent) -i (AhrefsBot|SemrushBot|MJ12bot)

    # Deny malicious requests
    http-request deny deny_status 403 if waf_sqli
    http-request deny deny_status 403 if waf_xss
    http-request deny deny_status 403 if waf_lfi
    http-request deny deny_status 403 if waf_rce
    http-request deny deny_status 403 if bad_bot

    default_backend servers

backend servers
    balance roundrobin
    server server1 127.0.0.1:8080 check
```

## Customization

### Custom Error Pages

Return a custom response body for blocked requests (keep the directive on a single line):

```haproxy
http-request deny deny_status 403 content-type text/html string "Access Denied" if waf_sqli
```

### Logging Blocked Requests

Create a dedicated log for WAF blocks:

```haproxy
frontend http-in
    # Log blocked requests
    http-request set-var(txn.blocked) str(1) if waf_sqli
    http-request capture var(txn.blocked) len 1

    # Custom log format
    log-format "%ci:%cp [%t] %ft %b/%s %Tq/%Tw/%Tc/%Tr/%Tt %ST %B %CC %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq blocked=%[var(txn.blocked)]"
```

### Whitelist Paths

Skip WAF for specific paths:

```haproxy
acl is_api path_beg /api/webhook
http-request deny if waf_sqli !is_api
```

## Rate Limiting

Combine WAF with rate limiting:

```haproxy
# Stick table for rate limiting
stick-table type ip size 100k expire 30s store http_req_rate(10s)
http-request track-sc0 src
acl too_many_requests sc_http_req_rate(0) gt 100

http-request deny if too_many_requests
```

## Testing

```bash
# Test SQL injection detection
curl -I "http://example.com/?id=1' UNION SELECT * FROM users--"

# Test bot blocking
curl -A "AhrefsBot" -I "http://example.com/"

# Check HAProxy stats
echo "show stat" | socat stdio /var/run/haproxy.sock
```

## Troubleshooting

### ACLs not matching
Use `haproxy -c -f haproxy.cfg` to validate syntax. Enable debug logging to see ACL evaluation. Note that `path_reg` inspects only the path, not the query string; use `url_param` or `url_reg` for patterns that arrive in query parameters.

### Performance impact
ACL evaluation is fast, but complex regex patterns can add latency. Test with realistic traffic.

### Configuration too large
HAProxy has limits on configuration size. Consider splitting large ACL lists into multiple files.
63
docs/index.md
Normal file
@ -0,0 +1,63 @@
---
layout: home

hero:
  name: Patterns
  text: OWASP WAF Rules for Web Servers
  tagline: Automated OWASP CRS patterns and Bad Bot detection for Nginx, Apache, Traefik, and HAProxy
  image:
    src: /shield.svg
    alt: Patterns
  actions:
    - theme: brand
      text: Get Started
      link: /getting-started
    - theme: alt
      text: View on GitHub
      link: https://github.com/fabriziosalmi/patterns

features:
  - icon: 🛡️
    title: OWASP CRS Protection
    details: Leverages OWASP Core Rule Set for web application firewall defense against SQLi, XSS, RCE, and LFI attacks.
  - icon: 🤖
    title: Bad Bot Blocking
    details: Blocks known malicious bots and scrapers using regularly updated public bot lists.
  - icon: ⚙️
    title: Multi-Server Support
    details: Generates WAF configs for Nginx, Apache, Traefik, and HAProxy with consistent protection across platforms.
  - icon: 🔄
    title: Daily Updates
    details: GitHub Actions automatically fetch new OWASP rules daily and push updated configurations.
  - icon: 📦
    title: Pre-Generated Configs
    details: Download ready-to-use WAF configurations from GitHub Releases without building from source.
  - icon: 🧩
    title: Extensible Design
    details: Modular architecture makes it easy to extend support to other web servers or load balancers.
---

## Quick Start

Download the latest configurations from [GitHub Releases](https://github.com/fabriziosalmi/patterns/releases) or build from source:

```bash
git clone https://github.com/fabriziosalmi/patterns.git
cd patterns
pip install -r requirements.txt
python owasp2json.py
python json2nginx.py  # or json2apache.py, json2traefik.py, json2haproxy.py
```

## Supported Platforms

| Platform | Config Format | Documentation |
|----------|---------------|---------------|
| **Nginx** | `.conf` files | [Read more →](/nginx) |
| **Apache** | ModSecurity rules | [Read more →](/apache) |
| **Traefik** | Middleware TOML | [Read more →](/traefik) |
| **HAProxy** | ACL files | [Read more →](/haproxy) |

::: tip Using Caddy?
Check out the [caddy-waf](https://github.com/fabriziosalmi/caddy-waf) project for Caddy-specific WAF support.
:::
131
docs/nginx.md
Normal file
@ -0,0 +1,131 @@
|
||||
# Nginx Integration
|
||||
|
||||
This guide explains how to integrate the WAF patterns into your Nginx configuration.
|
||||
|
||||
## Quick Start
|
||||
|
||||
1. Download `nginx_waf.zip` from [Releases](https://github.com/fabriziosalmi/patterns/releases)
|
||||
2. Extract to your Nginx configuration directory
|
||||
3. Include the configuration files as shown below
|
||||
|
||||
## Configuration Files
|
||||
|
||||
The Nginx WAF package includes:
|
||||
|
||||
| File | Purpose | Include Location |
|
||||
|------|---------|------------------|
|
||||
| `waf_maps.conf` | Map directives for pattern matching | `http` block |
|
||||
| `waf_rules.conf` | If statements for blocking | `server` block |
|
||||
| `bots.conf` | Bad bot detection maps | `http` block |
|
||||
|
||||
## Integration
|
||||
|
||||
### Step 1: Include Maps in HTTP Block
|
||||
|
||||
The map directives **must** be included in the `http` context:
|
||||
|
||||
```nginx
|
||||
http {
|
||||
# Include WAF maps (pattern definitions)
|
||||
include /path/to/waf_patterns/nginx/waf_maps.conf;
|
||||
|
||||
# Include bot detection maps
|
||||
include /path/to/waf_patterns/nginx/bots.conf;
|
||||
|
||||
# ... other http configurations ...
|
||||
}
|
||||
```
|
||||
|
||||
### Step 2: Include Rules in Server Block
|
||||
|
||||
The blocking rules go inside your `server` or `location` block:
|
||||
|
||||
```nginx
|
||||
server {
|
||||
listen 80;
|
||||
server_name example.com;
|
||||
|
||||
# Include WAF rules
|
||||
include /path/to/waf_patterns/nginx/waf_rules.conf;
|
||||
|
||||
# ... other server configurations ...
|
||||
}
|
||||
```
|
||||
|
||||
### Step 3: Reload Nginx
|
||||
|
||||
Test and reload the configuration:
|
||||
|
||||
```bash
|
||||
sudo nginx -t && sudo systemctl reload nginx
|
||||
```
|
||||
|
||||
## How It Works
|
||||
|
||||
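The WAF uses Nginx's `map` directive for efficient pattern matching. A `map` in the `http` context sets a flag variable, and an `if` in the `server` or `location` context returns 403 when that flag is set: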
|
||||
|
||||
```nginx
|
||||
map $request_uri $waf_block_sqli {
|
||||
default 0;
|
||||
"~*union.*select" 1;
|
||||
"~*insert.*into" 1;
|
||||
}
|
||||
|
||||
if ($waf_block_sqli) {
|
||||
return 403;
|
||||
}
|
||||
```
|
||||
|
||||
## Customization
|
||||
|
||||
### Enable Logging
|
||||
|
||||
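To log blocked requests, edit `waf_rules.conf` and uncomment the logging lines. Nginx accepts `access_log` inside an `if` only when that `if` sits in a `location` block, so include the file at the `location` level when logging is enabled: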
|
||||
|
||||
```nginx
|
||||
if ($waf_block_sqli) {
|
||||
return 403;
|
||||
access_log /var/log/nginx/waf_blocked.log;
|
||||
}
|
||||
```
|
||||
|
||||
### Whitelist Specific Paths
|
||||
|
||||
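Rules included at the `server` level run before Nginx selects a `location`, so a separate `location` block cannot skip them. To exempt a path, include the rules per `location` and leave them out of the exempted one:

```nginx
location /api/webhook {
    # Skip WAF for this path: waf_rules.conf is not included here
    # ... your configuration ...
}

# WAF rules for other paths
location / {
    include /path/to/waf_patterns/nginx/waf_rules.conf;
    # ... your configuration ...
}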
|
||||
```
|
||||
|
||||
::: warning Important
|
||||
Individual category files like `attack.conf` or `xss.conf` should **not** be included directly. They contain both `map` and `if` directives, and no single context accepts both: `map` is only valid in the `http` context, while `if` is only valid in `server` or `location`. Always use `waf_maps.conf` + `waf_rules.conf`.
|
||||
:::
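
To spot such mixed files before including anything, a quick scan can help. The sketch below assumes the category files are `.conf` files in a single directory that you pass as an argument, and that each directive starts its own line. `mixed_context_files` is a hypothetical helper name, not part of the WAF package.

```bash
#!/usr/bin/env bash
# Sketch: print every .conf file in a directory that contains both a
# `map` block and an `if` block, and so cannot be included as a whole.
# The directory argument is an assumption; adjust it to your layout.
mixed_context_files() {
  local f
  for f in "$1"/*.conf; do
    [ -f "$f" ] || continue
    if grep -Eq '^[[:space:]]*map[[:space:]]' "$f" &&
       grep -Eq '^[[:space:]]*if[[:space:]]*\(' "$f"; then
      printf '%s\n' "$f"
    fi
  done
}

# Run only when a directory is given on the command line.
if [ -n "${1:-}" ]; then
  mixed_context_files "$1"
fi
```

Files listed by the scan belong in neither the `http` nor the `server` block on their own.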
|
||||
|
||||
## Testing
|
||||
|
||||
Test your WAF configuration with common attack patterns:
|
||||
|
||||
```bash
|
||||
# Should be blocked (SQL injection)
|
||||
curl -I -G --data-urlencode "id=1' OR '1'='1" "http://example.com/"
|
||||
|
||||
# Should be blocked (XSS)
|
||||
curl -I "http://example.com/?q=<script>alert(1)</script>"
|
||||
```
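
For repeatable checks, the same kind of request could be wrapped in a small script. This is a sketch, not part of the WAF package: `WAF_TARGET`, `verdict`, and `probe` are hypothetical names, and it assumes blocked requests receive the 403 status that the rules return. Payloads are sent URL-encoded through curl's `--data-urlencode`, so whether a rule fires depends on how its pattern treats encoded characters.

```bash
#!/usr/bin/env bash
# Sketch only: WAF_TARGET is a hypothetical variable.
# Nothing is sent unless WAF_TARGET is set.

# Turn an HTTP status code into a short verdict.
verdict() {
  case "$1" in
    403) echo "blocked" ;;
    000) echo "no response" ;;   # curl reports 000 when the request failed
    *)   echo "passed ($1)" ;;
  esac
}

# Send one payload as a URL-encoded query parameter and print the verdict.
probe() {
  local label="$1" param="$2" code
  code=$(curl -s -o /dev/null -w '%{http_code}' -G \
    --data-urlencode "$param" "$WAF_TARGET")
  printf '%-14s %s\n' "$label" "$(verdict "$code")"
}

if [ -n "${WAF_TARGET:-}" ]; then
  probe "SQL injection" "id=1' OR '1'='1"
  probe "XSS" "q=<script>alert(1)</script>"
  probe "benign" "q=hello"   # control request that should not be blocked
fi
```

The benign request acts as a control: if it is also reported as blocked, the rules are likely too broad for that path.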
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Configuration errors
|
||||
Always run `nginx -t` before reloading to catch syntax errors.
|
||||
|
||||
### False positives
|
||||
If legitimate requests are being blocked, look for 403 responses in the access log, check `/var/log/nginx/error.log` for related errors, and consider adding path-specific exceptions.
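
When hunting for false positives, it can help to see which paths are rejected most often. The sketch below assumes the default `combined` access log format, where the request path is the seventh whitespace-separated field and the status code the ninth. `top_blocked` is a hypothetical helper name, and the log path is whatever your `access_log` directive points to.

```bash
#!/usr/bin/env bash
# Sketch: list the request paths most often answered with 403.
# Assumes the "combined" log format (path = field 7, status = field 9).
# Usage: top_blocked <logfile> [max_rows]
top_blocked() {
  awk '$9 == 403 { count[$7]++ }
       END { for (p in count) print count[p], p }' "$1" |
    sort -rn | head -n "${2:-10}"
}

# Run only when a log file is given on the command line.
if [ -n "${1:-}" ]; then
  top_blocked "$1" "${2:-10}"
fi
```

Paths that appear here but belong to legitimate features are candidates for path-specific exceptions.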
|
||||
|
||||
### Performance
|
||||
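The map-based approach is efficient: Nginx evaluates a map variable only when it is used and, by default, reuses the result for the rest of the request, so no extra caching needs to be enabled. Regex patterns are still tested in order, so on high-traffic sites measure latency after enabling the rules.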
|
||||
2468
docs/package-lock.json
generated
Normal file
File diff suppressed because it is too large
14
docs/package.json
Normal file
@ -0,0 +1,14 @@
|
||||
{
|
||||
"name": "patterns-docs",
|
||||
"version": "1.0.0",
|
||||
"private": true,
|
||||
"type": "module",
|
||||
"scripts": {
|
||||
"docs:dev": "vitepress dev",
|
||||
"docs:build": "vitepress build",
|
||||
"docs:preview": "vitepress preview"
|
||||
},
|
||||
"devDependencies": {
|
||||
"vitepress": "^1.5.0"
|
||||
}
|
||||
}
|
||||
168
docs/traefik.md
Normal file
@ -0,0 +1,168 @@
|
||||
# Traefik Integration
|
||||
|
||||
This guide explains how to integrate the WAF patterns with Traefik using middleware plugins.
|
||||
|
||||
## Quick Start
|
||||
|
||||
1. Download `traefik_waf.zip` from [Releases](https://github.com/fabriziosalmi/patterns/releases)
|
||||
2. Extract the files
|
||||
3. Configure the middleware in your Traefik configuration
|
||||
|
||||
## Configuration Files
|
||||
|
||||
The Traefik WAF package includes:
|
||||
|
||||
| File | Purpose |
|
||||
|------|---------|
|
||||
| `middleware.toml` | WAF middleware configuration |
|
||||
| `bots.toml` | Bad bot detection rules |
|
||||
|
||||
## Integration with File Provider
|
||||
|
||||
### Step 1: Enable File Provider
|
||||
|
||||
In your static configuration (`traefik.toml` or `traefik.yml`):
|
||||
|
||||
::: code-group
|
||||
|
||||
```toml [traefik.toml]
|
||||
[providers]
|
||||
[providers.file]
|
||||
directory = "/etc/traefik/dynamic"
|
||||
watch = true
|
||||
```
|
||||
|
||||
```yaml [traefik.yml]
|
||||
providers:
|
||||
file:
|
||||
directory: /etc/traefik/dynamic
|
||||
watch: true
|
||||
```
|
||||
|
||||
:::
|
||||
|
||||
### Step 2: Copy Middleware Files
|
||||
|
||||
Copy the WAF configuration files to your dynamic configuration directory:
|
||||
|
||||
```bash
|
||||
cp waf_patterns/traefik/*.toml /etc/traefik/dynamic/
|
||||
```
|
||||
|
||||
### Step 3: Apply Middleware to Routes
|
||||
|
||||
Reference the middleware in your router configuration:
|
||||
|
||||
::: code-group
|
||||
|
||||
```toml [dynamic/routes.toml]
|
||||
[http.routers.my-router]
|
||||
rule = "Host(`example.com`)"
|
||||
service = "my-service"
|
||||
middlewares = ["waf-protection", "bot-blocker"]
|
||||
|
||||
[http.middlewares.waf-protection.plugin.waf]
|
||||
# WAF configuration loaded from middleware.toml
|
||||
|
||||
[http.middlewares.bot-blocker.plugin.botblocker]
|
||||
# Bot blocking loaded from bots.toml
|
||||
```
|
||||
|
||||
```yaml [dynamic/routes.yml]
|
||||
http:
|
||||
routers:
|
||||
my-router:
|
||||
rule: "Host(`example.com`)"
|
||||
service: my-service
|
||||
middlewares:
|
||||
- waf-protection
|
||||
- bot-blocker
|
||||
```
|
||||
|
||||
:::
|
||||
|
||||
## Integration with Docker Labels
|
||||
|
||||
For Docker-based deployments:
|
||||
|
||||
```yaml
|
||||
services:
|
||||
my-app:
|
||||
image: my-app:latest
|
||||
labels:
|
||||
- "traefik.enable=true"
|
||||
- "traefik.http.routers.my-app.rule=Host(`example.com`)"
|
||||
- "traefik.http.routers.my-app.middlewares=waf@file"
|
||||
```
|
||||
|
||||
## Middleware Configuration
|
||||
|
||||
The `middleware.toml` contains regex-based blocking rules:
|
||||
|
||||
```toml
|
||||
[http.middlewares.waf.plugin.rewriteHeaders]
|
||||
# SQL Injection patterns
|
||||
[[http.middlewares.waf.plugin.rewriteHeaders.replacements]]
|
||||
regex = "(?i)union.*select"
|
||||
replacement = "BLOCKED"
|
||||
```
|
||||
|
||||
## Using with Traefik Plugins
|
||||
|
||||
For enhanced WAF capabilities, consider using community plugins, which are declared in the static configuration. The module name below is a placeholder:
|
||||
|
||||
```yaml
|
||||
experimental:
|
||||
plugins:
|
||||
waf:
|
||||
moduleName: "github.com/example/traefik-waf-plugin"
|
||||
version: "v1.0.0"
|
||||
```
|
||||
|
||||
## Customization
|
||||
|
||||
### Add Custom Patterns
|
||||
|
||||
Edit `middleware.toml` to add your own patterns:
|
||||
|
||||
```toml
|
||||
[[http.middlewares.waf.plugin.rewriteHeaders.replacements]]
|
||||
regex = "your-custom-pattern"
|
||||
replacement = "BLOCKED"
|
||||
```
|
||||
|
||||
### Logging
|
||||
|
||||
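Enable access logs in the static configuration to monitor blocked requests. Note that `defaultMode = "keep"` records all request headers, including `Authorization` and `Cookie`, so tighten it if the logs are shared: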
|
||||
|
||||
```toml
|
||||
[accessLog]
|
||||
filePath = "/var/log/traefik/access.log"
|
||||
format = "json"
|
||||
|
||||
[accessLog.fields]
|
||||
[accessLog.fields.headers]
|
||||
defaultMode = "keep"
|
||||
```
|
||||
|
||||
## Testing
|
||||
|
||||
```bash
|
||||
# Test WAF detection
|
||||
curl -G -H "Host: example.com" \
  --data-urlencode "id=1' OR '1'='1" "http://localhost/"
|
||||
|
||||
# Check Traefik logs
|
||||
docker logs traefik 2>&1 | grep -i blocked
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Middleware not loading
|
||||
Check that the file provider is correctly configured and watching the right directory.
|
||||
|
||||
### Routes not applying middleware
|
||||
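Ensure the middleware name matches exactly between the router and the middleware definition. When a router defined by another provider (for example Docker labels) references a middleware from the file provider, add the provider suffix, as in `waf@file`.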
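
To compare names quickly, the middleware names defined in the dynamic configuration can be listed from the TOML files. The sketch below assumes the files use `[http.middlewares.<name>...]` table headers like the examples above and live in one directory. `defined_middlewares` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Sketch: print the unique middleware names declared in *.toml files.
# Matches both [http.middlewares.NAME...] and [[http.middlewares.NAME...]].
# Usage: defined_middlewares <dynamic-config-directory>
defined_middlewares() {
  grep -ohE '^[[:space:]]*\[+http\.middlewares\.[^].]+' "$1"/*.toml 2>/dev/null |
    sed 's/.*\.//' | sort -u
}

# Run only when a directory is given on the command line.
if [ -n "${1:-}" ]; then
  defined_middlewares "$1"
fi
```

Any name a router references should appear in this list, with `@file` appended when the router is defined outside the file provider.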
|
||||
|
||||
### Performance considerations
|
||||
Traefik's regex-based middleware can impact performance at high traffic. Monitor latency after enabling WAF rules.
|
||||