<h1 align="center">Krawl</h1>

<h3 align="center">
  <a name="readme-top"></a>
  <img
    src="img/krawl-svg.svg"
    height="250"
  >
</h3>
<div align="center">

<p align="center">
  A modern, customizable web honeypot server designed to detect and track malicious activity from attackers and web crawlers through deceptive web pages, fake credentials, and canary tokens.
</p>

<div align="center">
  <a href="https://github.com/blessedrebus/krawl/blob/main/LICENSE">
    <img src="https://img.shields.io/github/license/blessedrebus/krawl" alt="License">
  </a>
  <a href="https://github.com/blessedrebus/krawl/releases">
    <img src="https://img.shields.io/github/v/release/blessedrebus/krawl" alt="Release">
  </a>
</div>

<div align="center">
  <a href="https://ghcr.io/blessedrebus/krawl">
    <img src="https://img.shields.io/badge/ghcr.io-krawl-blue" alt="GitHub Container Registry">
  </a>
  <a href="https://kubernetes.io/">
    <img src="https://img.shields.io/badge/kubernetes-ready-326CE5?logo=kubernetes&logoColor=white" alt="Kubernetes">
  </a>
  <a href="https://github.com/BlessedRebuS/Krawl/pkgs/container/krawl-chart">
    <img src="https://img.shields.io/badge/helm-chart-0F1689?logo=helm&logoColor=white" alt="Helm Chart">
  </a>
</div>

<br>

<p align="center">
  <a href="#what-is-krawl">What is Krawl?</a> •
  <a href="#-installation">Installation</a> •
  <a href="#honeypot-pages">Honeypot Pages</a> •
  <a href="#dashboard">Dashboard</a> •
  <a href="./ToDo.md">Todo</a> •
  <a href="#-contributing">Contributing</a>
</p>

<br>
</div>

## Demo
Tip: crawl the paths listed in `robots.txt` for additional fun.
### Krawl URL: [http://demo.krawlme.com](http://demo.krawlme.com)
### View the dashboard [http://demo.krawlme.com/das_dashboard](http://demo.krawlme.com/das_dashboard)

## What is Krawl?

**Krawl** is a cloud‑native deception server designed to detect, delay, and analyze malicious attackers, web crawlers and automated scanners.

It creates realistic fake web applications filled with low‑hanging fruit such as admin panels, configuration files, and exposed fake credentials to attract and identify suspicious activity.

By wasting attacker resources, Krawl helps clearly distinguish malicious behavior from legitimate crawlers.

It features:

- **Spider Trap Pages**: Infinite random links to waste crawler resources based on the [spidertrap project](https://github.com/adhdproject/spidertrap)
- **Fake Login Pages**: WordPress, phpMyAdmin, admin panels
- **Honeypot Paths**: Advertised in robots.txt to catch scanners
- **Fake Credentials**: Realistic-looking usernames, passwords, API keys
- **[Canary Token](#customizing-the-canary-token) Integration**: External alert triggering
- **Random Server Headers**: Confuse attacks that rely on the server header and version
- **Real-time Dashboard**: Monitor suspicious activity
- **Customizable Wordlists**: Easy JSON-based configuration
- **Random Error Injection**: Mimic real server behavior

![dashboard](img/deception-page.png)

![geoip](img/geoip_dashboard.png)

## 🚀 Installation

### Docker Run

Run Krawl with the latest image:

```bash
docker run -d \
  -p 5000:5000 \
  -e KRAWL_PORT=5000 \
  -e KRAWL_DELAY=100 \
  -e KRAWL_DASHBOARD_SECRET_PATH="/my-secret-dashboard" \
  -e KRAWL_DATABASE_RETENTION_DAYS=30 \
  --name krawl \
  ghcr.io/blessedrebus/krawl:latest
```

Access the server at `http://localhost:5000`

### Docker Compose

Create a `docker-compose.yaml` file:

```yaml
services:
  krawl:
    image: ghcr.io/blessedrebus/krawl:latest
    container_name: krawl-server
    ports:
      - "5000:5000"
    environment:
      - CONFIG_LOCATION=config.yaml
      - TZ=Europe/Rome
    volumes:
      - ./config.yaml:/app/config.yaml:ro
      # bind mount for firewall exporters
      - ./exports:/app/exports
      - krawl-data:/app/data
    restart: unless-stopped

volumes:
  krawl-data:
```

Run with:

```bash
docker-compose up -d
```

Stop with:

```bash
docker-compose down
```

### Kubernetes
**Krawl is also available natively on Kubernetes**. Install it either [via manifest](kubernetes/README.md) or [using the Helm chart](helm/README.md).

## Use Krawl to Ban Malicious IPs
Krawl uses a reputation-based system to classify attacker IP addresses. Every five minutes, Krawl exports the identified malicious IPs to a `malicious_ips.txt` file.

This file can either be mounted from the Docker container into another system or downloaded directly via `curl`:

```bash
curl https://your-krawl-instance/<DASHBOARD-PATH>/api/download/malicious_ips.txt
```

This file enables automatic blocking of malicious traffic across various platforms. You can use it to update firewall rules on:
* [OPNsense and pfSense](https://www.allthingstech.ch/using-opnsense-and-ip-blocklists-to-block-malicious-traffic)
* [RouterOS](https://rentry.co/krawl-routeros)
* [IPtables](plugins/iptables/README.md) and [Nftables](plugins/nftables/README.md)
* [Fail2Ban](plugins/fail2ban/README.md)
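
As a rough illustration of consuming the export, the sketch below turns the contents of `malicious_ips.txt` into firewall rules. It assumes the file holds one IP address per line, which the text above does not confirm. The function names `parse_blocklist` and `to_drop_rules` are hypothetical, and the bundled plugins may work differently.

```python
import ipaddress


def parse_blocklist(text: str) -> list[str]:
    """Return the valid, de-duplicated IP addresses found in the export."""
    seen = set()
    result = []
    for line in text.splitlines():
        candidate = line.strip()
        if not candidate or candidate.startswith("#"):
            continue  # skip blank lines and comments
        try:
            ip = str(ipaddress.ip_address(candidate))
        except ValueError:
            continue  # skip anything that is not a bare IP address
        if ip not in seen:
            seen.add(ip)
            result.append(ip)
    return result


def to_drop_rules(ips: list[str]) -> list[str]:
    """Build one iptables/ip6tables DROP command per address."""
    rules = []
    for ip in ips:
        tool = "ip6tables" if ipaddress.ip_address(ip).version == 6 else "iptables"
        rules.append(f"{tool} -A INPUT -s {ip} -j DROP")
    return rules
```

Validating each line with `ipaddress` before building a command keeps malformed or unexpected entries out of the shell.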

## IP Reputation
Krawl uses [periodic tasks](src/tasks/analyze_ips.py) that analyze recent traffic to build and continuously update an IP reputation score. Each run evaluates every active IP address against multiple behavioral indicators and classifies it as an attacker, crawler, or regular user. Thresholds are fully customizable.

![ip reputation](img/ip-reputation.png)

The analysis includes:
- **Risky HTTP methods usage** (e.g. POST, PUT, DELETE ratios)
- **Robots.txt violations**
- **Request timing anomalies** (bursty or irregular patterns)
- **User-Agent consistency**
- **Attack URL detection** (e.g. SQL injection, XSS patterns)

Each signal contributes to a weighted scoring model that assigns a reputation category:
- `attacker`
- `bad_crawler`
- `good_crawler`
- `regular_user`
- `unknown` (for insufficient data)

The resulting scores and metrics are stored in the database and used by Krawl to drive dashboards, reputation tracking, and automated mitigation actions such as IP banning or firewall integration.
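
A minimal sketch of how such a weighted model could combine the signals. The thresholds reuse the documented defaults from the Configuration section. The weights, the category cutoffs, the `min_requests` floor, and every name in the block are invented for illustration and are not taken from `analyze_ips.py`. The `good_crawler` category is left out because the text does not say how known-good bots are recognized.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class IpStats:
    total_requests: int
    risky_method_requests: int
    robots_violations: int
    request_timestamps: list[float]
    user_agents: set[str]
    attack_url_hits: int


# Thresholds mirror the documented defaults; weights are hypothetical.
THRESHOLDS = {"risky": 0.1, "robots": 0.1, "timing_cv": 0.5, "user_agents": 2, "attack_urls": 1}
WEIGHTS = {"risky": 1, "robots": 2, "timing": 1, "user_agents": 1, "attack_urls": 3}


def timing_cv(timestamps: list[float]) -> float:
    """Coefficient of variation (std / mean) of the gaps between requests."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if len(gaps) < 2 or mean(gaps) == 0:
        return 0.0
    return pstdev(gaps) / mean(gaps)


def score(stats: IpStats) -> int:
    """Sum the weights of every signal that reaches its threshold."""
    if stats.total_requests == 0:
        return 0
    fired = {
        "risky": stats.risky_method_requests / stats.total_requests >= THRESHOLDS["risky"],
        "robots": stats.robots_violations / stats.total_requests >= THRESHOLDS["robots"],
        "timing": timing_cv(stats.request_timestamps) >= THRESHOLDS["timing_cv"],
        "user_agents": len(stats.user_agents) >= THRESHOLDS["user_agents"],
        "attack_urls": stats.attack_url_hits >= THRESHOLDS["attack_urls"],
    }
    return sum(WEIGHTS[name] for name, hit in fired.items() if hit)


def classify(stats: IpStats, min_requests: int = 5) -> str:
    """Map a score to a category; the cutoffs 4 and 2 are assumptions."""
    if stats.total_requests < min_requests:
        return "unknown"
    total = score(stats)
    if total >= 4:
        return "attacker"
    if total >= 2:
        return "bad_crawler"
    return "regular_user"
```

Giving attack URL hits the largest weight reflects the idea that a single injection attempt is stronger evidence than irregular timing alone, but the real weighting may differ.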

## Forward server header
If Krawl is deployed behind a proxy such as NGINX, the **server header** should be forwarded using the following configuration in your proxy:

```nginx
location / {
    proxy_pass https://your-krawl-instance;
    proxy_pass_header Server;
}
```

## API
Krawl relies on the following external APIs:
- http://ip-api.com (IP Data)
- https://iprep.lcrawl.com (IP Reputation)
- https://nominatim.openstreetmap.org/reverse (Reverse Geocoding)
- https://api.ipify.org (Public IP discovery)
- http://ident.me (Public IP discovery)
- https://ifconfig.me (Public IP discovery)

## Configuration
Krawl uses a **configuration hierarchy** in which **environment variables take precedence over the configuration file**. Environment variables are the recommended approach for Docker deployments and quick out-of-the-box customization.
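
The precedence rule can be pictured with a small sketch. It assumes each variable name is the upper-cased config field prefixed with `KRAWL_`, and that values are converted to the type of the existing setting. `apply_env_overrides` and the sample fields are hypothetical and do not reproduce Krawl's actual loader.

```python
import os


def apply_env_overrides(config: dict, environ=None, prefix: str = "KRAWL_") -> dict:
    """Return a copy of `config` where matching environment variables win."""
    environ = os.environ if environ is None else environ
    merged = dict(config)
    for field, current in config.items():
        raw = environ.get(prefix + field.upper())
        if raw is None:
            continue  # no override: keep the value from the config file
        if isinstance(current, bool):  # check bool first, since bool is an int
            merged[field] = raw.strip().lower() in ("1", "true", "yes")
        elif isinstance(current, int):
            merged[field] = int(raw)
        elif isinstance(current, float):
            merged[field] = float(raw)
        elif isinstance(current, tuple):  # ranges use the "min,max" format
            merged[field] = tuple(int(part) for part in raw.split(","))
        else:
            merged[field] = raw
    return merged


file_config = {"port": 5000, "delay": 100, "links_per_page_range": (10, 15)}
effective = apply_env_overrides(file_config, {"KRAWL_PORT": "8080"})
```

Here `effective["port"]` comes from the environment while the other fields keep their file values.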

### Configuration via Environment Variables

| Environment Variable | Description | Default |
|----------------------|-------------|---------|
| `CONFIG_LOCATION` | Path to the YAML config file | `config.yaml` |
| `KRAWL_PORT` | Server listening port | `5000` |
| `KRAWL_DELAY` | Response delay in milliseconds | `100` |
| `KRAWL_SERVER_HEADER` | HTTP Server header for deception | `""` |
| `KRAWL_LINKS_LENGTH_RANGE` | Link length range as `min,max` | `5,15` |
| `KRAWL_LINKS_PER_PAGE_RANGE` | Links per page as `min,max` | `10,15` |
| `KRAWL_CHAR_SPACE` | Characters used for link generation | `abcdefgh...` |
| `KRAWL_MAX_COUNTER` | Initial counter value | `10` |
| `KRAWL_CANARY_TOKEN_URL` | External canary token URL | None |
| `KRAWL_CANARY_TOKEN_TRIES` | Requests before showing canary token | `10` |
| `KRAWL_DASHBOARD_SECRET_PATH` | Custom dashboard path | Auto-generated |
| `KRAWL_PROBABILITY_ERROR_CODES` | Error response probability (0-100%) | `0` |
| `KRAWL_DATABASE_PATH` | Database file location | `data/krawl.db` |
| `KRAWL_EXPORTS_PATH` | Path where firewall rule sets are exported | `exports` |
| `KRAWL_BACKUPS_PATH` | Path where database dumps are saved | `backups` |
| `KRAWL_BACKUPS_CRON` | Cron expression controlling the backup job schedule | `*/30 * * * *` |
| `KRAWL_BACKUPS_ENABLED` | Boolean to enable db dump job | `true` |
| `KRAWL_DATABASE_RETENTION_DAYS` | Days to retain data in database | `30` |
| `KRAWL_HTTP_RISKY_METHODS_THRESHOLD` | Threshold for risky HTTP methods detection | `0.1` |
| `KRAWL_VIOLATED_ROBOTS_THRESHOLD` | Threshold for robots.txt violations | `0.1` |
| `KRAWL_UNEVEN_REQUEST_TIMING_THRESHOLD` | Coefficient of variation threshold for timing | `0.5` |
| `KRAWL_UNEVEN_REQUEST_TIMING_TIME_WINDOW_SECONDS` | Time window for request timing analysis in seconds | `300` |
| `KRAWL_USER_AGENTS_USED_THRESHOLD` | Threshold for detecting multiple user agents | `2` |
| `KRAWL_ATTACK_URLS_THRESHOLD` | Threshold for attack URL detection | `1` |
| `KRAWL_INFINITE_PAGES_FOR_MALICIOUS` | Serve infinite pages to malicious IPs | `true` |
| `KRAWL_MAX_PAGES_LIMIT` | Maximum page limit for crawlers | `250` |
| `KRAWL_BAN_DURATION_SECONDS` | Ban duration in seconds for rate-limited IPs | `600` |
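
To illustrate how the link-generation settings in the table interact, here is a sketch of a spider-trap link generator. It parses the `min,max` range format and draws random paths from a character set. The function names are hypothetical, and the character set used here (lowercase letters and digits) is an assumption, since the default `KRAWL_CHAR_SPACE` value is abbreviated in the table.

```python
import random
import string


def parse_range(value: str) -> tuple[int, int]:
    """Parse a "min,max" string such as "5,15" into a pair of integers."""
    low, high = (int(part) for part in value.split(","))
    if low > high:
        raise ValueError(f"invalid range: {value!r}")
    return low, high


def generate_links(
    rng: random.Random,
    links_per_page: str = "10,15",
    links_length: str = "5,15",
    char_space: str = string.ascii_lowercase + string.digits,  # assumed
) -> list[str]:
    """Return a random number of random paths for one trap page."""
    min_links, max_links = parse_range(links_per_page)
    min_len, max_len = parse_range(links_length)
    links = []
    for _ in range(rng.randint(min_links, max_links)):
        length = rng.randint(min_len, max_len)
        links.append("/" + "".join(rng.choice(char_space) for _ in range(length)))
    return links


page_links = generate_links(random.Random())
```

Because every generated path leads to another page of fresh random links, a crawler that follows them never runs out of URLs to visit.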

For example:

```bash
# Point Krawl at the config file
export CONFIG_LOCATION="config.yaml"

# Set the canary token URL
export KRAWL_CANARY_TOKEN_URL="http://your-canary-token-url"

# Set the links-per-page range (min,max format)
export KRAWL_LINKS_PER_PAGE_RANGE="5,25"

# Set analyzer thresholds
export KRAWL_HTTP_RISKY_METHODS_THRESHOLD="0.2"
export KRAWL_VIOLATED_ROBOTS_THRESHOLD="0.15"

# Set custom dashboard path
export KRAWL_DASHBOARD_SECRET_PATH="/my-secret-dashboard"
```

Example of a Docker run with env variables:

```bash
docker run -d \
  -p 5000:5000 \
  -e KRAWL_PORT=5000 \
  -e KRAWL_DELAY=100 \
  -e KRAWL_CANARY_TOKEN_URL="http://your-canary-token-url" \
  --name krawl \
  ghcr.io/blessedrebus/krawl:latest
```

### Configuration via config.yaml
You can use the [config.yaml](config.yaml) file for more advanced configurations, such as Docker Compose or Helm chart deployments.

# Honeypot
Below is a complete overview of the Krawl honeypot’s capabilities.

## robots.txt
The actual (juicy) `robots.txt` served by Krawl is available in [src/templates/html/robots.txt](src/templates/html/robots.txt).

## Honeypot pages

### Common Login Attempts
Requests to common admin endpoints (`/admin/`, `/wp-admin/`, `/phpMyAdmin/`) return a fake login page. Any login attempt triggers a 1-second delay to simulate real processing and is fully logged in the dashboard (credentials, IP, headers, timing).

![admin page](img/admin-page.png)

### Common Misconfiguration Paths
Requests to paths like `/backup/`, `/config/`, `/database/`, `/private/`, or `/uploads/` return a fake directory listing populated with “interesting” files, each assigned a random file size to look realistic.

![directory-page](img/directory-page.png)

### Environment File Leakage
The `.env` endpoint exposes fake database connection strings, **AWS API keys**, and **Stripe secrets**. It intentionally returns an error due to the `Content-Type` being `application/json` instead of plain text, mimicking a "juicy" misconfiguration that crawlers and scanners often flag as information leakage.

### Server Error Information
The `/server` page displays randomly generated fake error information for each known server.

![server and env page](img/server-and-env-page.png)

### API Endpoints with Sensitive Data
The pages `/api/v1/users` and `/api/v2/secrets` show fake users and random secrets in JSON format.

![users and secrets](img/users-and-secrets.png)

### Exposed Credential Files
The pages `/credentials.txt` and `/passwords.txt` show fake users and random secrets.

![credentials and passwords](img/credentials-and-passwords.png)

### SQL Injection and XSS Detection
Pages such as `/users`, `/search`, `/contact`, `/info`, `/input`, and `/feedback`, along with APIs like `/api/sql` and `/api/database`, are designed to lure attackers into performing attacks such as **SQL injection** or **XSS**.

![sql injection](img/sql_injection.png)

Automated tools like **SQLMap** will receive a different randomized database error on each request, increasing scan noise and confusing the attacker. All detected attacks are logged and displayed in the dashboard.

### Path Traversal Detection
Krawl detects and responds to **path traversal** attempts targeting common system files like `/etc/passwd`, `/etc/shadow`, or Windows system paths. When an attacker tries to access sensitive files using patterns like `../../../etc/passwd` or encoded variants (`%2e%2e/`, `%252e`), Krawl returns convincing fake file contents with realistic system users, UIDs, GIDs, and shell configurations. This wastes attacker time while logging the full attack pattern.
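
The detection step described above can be sketched as follows. The request path is percent-decoded repeatedly so that double-encoded variants such as `%252e` are unmasked, then checked for `..` segments and for the sensitive files named in the text. The names `fully_decode` and `is_path_traversal` and the three-round decoding limit are assumptions, not Krawl's actual implementation.

```python
import re
from urllib.parse import unquote

# A ".." path segment delimited by slashes, backslashes, or string boundaries.
DOTDOT_SEGMENT = re.compile(r"(^|[\\/])\.\.([\\/]|$)")
SENSITIVE_FILES = ("/etc/passwd", "/etc/shadow")


def fully_decode(path: str, max_rounds: int = 3) -> str:
    """Percent-decode until the value stops changing (bounded)."""
    for _ in range(max_rounds):
        decoded = unquote(path)
        if decoded == path:
            break
        path = decoded
    return path


def is_path_traversal(path: str) -> bool:
    """True if the decoded path climbs directories or names a sensitive file."""
    decoded = fully_decode(path)
    if DOTDOT_SEGMENT.search(decoded):
        return True
    return any(name in decoded.lower() for name in SENSITIVE_FILES)
```

Bounding the number of decoding rounds keeps a deliberately nested payload from turning the check itself into a resource drain.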

### XXE (XML External Entity) Injection
The `/api/xml` and `/api/parser` endpoints accept XML input and are designed to detect **XXE injection** attempts. When attackers try to exploit external entity declarations (`<!ENTITY`, `<!DOCTYPE`, `SYSTEM`) or reference entities to access local files, Krawl responds with realistic XML responses that appear to process the entities successfully. The honeypot returns fake file contents, simulated entity values (like `admin_credentials` or `database_connection`), or realistic error messages, making the attack appear successful while fully logging the payload.
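
A minimal sketch of the marker check described above. It scans the raw request body for the declarations named in the text and pulls out any `SYSTEM` identifiers for logging. `inspect_xml` is a hypothetical helper; the body is deliberately never handed to a real XML parser, so no entity is ever resolved.

```python
import re

# Declarations that indicate an external-entity attempt.
XXE_MARKERS = re.compile(r"<!ENTITY|<!DOCTYPE|\bSYSTEM\b", re.IGNORECASE)
# The quoted target that follows a SYSTEM keyword, e.g. "file:///etc/passwd".
SYSTEM_ID = re.compile(r"""SYSTEM\s+["']([^"']+)["']""", re.IGNORECASE)


def inspect_xml(body: str) -> dict:
    """Report whether the body looks like XXE and which targets it names."""
    return {
        "suspicious": bool(XXE_MARKERS.search(body)),
        "system_ids": SYSTEM_ID.findall(body),
    }
```

A plain keyword match like this will also flag harmless documents that contain the word `SYSTEM`, which is acceptable for a honeypot where every request is suspect anyway.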

### Command Injection Detection
Pages like `/api/exec`, `/api/run`, and `/api/system` simulate command execution endpoints vulnerable to **command injection**. When attackers attempt to inject shell commands using patterns like `; whoami`, `| cat /etc/passwd`, or backticks, Krawl responds with realistic command outputs. For example, `whoami` returns fake usernames like `www-data` or `nginx`, while `uname` returns fake Linux kernel versions. Network commands like `wget` or `curl` simulate downloads or return "command not found" errors, creating believable responses that delay and confuse automated exploitation tools.
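
The behavior above could be approximated like this: find the first shell metacharacter, take the word after it as the injected command, and answer with canned output. All names and the fake output strings are hypothetical; only the `www-data` and `nginx` usernames and the "command not found" idea come from the description above.

```python
import random
import re

# Shell metacharacters that commonly introduce an injected command.
INJECTION_MARKER = re.compile(r"[;|&`]|\$\(")

# Canned responses; the uname line is an invented example.
FAKE_OUTPUTS = {
    "whoami": ["www-data", "nginx"],
    "uname": ["Linux web01 5.15.0-91-generic x86_64 GNU/Linux"],
}


def extract_injected_command(value: str):
    """Return the first word after an injection marker, or None."""
    match = INJECTION_MARKER.search(value)
    if match is None:
        return None
    remainder = value[match.end():].strip(" `)|&;")
    words = remainder.split()
    return words[0] if words else None


def fake_command_output(value: str, rng: random.Random):
    """Return believable output for the injected command, or None if benign."""
    command = extract_injected_command(value)
    if command is None:
        return None
    outputs = FAKE_OUTPUTS.get(command)
    if outputs is None:
        return f"sh: 1: {command}: not found"
    return rng.choice(outputs)
```

For instance, `fake_command_output("8.8.8.8; whoami", random.Random())` yields one of the fake usernames, while an unlisted command such as `wget` gets a "not found" line.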

## Example usage behind reverse proxy

You can configure a reverse proxy so that all web requests land on Krawl by default, while your real content stays behind a secret URL. For example:

```nginx
location / {
    proxy_pass https://your-krawl-instance;
    proxy_pass_header Server;
}

location /my-hidden-service {
    proxy_pass https://my-hidden-service;
    proxy_pass_header Server;
}
```

Alternatively, you can create several "interesting"-looking subdomains that point to Krawl. For example:

- admin.example.com
- portal.example.com
- sso.example.com
- login.example.com
- ...

Additionally, you can configure your reverse proxy to forward all non-existent subdomains (e.g. nonexistent.example.com) to one of these domains, so that crawlers guessing subdomains at random automatically end up at your Krawl instance.

## Enable database dump job for backups

To enable the database dump job, set the following values (*config file example*):

```yaml
backups:
    path: "backups" # where backups will be saved
    cron: "*/30 * * * *" # schedule of the backup job
    enabled: true
```


## Customizing the Canary Token

To create a custom canary token, visit https://canarytokens.org and generate a “Web bug” canary token.

This optional token is triggered when a crawler keeps traversing the generated pages until the page counter reaches 0. At that point, Krawl returns the canary token URL. When that URL is requested, an alert is sent to you by email, including the visitor’s IP address and user agent.


To enable this feature, set the canary token URL [using the environment variable](#configuration-via-environment-variables) `KRAWL_CANARY_TOKEN_URL`.

## Customizing the wordlist

Edit `wordlists.json` to customize the fake data for your use case:

```json
{
  "usernames": {
    "prefixes": ["admin", "root", "user"],
    "suffixes": ["_prod", "_dev", "123"]
  },
  "passwords": {
    "prefixes": ["P@ssw0rd", "Admin"],
    "simple": ["test", "password"]
  },
  "directory_listing": {
    "files": ["credentials.txt", "backup.sql"],
    "directories": ["admin/", "backup/"]
  }
}
```

For Helm chart installations, edit **values.yaml** instead.
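
To show how entries like these might be turned into fake credentials, the sketch below combines username prefixes and suffixes and picks passwords from the two password lists. The combination rules (prefix plus suffix, and prefix plus a three-digit number) and the function names are assumptions for illustration; Krawl's own generator may combine them differently.

```python
import json
import random

# A trimmed copy of the wordlist structure shown above.
WORDLISTS_JSON = """
{
  "usernames": {"prefixes": ["admin", "root", "user"], "suffixes": ["_prod", "_dev", "123"]},
  "passwords": {"prefixes": ["P@ssw0rd", "Admin"], "simple": ["test", "password"]}
}
"""

wordlists = json.loads(WORDLISTS_JSON)


def fake_username(lists: dict, rng: random.Random) -> str:
    """Join a random prefix with a random suffix, e.g. "root_dev"."""
    names = lists["usernames"]
    return rng.choice(names["prefixes"]) + rng.choice(names["suffixes"])


def fake_password(lists: dict, rng: random.Random) -> str:
    """Return a simple password, or a prefix followed by three digits."""
    passwords = lists["passwords"]
    if rng.random() < 0.5:
        return rng.choice(passwords["simple"])
    return f"{rng.choice(passwords['prefixes'])}{rng.randint(100, 999)}"
```

Calling `fake_username(wordlists, random.Random())` repeatedly produces a varied but plausible set of accounts from a short wordlist.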

## Dashboard

Access the dashboard at `http://<server-ip>:<port>/<dashboard-path>`

The dashboard shows:
- Total and unique accesses
- Suspicious activity and attack detection
- Top IPs, paths, user-agents and GeoIP localization
- Real-time monitoring

Attacker access to honeypot endpoints and related suspicious activity (such as failed login attempts) is logged.

Krawl also implements a scoring system designed to distinguish between malicious and legitimate behavior on the website.

![dashboard-1](img/dashboard-1.png)

The top IP addresses are shown along with the top paths and user agents.

![dashboard-2](img/dashboard-2.png)

![dashboard-3](img/dashboard-3.png)

## 🤝 Contributing

Contributions welcome! Please:
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Submit a pull request (explain the changes!)


<div align="center">

## ⚠️ Disclaimer

**This is a deception/honeypot system.**
Deploy in isolated environments and monitor carefully for security events.
Use responsibly and in compliance with applicable laws and regulations.

## Star History
<img src="https://api.star-history.com/svg?repos=BlessedRebuS/Krawl&type=Date" width="600" alt="Star History Chart" />
-												Feat/attack map improvement (#57)

* feat: enhance IP reputation management with city data and geolocation integration

* feat: enhance dashboard with city coordinates and improved marker handling

* feat: update chart version to 0.2.1 in Chart.yaml, README.md, and values.yaml

* feat: update logo format and size in README.md

* feat: improve location display logic in dashboard for attackers and IPs
											
										
										
											2026-01-27 16:56:34 +01:00
+								<h1 align="center">Krawl</h1>
-												Configuration override from environment variable (#47)

* Add environment variable override for config fields

Introduces functions to override configuration fields from environment variables, allowing dynamic configuration without modifying YAML files. The environment variable names are generated from field names, and type conversion is handled for int, float, and tuple fields.

* update chart version to 0.1.4

* Update README.md to enhance environment variable configuration details and improve overall clarity
											
										
										
											2026-01-23 17:34:23 +01:00
 								<h3 align="center">
 								  <a name="readme-top"></a>
 								  <img
-												Feat/attack map improvement (#57)

* feat: enhance IP reputation management with city data and geolocation integration

* feat: enhance dashboard with city coordinates and improved marker handling

* feat: update chart version to 0.2.1 in Chart.yaml, README.md, and values.yaml

* feat: update logo format and size in README.md

* feat: improve location display logic in dashboard for attackers and IPs
											
										
										
											2026-01-27 16:56:34 +01:00
+								    src="img/krawl-svg.svg"
 								    height="250"
-												Configuration override from environment variable (#47)

* Add environment variable override for config fields

Introduces functions to override configuration fields from environment variables, allowing dynamic configuration without modifying YAML files. The environment variable names are generated from field names, and type conversion is handled for int, float, and tuple fields.

* update chart version to 0.1.4

* Update README.md to enhance environment variable configuration details and improve overall clarity
											
										
										
											2026-01-23 17:34:23 +01:00
+								  >
 								</h3>
 								<div align="center">
 								<p align="center">
-												Doc/updated documentation (#60)

* added documentation, updated repo pointer in the dashboard, added dashboard link highlighting and mionor fixes

* added doc

* added logo to dashboard

* Fixed dashboard attack chart

* Enhance fake data generation with varied request counts for better visualization

* Add automatic migrations and support for latitude/longitude in IP stats

* Update Helm chart version to 0.2.2 and add timezone configuration option

---------

Co-authored-by: BlessedRebuS <patrick.difa@gmail.com>
											
										
										
											2026-01-29 11:55:06 +01:00
+								  A modern, customizable web honeypot server designed to detect and track malicious activity from attackers and web crawlers through deceptive web pages, fake credentials, and canary tokens.
-												Configuration override from environment variable (#47)

* Add environment variable override for config fields

Introduces functions to override configuration fields from environment variables, allowing dynamic configuration without modifying YAML files. The environment variable names are generated from field names, and type conversion is handled for int, float, and tuple fields.

* update chart version to 0.1.4

* Update README.md to enhance environment variable configuration details and improve overall clarity
											
										
										
											2026-01-23 17:34:23 +01:00
+								</p>
 								<div align="center">
 								  <a href="https://github.com/blessedrebus/krawl/blob/main/LICENSE">
 								    <img src="https://img.shields.io/github/license/blessedrebus/krawl" alt="License">
 								  </a>
 								  <a href="https://github.com/blessedrebus/krawl/releases">
 								    <img src="https://img.shields.io/github/v/release/blessedrebus/krawl" alt="Release">
 								  </a>
 								</div>
 								<div align="center">
 								  <a href="https://ghcr.io/blessedrebus/krawl">
 								    <img src="https://img.shields.io/badge/ghcr.io-krawl-blue" alt="GitHub Container Registry">
 								  </a>
 								  <a href="https://kubernetes.io/">
 								    <img src="https://img.shields.io/badge/kubernetes-ready-326CE5?logo=kubernetes&logoColor=white" alt="Kubernetes">
 								  </a>
 								  <a href="https://github.com/BlessedRebuS/Krawl/pkgs/container/krawl-chart">
 								    <img src="https://img.shields.io/badge/helm-chart-0F1689?logo=helm&logoColor=white" alt="Helm Chart">
 								  </a>
 								</div>
 								<br>
 								<p align="center">
 								  <a href="#what-is-krawl">What is Krawl?</a> •
-												Feat/deployment update (#56)

* feat: update analyzer thresholds and add crawl configuration options

* feat: update Helm chart version and add README for installation instructions

* feat: update installation instructions in README and add Docker support

* feat: update deployment manifests and configuration for improved service handling and analyzer settings

* feat: add API endpoint for paginated IP retrieval and enhance dashboard visualization with category filters

* feat: update configuration for Krawl service to use external config file

* feat: refactor code for improved readability and consistency across multiple files

* feat: remove Flake8, Pylint, and test steps from PR checks workflow
											
										
										
											2026-01-26 12:36:22 +01:00
+								  <a href="#-installation">Installation</a> •
-												Configuration override from environment variable (#47)

* Add environment variable override for config fields

Introduces functions to override configuration fields from environment variables, allowing dynamic configuration without modifying YAML files. The environment variable names are generated from field names, and type conversion is handled for int, float, and tuple fields.

* update chart version to 0.1.4

* Update README.md to enhance environment variable configuration details and improve overall clarity
											
										
										
											2026-01-23 17:34:23 +01:00
+								  <a href="#honeypot-pages">Honeypot Pages</a> •
 								  <a href="#dashboard">Dashboard</a> •
 								  <a href="./ToDo.md">Todo</a> •
 								  <a href="#-contributing">Contributing</a>
 								</p>
 								<br>
 								</div>
 								## Demo
 								Tip: crawl the `robots.txt` paths for additional fun
 								### Krawl URL: [http://demo.krawlme.com](http://demo.krawlme.com)
 								### View the dashboard [http://demo.krawlme.com/das_dashboard](http://demo.krawlme.com/das_dashboard)
 								## What is Krawl?
-												Doc/updated documentation (#60)

* added documentation, updated repo pointer in the dashboard, added dashboard link highlighting and mionor fixes

* added doc

* added logo to dashboard

* Fixed dashboard attack chart

* Enhance fake data generation with varied request counts for better visualization

* Add automatic migrations and support for latitude/longitude in IP stats

* Update Helm chart version to 0.2.2 and add timezone configuration option

---------

Co-authored-by: BlessedRebuS <patrick.difa@gmail.com>
											
										
										
											2026-01-29 11:55:06 +01:00
+								**Krawl** is a cloud‑native deception server designed to detect, delay, and analyze malicious attackers, web crawlers and automated scanners.
-												Configuration override from environment variable (#47)

* Add environment variable override for config fields

Introduces functions to override configuration fields from environment variables, allowing dynamic configuration without modifying YAML files. The environment variable names are generated from field names, and type conversion is handled for int, float, and tuple fields.

* update chart version to 0.1.4

* Update README.md to enhance environment variable configuration details and improve overall clarity
											
										
										
											2026-01-23 17:34:23 +01:00
 								It creates realistic fake web applications filled with low‑hanging fruit such as admin panels, configuration files, and exposed fake credentials to attract and identify suspicious activity.
 								By wasting attacker resources, Krawl helps clearly distinguish malicious behavior from legitimate crawlers.
 								It features:
 								- **Spider Trap Pages**: Infinite random links to waste crawler resources based on the [spidertrap project](https://github.com/adhdproject/spidertrap)
 								- **Fake Login Pages**: WordPress, phpMyAdmin, admin panels
 								- **Honeypot Paths**: Advertised in robots.txt to catch scanners
 								- **Fake Credentials**: Realistic-looking usernames, passwords, API keys
 								- **[Canary Token](#customizing-the-canary-token) Integration**: External alert triggering
-												Doc/updated documentation (#60)

* added documentation, updated repo pointer in the dashboard, added dashboard link highlighting and mionor fixes

* added doc

* added logo to dashboard

* Fixed dashboard attack chart

* Enhance fake data generation with varied request counts for better visualization

* Add automatic migrations and support for latitude/longitude in IP stats

* Update Helm chart version to 0.2.2 and add timezone configuration option

---------

Co-authored-by: BlessedRebuS <patrick.difa@gmail.com>
											
										
										
											2026-01-29 11:55:06 +01:00
+								- **Random server headers**: Confuse attacks based on server header and version
-												Configuration override from environment variable (#47)

* Add environment variable override for config fields

Introduces functions to override configuration fields from environment variables, allowing dynamic configuration without modifying YAML files. The environment variable names are generated from field names, and type conversion is handled for int, float, and tuple fields.

* update chart version to 0.1.4

* Update README.md to enhance environment variable configuration details and improve overall clarity
											
										
										
											2026-01-23 17:34:23 +01:00
+								- **Real-time Dashboard**: Monitor suspicious activity
 								- **Customizable Wordlists**: Easy JSON-based configuration
 								- **Random Error Injection**: Mimic real server behavior
-												Doc/updated documentation (#60)

* added documentation, updated repo pointer in the dashboard, added dashboard link highlighting and mionor fixes

* added doc

* added logo to dashboard

* Fixed dashboard attack chart

* Enhance fake data generation with varied request counts for better visualization

* Add automatic migrations and support for latitude/longitude in IP stats

* Update Helm chart version to 0.2.2 and add timezone configuration option

---------

Co-authored-by: BlessedRebuS <patrick.difa@gmail.com>
											
										
										
											2026-01-29 11:55:06 +01:00
+								![dashboard](img/deception-page.png)
 								![geoip](img/geoip_dashboard.png)
-												Configuration override from environment variable (#47)

* Add environment variable override for config fields

Introduces functions to override configuration fields from environment variables, allowing dynamic configuration without modifying YAML files. The environment variable names are generated from field names, and type conversion is handled for int, float, and tuple fields.

* update chart version to 0.1.4

* Update README.md to enhance environment variable configuration details and improve overall clarity
											
										
										
											2026-01-23 17:34:23 +01:00
-												Feat/deployment update (#56)

* feat: update analyzer thresholds and add crawl configuration options

* feat: update Helm chart version and add README for installation instructions

* feat: update installation instructions in README and add Docker support

* feat: update deployment manifests and configuration for improved service handling and analyzer settings

* feat: add API endpoint for paginated IP retrieval and enhance dashboard visualization with category filters

* feat: update configuration for Krawl service to use external config file

* feat: refactor code for improved readability and consistency across multiple files

* feat: remove Flake8, Pylint, and test steps from PR checks workflow
											
										
										
											2026-01-26 12:36:22 +01:00
+								## 🚀 Installation
-												Configuration override from environment variable (#47)

* Add environment variable override for config fields

Introduces functions to override configuration fields from environment variables, allowing dynamic configuration without modifying YAML files. The environment variable names are generated from field names, and type conversion is handled for int, float, and tuple fields.

* update chart version to 0.1.4

* Update README.md to enhance environment variable configuration details and improve overall clarity
											
										
										
											2026-01-23 17:34:23 +01:00
-												Feat/deployment update (#56)

* feat: update analyzer thresholds and add crawl configuration options

* feat: update Helm chart version and add README for installation instructions

* feat: update installation instructions in README and add Docker support

* feat: update deployment manifests and configuration for improved service handling and analyzer settings

* feat: add API endpoint for paginated IP retrieval and enhance dashboard visualization with category filters

* feat: update configuration for Krawl service to use external config file

* feat: refactor code for improved readability and consistency across multiple files

* feat: remove Flake8, Pylint, and test steps from PR checks workflow
											
										
										
											2026-01-26 12:36:22 +01:00
### Docker Run

Run Krawl with the latest image:

```bash
docker run -d \
  -p 5000:5000 \
  -e KRAWL_PORT=5000 \
  -e KRAWL_DELAY=100 \
  -e KRAWL_DASHBOARD_SECRET_PATH="/my-secret-dashboard" \
  -e KRAWL_DATABASE_RETENTION_DAYS=30 \
  --name krawl \
  ghcr.io/blessedrebus/krawl:latest
```

Access the server at `http://localhost:5000`.

### Docker Compose

Create a `docker-compose.yaml` file:

```yaml
services:
  krawl:
    image: ghcr.io/blessedrebus/krawl:latest
    container_name: krawl-server
    ports:
      - "5000:5000"
    environment:
      - CONFIG_LOCATION=config.yaml
      - TZ=Europe/Rome
    volumes:
      - ./config.yaml:/app/config.yaml:ro
      # bind mount for firewall exporters
      - ./exports:/app/exports
      - krawl-data:/app/data
    restart: unless-stopped

volumes:
  krawl-data:
```

Run with:

```bash
docker-compose up -d
```

Stop with:

```bash
docker-compose down
```

### Kubernetes

**Krawl is also available natively on Kubernetes**. Installation can be done either [via manifest](kubernetes/README.md) or [using the helm chart](helm/README.md).

## Use Krawl to Ban Malicious IPs

Krawl uses a reputation-based system to classify attacker IP addresses. Every five minutes, Krawl exports the identified malicious IPs to a `malicious_ips.txt` file.

This file can either be mounted from the Docker container into another system or downloaded directly via `curl`:

```bash
curl https://your-krawl-instance/<DASHBOARD-PATH>/api/download/malicious_ips.txt
```

This file enables automatic blocking of malicious traffic across various platforms. You can use it to update firewall rules on:

* [OPNsense and pfSense](https://www.allthingstech.ch/using-opnsense-and-ip-blocklists-to-block-malicious-traffic)
* [RouterOS](https://rentry.co/krawl-routeros)
* [iptables](plugins/iptables/README.md) and [nftables](plugins/nftables/README.md)
* [Fail2Ban](plugins/fail2ban/README.md)

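
As a rough illustration of how the exported list could feed a firewall, the sketch below parses the downloaded text and produces one `iptables` (or `ip6tables`) drop rule per valid address. It assumes the file holds one IP address per line, which is not confirmed here, and `build_drop_rules` is a hypothetical helper rather than part of Krawl. The plugin READMEs linked above remain the reference for real integrations.

```python
import ipaddress


def build_drop_rules(blocklist_text: str) -> list[str]:
    """Turn a newline-separated IP list into iptables/ip6tables DROP commands.

    Assumption: one IP address per line; blank lines and '#' comments are skipped.
    """
    rules = []
    seen = set()
    for raw_line in blocklist_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        try:
            ip = ipaddress.ip_address(line)
        except ValueError:
            # Ignore anything that is not a valid IPv4/IPv6 address.
            continue
        if ip in seen:
            continue  # skip duplicates so each address yields a single rule
        seen.add(ip)
        tool = "iptables" if ip.version == 4 else "ip6tables"
        rules.append(f"{tool} -A INPUT -s {ip} -j DROP")
    return rules


if __name__ == "__main__":
    # Sample text standing in for the contents of malicious_ips.txt.
    sample = "203.0.113.7\n198.51.100.23\nnot-an-ip\n203.0.113.7\n"
    for rule in build_drop_rules(sample):
        print(rule)
```

The function only builds command strings. Review them before applying, since a wrong rule on `INPUT` can lock you out of the host.
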
## IP Reputation

Krawl uses [tasks that analyze recent traffic](src/tasks/analyze_ips.py) to build and continuously update an IP reputation score. The analysis runs periodically and evaluates each active IP address against multiple behavioral indicators to classify it as an attacker, crawler, or regular user. Thresholds are fully customizable.

![ip reputation](img/ip-reputation.png)

The analysis includes:

- **Risky HTTP methods usage** (e.g. POST, PUT, DELETE ratios)
- **Robots.txt violations**
- **Request timing anomalies** (bursty or irregular patterns)
- **User-Agent consistency**
- **Attack URL detection** (e.g. SQL injection, XSS patterns)

Each signal contributes to a weighted scoring model that assigns a reputation category:

- `attacker`
- `bad_crawler`
- `good_crawler`
- `regular_user`
- `unknown` (for insufficient data)

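
To make the idea of a weighted scoring model concrete, here is a minimal sketch that combines the signals listed above into a single score and maps it to one of the categories. The weights, thresholds, signal names, and the `classify_ip` function are all illustrative assumptions. Krawl's actual logic lives in `src/tasks/analyze_ips.py` and may be structured differently, for example with a separate score per category.

```python
# Illustrative weights and thresholds only; these are NOT Krawl's real values.
WEIGHTS = {
    "risky_http_methods": 0.25,
    "robots_violations": 0.15,
    "timing_anomalies": 0.10,
    "user_agent_inconsistency": 0.10,
    "attack_urls": 0.40,
}
MIN_REQUESTS = 5  # below this, there is not enough data to classify
ATTACKER_THRESHOLD = 0.6
BAD_CRAWLER_THRESHOLD = 0.3


def classify_ip(signals: dict[str, float], request_count: int, is_known_bot: bool = False) -> str:
    """Map per-IP signals (each expected in [0, 1]) to a reputation category."""
    if request_count < MIN_REQUESTS:
        return "unknown"
    # Weighted sum: missing signals count as 0.
    score = sum(weight * signals.get(name, 0.0) for name, weight in WEIGHTS.items())
    if score >= ATTACKER_THRESHOLD:
        return "attacker"
    if score >= BAD_CRAWLER_THRESHOLD:
        return "bad_crawler"
    if is_known_bot:
        return "good_crawler"
    return "regular_user"


if __name__ == "__main__":
    print(classify_ip({"attack_urls": 1.0, "risky_http_methods": 1.0}, request_count=50))
    print(classify_ip({}, request_count=2))
```

Keeping weights and thresholds in plain constants mirrors the statement that thresholds are customizable: tuning the model means changing numbers, not code paths.
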
The resulting scores and metrics are stored in the database and used by Krawl to drive dashboards, reputation tracking, and automated mitigation actions such as IP banning or firewall integration.

## Forward server header

If Krawl is deployed behind a proxy such as NGINX, the **Server header** should be forwarded using the following configuration in your proxy:

```nginx
location / {
    proxy_pass https://your-krawl-instance;
    proxy_pass_header Server;
}
```

## API

Krawl uses the following external APIs:

- http://ip-api.com (IP Data)
- https://iprep.lcrawl.com (IP Reputation)
- https://nominatim.openstreetmap.org/reverse (Reverse geocoding)
- https://api.ipify.org (Public IP discovery)
- http://ident.me (Public IP discovery)
- https://ifconfig.me (Public IP discovery)
## Configuration

Krawl uses a **configuration hierarchy** in which **environment variables take precedence over the configuration file**. Environment variables are the recommended approach for Docker deployments and quick out-of-the-box customization.

### Configuration via Environment Variables

| Environment Variable | Description | Default |
|----------------------|-------------|---------|
| `CONFIG_LOCATION` | Path to the YAML config file | `config.yaml` |
| `KRAWL_PORT` | Server listening port | `5000` |
| `KRAWL_DELAY` | Response delay in milliseconds | `100` |
| `KRAWL_SERVER_HEADER` | HTTP Server header for deception | `""` |
| `KRAWL_LINKS_LENGTH_RANGE` | Link length range as `min,max` | `5,15` |
| `KRAWL_LINKS_PER_PAGE_RANGE` | Links per page as `min,max` | `10,15` |
| `KRAWL_CHAR_SPACE` | Characters used for link generation | `abcdefgh...` |
| `KRAWL_MAX_COUNTER` | Initial counter value | `10` |
| `KRAWL_CANARY_TOKEN_URL` | External canary token URL | None |
| `KRAWL_CANARY_TOKEN_TRIES` | Requests before showing canary token | `10` |
| `KRAWL_DASHBOARD_SECRET_PATH` | Custom dashboard path | Auto-generated |
| `KRAWL_PROBABILITY_ERROR_CODES` | Error response probability (0-100%) | `0` |
| `KRAWL_DATABASE_PATH` | Database file location | `data/krawl.db` |
| `KRAWL_EXPORTS_PATH` | Path where firewall rule sets are exported | `exports` |
| `KRAWL_BACKUPS_PATH` | Path where database dumps are saved | `backups` |
| `KRAWL_BACKUPS_CRON` | Cron expression controlling the backup job schedule | `*/30 * * * *` |
| `KRAWL_BACKUPS_ENABLED` | Boolean that enables the database dump job | `true` |
| `KRAWL_DATABASE_RETENTION_DAYS` | Days to retain data in database | `30` |
| `KRAWL_HTTP_RISKY_METHODS_THRESHOLD` | Threshold for risky HTTP methods detection | `0.1` |
| `KRAWL_VIOLATED_ROBOTS_THRESHOLD` | Threshold for robots.txt violations | `0.1` |
| `KRAWL_UNEVEN_REQUEST_TIMING_THRESHOLD` | Coefficient of variation threshold for timing | `0.5` |
| `KRAWL_UNEVEN_REQUEST_TIMING_TIME_WINDOW_SECONDS` | Time window for request timing analysis in seconds | `300` |
| `KRAWL_USER_AGENTS_USED_THRESHOLD` | Threshold for detecting multiple user agents | `2` |
| `KRAWL_ATTACK_URLS_THRESHOLD` | Threshold for attack URL detection | `1` |
| `KRAWL_INFINITE_PAGES_FOR_MALICIOUS` | Serve infinite pages to malicious IPs | `true` |
| `KRAWL_MAX_PAGES_LIMIT` | Maximum page limit for crawlers | `250` |
| `KRAWL_BAN_DURATION_SECONDS` | Ban duration in seconds for rate-limited IPs | `600` |
For example:

```bash
# Set the config file location
export CONFIG_LOCATION="config.yaml"

# Set canary token
export KRAWL_CANARY_TOKEN_URL="http://your-canary-token-url"

# Set number of links per page (min,max format)
export KRAWL_LINKS_PER_PAGE_RANGE="5,25"

# Set analyzer thresholds
export KRAWL_HTTP_RISKY_METHODS_THRESHOLD="0.2"
export KRAWL_VIOLATED_ROBOTS_THRESHOLD="0.15"

# Set custom dashboard path
export KRAWL_DASHBOARD_SECRET_PATH="/my-secret-dashboard"
```
Example of a Docker run with environment variables:

```bash
docker run -d \
  -p 5000:5000 \
  -e KRAWL_PORT=5000 \
  -e KRAWL_DELAY=100 \
  -e KRAWL_CANARY_TOKEN_URL="http://your-canary-token-url" \
  --name krawl \
  ghcr.io/blessedrebus/krawl:latest
```
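
As a rough illustration of the precedence rule, the Python sketch below resolves a setting by checking the environment first and falling back to the value loaded from the config file. It is not Krawl's actual loader: `env_name`, `resolve_setting`, and the `KRAWL_<FIELD>` naming rule are assumptions inferred from the variable names in the table above.

```python
import os


def env_name(field):
    """Hypothetical mapping from a config field name to its variable name."""
    return f"KRAWL_{field.upper()}"


def resolve_setting(field, file_value, environ=None):
    """Return the environment value if it is set, else the config-file value.

    The environment string is converted to the type of the file value.
    Tuples are parsed from the `min,max` format used in the table.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get(env_name(field))
    if raw is None:
        return file_value  # nothing to override
    if isinstance(file_value, bool):  # check bool before int: bool is an int subclass
        return raw.strip().lower() in ("1", "true", "yes")
    if isinstance(file_value, int):
        return int(raw)
    if isinstance(file_value, float):
        return float(raw)
    if isinstance(file_value, tuple):
        return tuple(int(part) for part in raw.split(","))
    return raw
```

Under these assumptions, exporting `KRAWL_PORT=8080` makes `resolve_setting("port", 5000)` return `8080`, while an unset variable leaves the file value `5000` in place.
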
### Configuration via config.yaml

You can use the [config.yaml](config.yaml) file for more advanced configurations, such as Docker Compose or Helm chart deployments.

# Honeypot

Below is a complete overview of the Krawl honeypot’s capabilities.
## robots.txt

The actual (juicy) `robots.txt` served by Krawl is defined in [src/templates/html/robots.txt](src/templates/html/robots.txt).
## Honeypot pages

### Common Login Attempts

Requests to common admin endpoints (`/admin/`, `/wp-admin/`, `/phpMyAdmin/`) return a fake login page. Any login attempt triggers a 1-second delay to simulate real processing and is fully logged in the dashboard (credentials, IP, headers, timing).

![admin page](img/admin-page.png)
### Common Misconfiguration Paths

Requests to paths like `/backup/`, `/config/`, `/database/`, `/private/`, or `/uploads/` return a fake directory listing populated with “interesting” files, each assigned a random file size to look realistic.

![directory-page](img/directory-page.png)
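
A fake listing of this kind could be generated roughly as follows. This is a minimal sketch, not Krawl's template: the `BAIT_FILES` names, the size range, and the `fake_directory_listing` helper are all hypothetical.

```python
import html
import random

# Hypothetical bait file names; Krawl's real list may differ.
BAIT_FILES = ["backup.sql", "config.bak", "db_dump.tar.gz", "id_rsa", "users.csv"]


def fake_directory_listing(path, rng=None):
    """Build a simple HTML index page with a random size for every file."""
    rng = random.Random() if rng is None else rng
    safe_path = html.escape(path, quote=True)  # never echo raw attacker input
    rows = []
    for name in BAIT_FILES:
        size_kb = rng.randint(1, 50_000)  # random size so the listing looks real
        rows.append(f'<li><a href="{safe_path}{name}">{name}</a> ({size_kb} KB)</li>')
    return f"<h1>Index of {safe_path}</h1>\n<ul>\n" + "\n".join(rows) + "\n</ul>"
```

Passing a seeded `random.Random` makes the output reproducible for tests, while the default gives different sizes on every request.
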
### Environment File Leakage

The `.env` endpoint exposes fake database connection strings, **AWS API keys**, and **Stripe secrets**. It intentionally returns an error because the `Content-Type` is `application/json` instead of plain text, mimicking a "juicy" misconfiguration that crawlers and scanners often flag as information leakage.
### Server Error Information

The `/server` page displays randomly generated fake error information for each known server.

![server and env page](img/server-and-env-page.png)
### API Endpoints with Sensitive Data

The `/api/v1/users` and `/api/v2/secrets` endpoints return fake users and random secrets in JSON format.

![users and secrets](img/users-and-secrets.png)
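
To give an idea of what such a response might contain, the sketch below builds a JSON document of random, meaningless secrets. The `fake_secrets` helper and the field names are hypothetical and do not reflect Krawl's actual payload.

```python
import json
import secrets


def fake_secrets(count=3):
    """Return a JSON document of random values that only look like secrets."""
    items = [
        {"name": f"service_{i}_api_key", "value": secrets.token_hex(16)}
        for i in range(1, count + 1)
    ]
    return json.dumps({"secrets": items}, indent=2)
```

Because the values are generated per call, two requests never return the same "secrets", so nothing useful can be harvested from them.
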
### Exposed Credential Files

The `/credentials.txt` and `/passwords.txt` pages show fake users and random secrets.

![credentials and passwords](img/credentials-and-passwords.png)
### SQL Injection and XSS Detection

Pages such as `/users`, `/search`, `/contact`, `/info`, `/input`, and `/feedback`, along with APIs like `/api/sql` and `/api/database`, are designed to lure attackers into performing attacks such as **SQL injection** or **XSS**.

![sql injection](img/sql_injection.png)

Automated tools like **SQLMap** will receive a different randomized database error on each request, increasing scan noise and confusing the attacker. All detected attacks are logged and displayed in the dashboard.
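
The randomized-error idea can be sketched in a few lines of Python. The error strings below are illustrative placeholders modeled on common database messages, and `random_db_error` is a hypothetical helper, not Krawl's implementation.

```python
import random

# Illustrative error templates from different database engines (assumed).
DB_ERRORS = [
    "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version",
    "ERROR: unterminated quoted string at or near \"'\"",
    "ORA-00933: SQL command not properly ended",
    "Unclosed quotation mark after the character string ''.",
]


def random_db_error(rng=None):
    """Pick a different-looking database error for each request."""
    rng = random.Random() if rng is None else rng
    return rng.choice(DB_ERRORS)
```

Mixing errors from several engines means a scanner cannot settle on a single database fingerprint, which is what produces the extra scan noise.
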
### Path Traversal Detection

Krawl detects and responds to **path traversal** attempts targeting common system files like `/etc/passwd`, `/etc/shadow`, or Windows system paths. When an attacker tries to access sensitive files using patterns like `../../../etc/passwd` or encoded variants (`%2e%2e/`, `%252e`), Krawl returns convincing fake file contents with realistic system users, UIDs, GIDs, and shell configurations. This wastes attacker time while logging the full attack pattern.
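
One way to catch both plain and encoded variants is to URL-decode the path repeatedly and test each form. The sketch below assumes this approach. The `TRAVERSAL` pattern and the `is_path_traversal` helper are hypothetical and much simpler than a production detector.

```python
import re
from urllib.parse import unquote

# Matches "../", "..\" and direct references to well-known sensitive files.
TRAVERSAL = re.compile(r"\.\.[/\\]|/etc/(passwd|shadow)", re.IGNORECASE)


def is_path_traversal(raw_path, max_decodes=3):
    """Decode repeatedly so `%2e%2e/` and double-encoded `%252e` are caught."""
    candidate = raw_path
    for _ in range(max_decodes + 1):
        if TRAVERSAL.search(candidate):
            return True
        decoded = unquote(candidate)
        if decoded == candidate:  # nothing left to decode
            break
        candidate = decoded
    return False
```

Bounding the number of decode passes keeps the check cheap even if a request is wrapped in many layers of encoding.
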
### XXE (XML External Entity) Injection

The `/api/xml` and `/api/parser` endpoints accept XML input and are designed to detect **XXE injection** attempts. When attackers try to exploit external entity declarations (`<!ENTITY`, `<!DOCTYPE`, `SYSTEM`) or reference entities to access local files, Krawl responds with realistic XML responses that appear to process the entities successfully. The honeypot returns fake file contents, simulated entity values (like `admin_credentials` or `database_connection`), or realistic error messages, making the attack appear successful while fully logging the payload.
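
A marker-based check like the one described can be sketched as below. The body is never parsed as XML, so the honeypot itself cannot be exploited. `looks_like_xxe`, `fake_xxe_response`, and the response bodies are assumptions for illustration only.

```python
import re

# The markers named above. SYSTEM is matched as a whole word.
XXE_MARKERS = re.compile(r"<!ENTITY|<!DOCTYPE|\bSYSTEM\b", re.IGNORECASE)


def looks_like_xxe(xml_body):
    """Return True if the raw body contains an XXE-style declaration."""
    return bool(XXE_MARKERS.search(xml_body))


def fake_xxe_response(xml_body):
    """Pretend the entity was resolved by returning fake file contents."""
    if not looks_like_xxe(xml_body):
        return "<response><status>ok</status></response>"
    fake_file = "root:x:0:0:root:/root:/bin/bash"
    return f"<response><status>ok</status><data>{fake_file}</data></response>"
```

A plain text match can produce false positives on benign XML that happens to contain the word `system`, which is usually acceptable for a honeypot where every request is suspect anyway.
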
### Command Injection Detection

Pages like `/api/exec`, `/api/run`, and `/api/system` simulate command execution endpoints vulnerable to **command injection**. When attackers attempt to inject shell commands using patterns like `; whoami`, `| cat /etc/passwd`, or backticks, Krawl responds with realistic command outputs. For example, `whoami` returns fake usernames like `www-data` or `nginx`, while `uname` returns fake Linux kernel versions. Network commands like `wget` or `curl` simulate downloads or return "command not found" errors, creating believable responses that delay and confuse automated exploitation tools.
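
The behavior above could be approximated by splitting the input on shell separators and answering the injected command from a lookup table. This is a minimal sketch: the `INJECTION` pattern, the `FAKE_OUTPUT` table, and `fake_exec` are hypothetical, and nothing is ever executed.

```python
import re

# Shell separators that commonly introduce an injected command.
INJECTION = re.compile(r"[;|`]|\$\(")

# Hypothetical canned outputs; unknown commands get a "not found" error.
FAKE_OUTPUT = {
    "whoami": "www-data",
    "uname": "Linux web01 5.15.0-91-generic x86_64 GNU/Linux",
}


def fake_exec(user_input):
    """Return fake output for an injected command, or None for benign input."""
    if not INJECTION.search(user_input):
        return None
    parts = [part.strip(" )") for part in INJECTION.split(user_input)]
    for part in parts[1:]:  # text after the first separator is the injection
        if part:
            command = part.split()[0]
            return FAKE_OUTPUT.get(command, f"sh: 1: {command}: not found")
    return ""
```

For instance, `fake_exec("127.0.0.1; whoami")` yields `www-data`, while an input without separators returns `None` and can be handled as a normal request.
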
## Example usage behind reverse proxy

You can configure a reverse proxy so that all web requests land on the Krawl page by default, and hide your real content behind a secret URL. For example:

```bash
location / {
    proxy_pass https://your-krawl-instance;
    proxy_pass_header Server;
}
location /my-hidden-service {
    proxy_pass https://my-hidden-service;
    proxy_pass_header Server;
}
```

Alternatively, you can create several "interesting"-looking domains. For example:

- admin.example.com
- portal.example.com
- sso.example.com
- login.example.com
- ...

Additionally, you may configure your reverse proxy to forward all non-existing subdomains (e.g. nonexistent.example.com) to one of these domains, so that crawlers guessing domains at random automatically end up at your Krawl instance.
## Enable database dump job for backups

To enable the database dump job, set the following values in the config file:

```yaml
backups:
    path: "backups" # where backups are saved
    cron: "*/30 * * * *" # schedule of the backup job
    enabled: true
```
+								## Customizing the Canary Token
-												added parameter in config file to disable backup job

											
										
										
											2026-02-22 16:01:39 +01:00
-												Configuration override from environment variable (#47)

* Add environment variable override for config fields

Introduces functions to override configuration fields from environment variables, allowing dynamic configuration without modifying YAML files. The environment variable names are generated from field names, and type conversion is handled for int, float, and tuple fields.

* update chart version to 0.1.4

* Update README.md to enhance environment variable configuration details and improve overall clarity
											
										
										
											2026-01-23 17:34:23 +01:00
To create a custom canary token, visit https://canarytokens.org and generate a “Web bug” canary token.

This optional token is triggered when a crawler fully traverses the webpage until it reaches 0. At that point, a URL is returned. When this URL is requested, it sends an alert to the user via email, including the visitor’s IP address and user agent.

To enable this feature, set the canary token URL [using the environment variable](#configuration-via-environment-variables) `KRAWL_CANARY_TOKEN_URL`.
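
The countdown described above can be pictured with a minimal sketch. This is an illustration only, not Krawl's actual code: the `CanaryCountdown` class, its `tries` parameter, and the `next_link` method are hypothetical names, and the canary URL used in the usage lines is a placeholder. The only name taken from this README is the `KRAWL_CANARY_TOKEN_URL` environment variable.

```python
import os

# The alert URL is read from the environment variable named in this README.
CANARY_URL = os.environ.get("KRAWL_CANARY_TOKEN_URL", "")


class CanaryCountdown:
    """Hypothetical per-visitor counter (an assumption, not Krawl's code)."""

    def __init__(self, tries, canary_url):
        self.remaining = tries        # decoy pages left before the canary is shown
        self.canary_url = canary_url  # empty string means the feature is disabled

    def next_link(self, fake_link):
        """Return a decoy link until the counter reaches 0, then the canary URL."""
        if self.canary_url and self.remaining <= 0:
            return self.canary_url
        self.remaining -= 1
        return fake_link


# Usage: two decoy links are served, then the (placeholder) canary URL.
countdown = CanaryCountdown(tries=2, canary_url="https://example.com/canary")
links = [countdown.next_link("/fake/page") for _ in range(3)]
print(links)
```

Leaving the URL empty keeps the sketch serving decoy links indefinitely, which mirrors the token being optional.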
## Customizing the wordlist
Edit `wordlists.json` to customize fake data for your use case:

```json
{
  "usernames": {
    "prefixes": ["admin", "root", "user"],
    "suffixes": ["_prod", "_dev", "123"]
  },
  "passwords": {
    "prefixes": ["P@ssw0rd", "Admin"],
    "simple": ["test", "password"]
  },
  "directory_listing": {
    "files": ["credentials.txt", "backup.sql"],
    "directories": ["admin/", "backup/"]
  }
}
```

For Helm chart installations, edit **values.yaml** instead.
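
To give a feel for how such a wordlist could turn into fake credentials, here is a minimal sketch. It assumes usernames are built by joining one prefix with one suffix, and passwords are either a `simple` entry or a prefix followed by digits. Both rules and the helper names `fake_username` and `fake_password` are assumptions for illustration; Krawl's own generation logic may differ.

```python
import json
import random

# A subset of the wordlists.json example shown above.
WORDLISTS = json.loads("""
{
  "usernames": {
    "prefixes": ["admin", "root", "user"],
    "suffixes": ["_prod", "_dev", "123"]
  },
  "passwords": {
    "prefixes": ["P@ssw0rd", "Admin"],
    "simple": ["test", "password"]
  }
}
""")


def fake_username(wordlists, rng):
    """Hypothetical helper: join one random prefix with one random suffix."""
    names = wordlists["usernames"]
    return rng.choice(names["prefixes"]) + rng.choice(names["suffixes"])


def fake_password(wordlists, rng):
    """Hypothetical helper: a simple password, or a prefix plus three digits."""
    passwords = wordlists["passwords"]
    if rng.random() < 0.5:
        return rng.choice(passwords["simple"])
    return rng.choice(passwords["prefixes"]) + str(rng.randint(100, 999))


rng = random.Random(0)  # seeded so repeated runs produce the same pairs
for _ in range(3):
    print(fake_username(WORDLISTS, rng), fake_password(WORDLISTS, rng))
```

Adding entries to the `prefixes` or `suffixes` lists widens the set of combinations without touching any code, which is the point of keeping this data in a separate file.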
## Dashboard

Access the dashboard at `http://<server-ip>:<port>/<dashboard-path>`.

The dashboard shows:

- Total and unique accesses
- Suspicious activity and attack detection
- Top IPs, paths, user-agents and GeoIP localization
- Real-time monitoring

The attackers’ access to the honeypot endpoint and related suspicious activities (such as failed login attempts) are logged.
Krawl also implements a scoring system designed to distinguish between malicious and legitimate behavior on the website.
![dashboard-1](img/dashboard-1.png)

The top IP addresses are shown along with the top paths and user agents.

![dashboard-2](img/dashboard-2.png)

![dashboard-3](img/dashboard-3.png)
## 🤝 Contributing

Contributions welcome! Please:

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Submit a pull request (explain the changes!)

<div align="center">

## ⚠️ Disclaimer
**This is a deception/honeypot system.**
Deploy in isolated environments and monitor carefully for security events.
Use responsibly and in compliance with applicable laws and regulations.

## Star History

<img src="https://api.star-history.com/svg?repos=BlessedRebuS/Krawl&type=Date" width="600" alt="Star History Chart" />