
<h1 align="center">Krawl</h1>

<h3 align="center">
  <a name="readme-top"></a>
  <img
    src="img/krawl-svg.svg"
    height="250"
  >
</h3>
<div align="center">

<p align="center">
  A modern, customizable web honeypot server designed to detect and track malicious activity from attackers and web crawlers through deceptive web pages, fake credentials, and canary tokens.
</p>

<div align="center">
  <a href="https://github.com/blessedrebus/krawl/blob/main/LICENSE">
    <img src="https://img.shields.io/github/license/blessedrebus/krawl" alt="License">
  </a>
  <a href="https://github.com/blessedrebus/krawl/releases">
    <img src="https://img.shields.io/github/v/release/blessedrebus/krawl" alt="Release">
  </a>
</div>

<div align="center">
  <a href="https://ghcr.io/blessedrebus/krawl">
    <img src="https://img.shields.io/badge/ghcr.io-krawl-blue" alt="GitHub Container Registry">
  </a>
  <a href="https://kubernetes.io/">
    <img src="https://img.shields.io/badge/kubernetes-ready-326CE5?logo=kubernetes&logoColor=white" alt="Kubernetes">
  </a>
  <a href="https://github.com/BlessedRebuS/Krawl/pkgs/container/krawl-chart">
    <img src="https://img.shields.io/badge/helm-chart-0F1689?logo=helm&logoColor=white" alt="Helm Chart">
  </a>
</div>
</div>

## Table of Contents
- [Demo](#demo)
- [What is Krawl?](#what-is-krawl)
- [Krawl Dashboard](#krawl-dashboard)
- [Installation](#-installation)
  - [Docker Run](#docker-run)
  - [Docker Compose](#docker-compose)
  - [Kubernetes](#kubernetes)
  - [Python + Uvicorn](#python--uvicorn)
- [Configuration](#configuration)
  - [config.yaml](#configuration-via-configyaml)
  - [Environment Variables](#configuration-via-enviromental-variables)
- [Ban Malicious IPs](#use-krawl-to-ban-malicious-ips)
- [IP Reputation](#ip-reputation)
- [Forward Server Header](#forward-server-header)
- [Additional Documentation](#additional-documentation)
- [Contributing](#-contributing)

## Demo
Tip: crawl the `robots.txt` paths for additional fun
### Krawl URL: [http://demo.krawlme.com](http://demo.krawlme.com)
### View the dashboard [http://demo.krawlme.com/das_dashboard](http://demo.krawlme.com/das_dashboard)

## What is Krawl?

**Krawl** is a cloud‑native deception server designed to detect, delay, and analyze attackers, malicious web crawlers, and automated scanners.

It creates realistic fake web applications filled with low‑hanging fruit such as admin panels, configuration files, and exposed fake credentials to attract and identify suspicious activity.

![dashboard](img/deception-page.png)

By wasting attacker resources, Krawl helps clearly distinguish malicious behavior from legitimate crawlers.

It features:

- **Spider Trap Pages**: Infinite random links that waste crawler resources, based on the [spidertrap project](https://github.com/adhdproject/spidertrap)
- **Fake Login Pages**: WordPress, phpMyAdmin, admin panels
- **Honeypot Paths**: Advertised in robots.txt to catch scanners
- **Fake Credentials**: Realistic-looking usernames, passwords, API keys
- **[Canary Token](docs/canary-token.md) Integration**: External alert triggering
- **Random Server Headers**: Confuse attacks that rely on the server header and version
- **Real-time Dashboard**: Monitor suspicious activity
- **Customizable Wordlists**: Easy JSON-based configuration
- **Random Error Injection**: Mimic real server behavior

You can easily expose Krawl alongside your other services to shield them from web crawlers and malicious users using a reverse proxy. For more details, see the [Reverse Proxy documentation](docs/reverse-proxy.md).

![use case](img/use-case.png)

## Krawl Dashboard

Krawl provides a comprehensive dashboard, accessible at a **random secret path** generated at startup or at a **custom path** configured via `KRAWL_DASHBOARD_SECRET_PATH`. This keeps the dashboard hidden from attackers scanning your honeypot.
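
If you prefer to set a custom path yourself, one way to produce a hard-to-guess value is Python's standard `secrets` module. This is a minimal sketch: the helper name `make_dashboard_path` and the 16-byte token size are illustrative choices, not part of Krawl.

```python
import secrets


def make_dashboard_path(num_bytes: int = 16) -> str:
    """Return a random URL path (hypothetical helper, not a Krawl API)."""
    # token_urlsafe only emits URL-safe characters: letters, digits, '-' and '_'.
    return "/" + secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Print a value that could be assigned to KRAWL_DASHBOARD_SECRET_PATH.
    print(make_dashboard_path())
```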

The dashboard is organized into three main tabs:

- **Overview** — High-level view of attack activity: an interactive map of IP origins, recent suspicious requests, and top IPs, User-Agents, and paths.

![geoip](img/geoip_dashboard.png)

- **Attacks** — Detailed breakdown of captured credentials, honeypot triggers, and detected attack types (SQLi, XSS, path traversal, etc.) with charts and tables.

![attack_types](img/attack_types.png)

- **IP Insight** — In-depth forensic view of a selected IP: geolocation, ISP/ASN info, reputation flags, behavioral timeline, attack type distribution, and full access history.

![ipinsight](img/ip_insight_dashboard.png)

For more details, see the [Dashboard documentation](docs/dashboard.md).


## 🚀 Installation

### Docker Run

Run Krawl with the latest image:

```bash
docker run -d \
  -p 5000:5000 \
  -e KRAWL_PORT=5000 \
  -e KRAWL_DELAY=100 \
  -e KRAWL_DASHBOARD_SECRET_PATH="/my-secret-dashboard" \
  -e KRAWL_DASHBOARD_PASSWORD="my-secret-password" \
  -v krawl-data:/app/data \
  --name krawl \
  ghcr.io/blessedrebus/krawl:latest
```

Access the server at `http://localhost:5000`

### Docker Compose

Create a `docker-compose.yaml` file:

```yaml
services:
  krawl:
    image: ghcr.io/blessedrebus/krawl:latest
    container_name: krawl-server
    ports:
      - "5000:5000"
    environment:
      - CONFIG_LOCATION=config.yaml
      - TZ=Europe/Rome
      # - KRAWL_DASHBOARD_PASSWORD=my-secret-password
    volumes:
      - ./config.yaml:/app/config.yaml:ro
      # bind mount for firewall exporters
      - ./exports:/app/exports
      - krawl-data:/app/data
    restart: unless-stopped

volumes:
  krawl-data:
```

Run with:

```bash
docker-compose up -d
```

Stop with:

```bash
docker-compose down
```

### Kubernetes
**Krawl can also be deployed natively on Kubernetes**, either [via manifest](kubernetes/README.md) or [using the Helm chart](helm/README.md).

### Python + Uvicorn

Run Krawl directly with Python (version 3.13 suggested) and uvicorn for local development or testing:

```bash
pip install -r requirements.txt
uvicorn app:app --host 0.0.0.0 --port 5000 --app-dir src
```

Access the server at `http://localhost:5000`


## Configuration
Krawl uses a **configuration hierarchy** in which **environment variables take precedence over the configuration file**. Environment variables are the recommended option for Docker deployments and quick out-of-the-box customization.
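
The precedence rule can be pictured with a short Python sketch. It is not Krawl's actual loader: the function `resolve_setting`, the flat `file_config` dictionary, and the `KRAWL_` + upper-cased-name mapping are assumptions made for illustration, and type conversion of environment values is left out.

```python
import os


def resolve_setting(name, file_config, default=None, environ=None):
    """Return the environment value if set, else the file value, else default.

    Hypothetical helper: assumes a setting called `name` maps to the
    environment variable KRAWL_<NAME>.
    """
    environ = os.environ if environ is None else environ
    env_key = "KRAWL_" + name.upper()
    if env_key in environ:
        # Environment variables win over the configuration file.
        return environ[env_key]
    return file_config.get(name, default)


if __name__ == "__main__":
    file_config = {"port": 5000, "delay": 100}  # stand-in for values read from config.yaml
    env = {"KRAWL_DELAY": "250"}                # stand-in for the process environment
    print(resolve_setting("delay", file_config, environ=env))  # 250 (string from the environment)
    print(resolve_setting("port", file_config, environ=env))   # 5000 (from the file)
```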

### Configuration via config.yaml
You can use the [config.yaml](config.yaml) file for advanced configurations, such as Docker Compose or Helm chart deployments.

<a name="configuration-via-enviromental-variables"></a>

### Configuration via Environment Variables

| Environment Variable | Description | Default |
|----------------------|-------------|---------|
| `CONFIG_LOCATION` | Path to the YAML config file | `config.yaml` |
| `KRAWL_PORT` | Server listening port | `5000` |
| `KRAWL_DELAY` | Response delay in milliseconds | `100` |
| `KRAWL_SERVER_HEADER` | HTTP Server header for deception | `""` |
| `KRAWL_LINKS_LENGTH_RANGE` | Link length range as `min,max` | `5,15` |
| `KRAWL_LINKS_PER_PAGE_RANGE` | Links per page as `min,max` | `10,15` |
| `KRAWL_CHAR_SPACE` | Characters used for link generation | `abcdefgh...` |
| `KRAWL_MAX_COUNTER` | Initial counter value | `10` |
| `KRAWL_CANARY_TOKEN_URL` | External canary token URL | None |
| `KRAWL_CANARY_TOKEN_TRIES` | Requests before showing canary token | `10` |
| `KRAWL_DASHBOARD_SECRET_PATH` | Custom dashboard path | Auto-generated |
| `KRAWL_DASHBOARD_PASSWORD` | Password for protected dashboard panels | Auto-generated |
| `KRAWL_PROBABILITY_ERROR_CODES` | Error response probability (0-100%) | `0` |
| `KRAWL_DATABASE_PATH` | Database file location | `data/krawl.db` |
| `KRAWL_EXPORTS_PATH` | Path where firewall rule sets are exported | `exports` |
| `KRAWL_BACKUPS_PATH` | Path where database dumps are saved | `backups` |
| `KRAWL_BACKUPS_CRON` | Cron expression that controls the backup job schedule | `*/30 * * * *` |
| `KRAWL_BACKUPS_ENABLED` | Boolean that enables the database dump job | `true` |
| `KRAWL_DATABASE_RETENTION_DAYS` | Days to retain data in database | `30` |
| `KRAWL_HTTP_RISKY_METHODS_THRESHOLD` | Threshold for risky HTTP methods detection | `0.1` |
| `KRAWL_VIOLATED_ROBOTS_THRESHOLD` | Threshold for robots.txt violations | `0.1` |
| `KRAWL_UNEVEN_REQUEST_TIMING_THRESHOLD` | Coefficient of variation threshold for timing | `0.5` |
| `KRAWL_UNEVEN_REQUEST_TIMING_TIME_WINDOW_SECONDS` | Time window for request timing analysis in seconds | `300` |
| `KRAWL_USER_AGENTS_USED_THRESHOLD` | Threshold for detecting multiple user agents | `2` |
| `KRAWL_ATTACK_URLS_THRESHOLD` | Threshold for attack URL detection | `1` |
| `KRAWL_INFINITE_PAGES_FOR_MALICIOUS` | Serve infinite pages to malicious IPs | `true` |
| `KRAWL_MAX_PAGES_LIMIT` | Maximum page limit for crawlers | `250` |
| `KRAWL_BAN_DURATION_SECONDS` | Ban duration in seconds for rate-limited IPs | `600` |

For example:

```bash
# Set config file location
export CONFIG_LOCATION="config.yaml"

# Set canary token
export KRAWL_CANARY_TOKEN_URL="http://your-canary-token-url"

# Set links-per-page range (min,max format)
export KRAWL_LINKS_PER_PAGE_RANGE="5,25"

# Set analyzer thresholds
export KRAWL_HTTP_RISKY_METHODS_THRESHOLD="0.2"
export KRAWL_VIOLATED_ROBOTS_THRESHOLD="0.15"

# Set custom dashboard path and password
export KRAWL_DASHBOARD_SECRET_PATH="/my-secret-dashboard"
export KRAWL_DASHBOARD_PASSWORD="my-secret-password"
```

Example of a Docker run with env variables:

```bash
docker run -d \
  -p 5000:5000 \
  -e KRAWL_PORT=5000 \
  -e KRAWL_DELAY=100 \
  -e KRAWL_DASHBOARD_PASSWORD="my-secret-password" \
  -e KRAWL_CANARY_TOKEN_URL="http://your-canary-token-url" \
  --name krawl \
  ghcr.io/blessedrebus/krawl:latest
```
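
To give a feel for how the `min,max` range settings could shape spider trap pages, here is a hedged Python sketch. It does not reproduce Krawl's generator: `parse_range` and `random_links` are hypothetical helpers, and the way the ranges are combined is an assumption based only on the variable descriptions in the table above.

```python
import random
import string


def parse_range(value: str) -> tuple[int, int]:
    """Parse a 'min,max' string such as '5,15' into a pair of ints."""
    low, high = (int(part.strip()) for part in value.split(","))
    if low > high:
        raise ValueError(f"min must not exceed max: {value!r}")
    return low, high


def random_links(length_range: str, per_page_range: str,
                 char_space: str, rng: random.Random) -> list[str]:
    """Build one page's worth of random link paths (illustrative only)."""
    min_len, max_len = parse_range(length_range)
    min_links, max_links = parse_range(per_page_range)
    count = rng.randint(min_links, max_links)  # how many links on this page
    return [
        "/" + "".join(rng.choice(char_space)
                      for _ in range(rng.randint(min_len, max_len)))
        for _ in range(count)
    ]


if __name__ == "__main__":
    # Values mirror the documented defaults for the two range variables.
    for link in random_links("5,15", "10,15", string.ascii_lowercase, random.Random()):
        print(link)
```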

## Use Krawl to Ban Malicious IPs
Krawl uses a reputation-based system to classify attacker IP addresses. Every five minutes, Krawl exports the identified malicious IPs to a `malicious_ips.txt` file.

This file can either be mounted from the Docker container into another system or downloaded directly via `curl`:

```bash
curl https://your-krawl-instance/<DASHBOARD-PATH>/api/download/malicious_ips.txt
```

This file enables automatic blocking of malicious traffic across various platforms. You can use it to update firewall rules on:
* [OPNsense and pfSense](https://www.allthingstech.ch/using-opnsense-and-ip-blocklists-to-block-malicious-traffic)
* [RouterOS](https://rentry.co/krawl-routeros)
* [iptables](plugins/iptables/README.md) and [nftables](plugins/nftables/README.md)
* [Fail2Ban](plugins/fail2ban/README.md)
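
Before feeding the list to a firewall, it can be worth validating it. The following Python sketch assumes the file contains one IP address per line (the exact format is not specified here) and uses only the standard `ipaddress` module; `parse_ip_list` is a hypothetical helper name.

```python
import ipaddress


def parse_ip_list(text: str) -> list[str]:
    """Return the valid, de-duplicated IP addresses found in text, in order."""
    seen = set()
    result = []
    for line in text.splitlines():
        candidate = line.strip()
        if not candidate or candidate.startswith("#"):
            continue  # ignore blank lines and comment lines
        try:
            ip = str(ipaddress.ip_address(candidate))
        except ValueError:
            continue  # skip anything that is not a single IPv4/IPv6 address
        if ip not in seen:
            seen.add(ip)
            result.append(ip)
    return result


if __name__ == "__main__":
    sample = "203.0.113.7\n\nnot-an-ip\n2001:db8::1\n203.0.113.7\n"
    print(parse_ip_list(sample))  # ['203.0.113.7', '2001:db8::1']
```

Dropping invalid lines rather than failing keeps a single malformed entry from blocking a firewall update; a stricter script could raise an error instead.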

## IP Reputation
Krawl runs [periodic tasks](src/tasks/analyze_ips.py) that analyze recent traffic to build and continuously update an IP reputation score. Each run evaluates every active IP address against multiple behavioral indicators and classifies it as an attacker, crawler, or regular user. Thresholds are fully customizable.

![ip reputation](img/ip-reputation.png)

The analysis includes:
- **Risky HTTP methods usage** (e.g. POST, PUT, DELETE ratios)
- **Robots.txt violations**
- **Request timing anomalies** (bursty or irregular patterns)
- **User-Agent consistency**
- **Attack URL detection** (e.g. SQL injection, XSS patterns)

Each signal contributes to a weighted scoring model that assigns a reputation category:
- `attacker`
- `bad_crawler`
- `good_crawler`
- `regular_user`
- `unknown` (for insufficient data)

The resulting scores and metrics are stored in the database and used by Krawl to drive dashboards, reputation tracking, and automated mitigation actions such as IP banning or firewall integration.
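
As a rough illustration of threshold-based signals of this kind, the Python sketch below evaluates three of them for a single IP. It is not the logic in `analyze_ips.py`: the request shape, the signal names, the comparison directions, and the omission of weighting are all assumptions; only the default threshold values are taken from the configuration table.

```python
from statistics import mean, pstdev


def timing_cv(timestamps: list[float]) -> float:
    """Coefficient of variation (std / mean) of the gaps between requests."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if len(gaps) < 2 or mean(gaps) == 0:
        return 0.0  # not enough data to judge timing
    return pstdev(gaps) / mean(gaps)


def triggered_signals(requests: list[dict],
                      risky_threshold: float = 0.1,
                      timing_threshold: float = 0.5,
                      user_agents_threshold: int = 2) -> set[str]:
    """requests: dicts with 'time', 'method', 'user_agent' keys (hypothetical shape)."""
    signals = set()
    # Share of requests that use a risky HTTP method.
    risky = sum(r["method"] in {"POST", "PUT", "DELETE"} for r in requests)
    if requests and risky / len(requests) > risky_threshold:
        signals.add("risky_http_methods")
    # Number of distinct User-Agent strings seen from this IP.
    if len({r["user_agent"] for r in requests}) >= user_agents_threshold:
        signals.add("multiple_user_agents")
    # Bursty or irregular spacing between requests.
    if timing_cv(sorted(r["time"] for r in requests)) > timing_threshold:
        signals.add("uneven_timing")
    return signals
```

A scoring model would then weight the triggered signals to pick one of the categories listed above; that step is deliberately left out because the weights are not documented here.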

## Forward Server Header
If Krawl is deployed behind a proxy such as NGINX, the **server header** should be forwarded using the following configuration in your proxy:

```nginx
location / {
    proxy_pass https://your-krawl-instance;
    proxy_pass_header Server;
}
```

## Additional Documentation

| Topic | Description |
|-------|-------------|
| [API](docs/api.md) | External APIs used by Krawl for IP data, reputation, and geolocation |
| [Honeypot](docs/honeypot.md) | Full overview of honeypot pages: fake logins, directory listings, credential files, SQLi/XSS/XXE/command injection traps, and more |
| [Reverse Proxy](docs/reverse-proxy.md) | How to deploy Krawl behind NGINX or use decoy subdomains |
| [Database Backups](docs/backups.md) | Enable and configure the automatic database dump job |
| [Canary Token](docs/canary-token.md) | Set up external alert triggers via canarytokens.org |
| [Wordlist](docs/wordlist.md) | Customize fake usernames, passwords, and directory listings |
| [Dashboard](docs/dashboard.md) | Access and explore the real-time monitoring dashboard |

## 🤝 Contributing

Contributions welcome! Please:
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Submit a pull request (explain the changes!)


## Disclaimer
> [!CAUTION]
> This is a deception/honeypot system. Deploy in isolated environments and monitor carefully for security events. Use responsibly and in compliance with applicable laws and regulations.

## Star History
<img src="https://api.star-history.com/svg?repos=BlessedRebuS/Krawl&type=Date" width="600" alt="Star History Chart" />
-												Feat/deployment update (#56)

* feat: update analyzer thresholds and add crawl configuration options

* feat: update Helm chart version and add README for installation instructions

* feat: update installation instructions in README and add Docker support

* feat: update deployment manifests and configuration for improved service handling and analyzer settings

* feat: add API endpoint for paginated IP retrieval and enhance dashboard visualization with category filters

* feat: update configuration for Krawl service to use external config file

* feat: refactor code for improved readability and consistency across multiple files

* feat: remove Flake8, Pylint, and test steps from PR checks workflow
											
										
										
											2026-01-26 12:36:22 +01:00
+								  --name krawl \
 								  ghcr.io/blessedrebus/krawl:latest
-												Configuration override from environment variable (#47)

* Add environment variable override for config fields

Introduces functions to override configuration fields from environment variables, allowing dynamic configuration without modifying YAML files. The environment variable names are generated from field names, and type conversion is handled for int, float, and tuple fields.

* update chart version to 0.1.4

* Update README.md to enhance environment variable configuration details and improve overall clarity
											
										
										
											2026-01-23 17:34:23 +01:00
+								```
-												Feat/deployment update (#56)

* feat: update analyzer thresholds and add crawl configuration options

* feat: update Helm chart version and add README for installation instructions

* feat: update installation instructions in README and add Docker support

* feat: update deployment manifests and configuration for improved service handling and analyzer settings

* feat: add API endpoint for paginated IP retrieval and enhance dashboard visualization with category filters

* feat: update configuration for Krawl service to use external config file

* feat: refactor code for improved readability and consistency across multiple files

* feat: remove Flake8, Pylint, and test steps from PR checks workflow
											
										
										
											2026-01-26 12:36:22 +01:00
+								Access the server at `http://localhost:5000`
-												Configuration override from environment variable (#47)

* Add environment variable override for config fields

Introduces functions to override configuration fields from environment variables, allowing dynamic configuration without modifying YAML files. The environment variable names are generated from field names, and type conversion is handled for int, float, and tuple fields.

* update chart version to 0.1.4

* Update README.md to enhance environment variable configuration details and improve overall clarity
											
										
										
											2026-01-23 17:34:23 +01:00
-												Feat/deployment update (#56)

* feat: update analyzer thresholds and add crawl configuration options

* feat: update Helm chart version and add README for installation instructions

* feat: update installation instructions in README and add Docker support

* feat: update deployment manifests and configuration for improved service handling and analyzer settings

* feat: add API endpoint for paginated IP retrieval and enhance dashboard visualization with category filters

* feat: update configuration for Krawl service to use external config file

* feat: refactor code for improved readability and consistency across multiple files

* feat: remove Flake8, Pylint, and test steps from PR checks workflow
											
										
										
### Docker Compose
Create a `docker-compose.yaml` file:

```yaml
services:
  krawl:
    image: ghcr.io/blessedrebus/krawl:latest
    container_name: krawl-server
    ports:
      - "5000:5000"
    environment:
      - CONFIG_LOCATION=config.yaml
      - TZ=Europe/Rome
      # - KRAWL_DASHBOARD_PASSWORD=my-secret-password
    volumes:
      - ./config.yaml:/app/config.yaml:ro
      # bind mount for firewall exporters
      - ./exports:/app/exports
      - krawl-data:/app/data
    restart: unless-stopped

volumes:
  krawl-data:
```

Run with:

```bash
docker-compose up -d
```

Stop with:

```bash
docker-compose down
```

### Kubernetes
**Krawl is also available natively on Kubernetes**. Installation can be done either [via manifest](kubernetes/README.md) or [using the Helm chart](helm/README.md).

### Python + Uvicorn
Run Krawl directly with Python (version 3.13 suggested) and uvicorn for local development or testing:

```bash
pip install -r requirements.txt
uvicorn app:app --host 0.0.0.0 --port 5000 --app-dir src
```

Access the server at `http://localhost:5000`

## Configuration
Krawl uses a **configuration hierarchy** in which **environment variables take precedence over the configuration file**. Environment variables are the recommended approach for Docker deployments and quick out-of-the-box customization.

### Configuration via config.yaml
You can use the [config.yaml](config.yaml) file for advanced configurations, such as Docker Compose or Helm chart deployments.
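
The precedence rule (an environment variable, when set, wins over the value from `config.yaml`) can be sketched roughly as follows. This is an illustration only, not Krawl's actual loader: the `resolve` helper, the `FILE_CONFIG` dictionary, and the `KRAWL_<FIELD>` naming derived from lowercase field names are assumptions made for the example.

```python
import os
from typing import Any, Mapping

# Hypothetical stand-in for values that would be loaded from config.yaml.
FILE_CONFIG = {"port": 5000, "delay": 100, "server_header": ""}


def resolve(field: str, file_config: Mapping[str, Any],
            env: Mapping[str, str] = os.environ) -> Any:
    """Return the environment override for `field` if present, else the file value."""
    default = file_config.get(field)
    raw = env.get(f"KRAWL_{field.upper()}")  # assumed naming scheme
    if raw is None:
        return default
    # Environment values are strings, so cast them to the file value's type.
    if isinstance(default, bool):
        return raw.strip().lower() in {"1", "true", "yes"}
    if isinstance(default, (int, float)):
        return type(default)(raw)
    return raw


print(resolve("port", FILE_CONFIG, {"KRAWL_PORT": "8080"}))  # override from the environment
print(resolve("delay", FILE_CONFIG, {}))                      # falls back to the file value
```

Passing the environment as a plain mapping keeps the sketch easy to try without touching the real process environment.
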
### Configuration via Enviromental Variables
| Environment Variable | Description | Default |
|----------------------|-------------|---------|
| `CONFIG_LOCATION` | Path to the YAML config file | `config.yaml` |
| `KRAWL_PORT` | Server listening port | `5000` |
| `KRAWL_DELAY` | Response delay in milliseconds | `100` |
| `KRAWL_SERVER_HEADER` | HTTP Server header for deception | `""` |
| `KRAWL_LINKS_LENGTH_RANGE` | Link length range as `min,max` | `5,15` |
| `KRAWL_LINKS_PER_PAGE_RANGE` | Links per page as `min,max` | `10,15` |
| `KRAWL_CHAR_SPACE` | Characters used for link generation | `abcdefgh...` |
| `KRAWL_MAX_COUNTER` | Initial counter value | `10` |
| `KRAWL_CANARY_TOKEN_URL` | External canary token URL | None |
| `KRAWL_CANARY_TOKEN_TRIES` | Requests before showing canary token | `10` |
| `KRAWL_DASHBOARD_SECRET_PATH` | Custom dashboard path | Auto-generated |
| `KRAWL_DASHBOARD_PASSWORD` | Password for protected dashboard panels | Auto-generated |
| `KRAWL_PROBABILITY_ERROR_CODES` | Error response probability (0-100%) | `0` |
| `KRAWL_DATABASE_PATH` | Database file location | `data/krawl.db` |
| `KRAWL_EXPORTS_PATH` | Path where firewall rule sets are exported | `exports` |
| `KRAWL_BACKUPS_PATH` | Path where database dumps are saved | `backups` |
| `KRAWL_BACKUPS_CRON` | Cron expression controlling the backup job schedule | `*/30 * * * *` |
| `KRAWL_BACKUPS_ENABLED` | Enables the database dump job | `true` |
| `KRAWL_DATABASE_RETENTION_DAYS` | Days to retain data in database | `30` |
| `KRAWL_HTTP_RISKY_METHODS_THRESHOLD` | Threshold for risky HTTP methods detection | `0.1` |
| `KRAWL_VIOLATED_ROBOTS_THRESHOLD` | Threshold for robots.txt violations | `0.1` |
| `KRAWL_UNEVEN_REQUEST_TIMING_THRESHOLD` | Coefficient of variation threshold for timing | `0.5` |
| `KRAWL_UNEVEN_REQUEST_TIMING_TIME_WINDOW_SECONDS` | Time window for request timing analysis in seconds | `300` |
| `KRAWL_USER_AGENTS_USED_THRESHOLD` | Threshold for detecting multiple user agents | `2` |
| `KRAWL_ATTACK_URLS_THRESHOLD` | Threshold for attack URL detection | `1` |
| `KRAWL_INFINITE_PAGES_FOR_MALICIOUS` | Serve infinite pages to malicious IPs | `true` |
| `KRAWL_MAX_PAGES_LIMIT` | Maximum page limit for crawlers | `250` |
| `KRAWL_BAN_DURATION_SECONDS` | Ban duration in seconds for rate-limited IPs | `600` |

 								For example
-												Configuration override from environment variable (#47)

* Add environment variable override for config fields

Introduces functions to override configuration fields from environment variables, allowing dynamic configuration without modifying YAML files. The environment variable names are generated from field names, and type conversion is handled for int, float, and tuple fields.

* update chart version to 0.1.4

* Update README.md to enhance environment variable configuration details and improve overall clarity
											
										
										
											2026-01-23 17:34:23 +01:00
 								```bash
 								# Set canary token
-												added configuration variable documentation and filename documentation

											
										
										
											2026-02-02 14:54:36 +01:00
+								export CONFIG_LOCATION="config.yaml"
-												Configuration override from environment variable (#47)

* Add environment variable override for config fields

Introduces functions to override configuration fields from environment variables, allowing dynamic configuration without modifying YAML files. The environment variable names are generated from field names, and type conversion is handled for int, float, and tuple fields.

* update chart version to 0.1.4

* Update README.md to enhance environment variable configuration details and improve overall clarity
											
										
										
											2026-01-23 17:34:23 +01:00
+								export KRAWL_CANARY_TOKEN_URL="http://your-canary-token-url"
-												Doc/updated documentation (#60)

* added documentation, updated repo pointer in the dashboard, added dashboard link highlighting and mionor fixes

* added doc

* added logo to dashboard

* Fixed dashboard attack chart

* Enhance fake data generation with varied request counts for better visualization

* Add automatic migrations and support for latitude/longitude in IP stats

* Update Helm chart version to 0.2.2 and add timezone configuration option

---------

Co-authored-by: BlessedRebuS <patrick.difa@gmail.com>
											
										
										
											2026-01-29 11:55:06 +01:00
+								# Set number of pages range (min,max format)
-												Configuration override from environment variable (#47)

* Add environment variable override for config fields

Introduces functions to override configuration fields from environment variables, allowing dynamic configuration without modifying YAML files. The environment variable names are generated from field names, and type conversion is handled for int, float, and tuple fields.

* update chart version to 0.1.4

* Update README.md to enhance environment variable configuration details and improve overall clarity
											
										
										
											2026-01-23 17:34:23 +01:00
+								export KRAWL_LINKS_PER_PAGE_RANGE="5,25"
 								# Set analyzer thresholds
 								export KRAWL_HTTP_RISKY_METHODS_THRESHOLD="0.2"
 								export KRAWL_VIOLATED_ROBOTS_THRESHOLD="0.15"
-												feat: add dashboard password configuration to README and docker-compose

											
										
										
											2026-03-06 22:40:16 +01:00
+								# Set custom dashboard path and password
-												Configuration override from environment variable (#47)

* Add environment variable override for config fields

Introduces functions to override configuration fields from environment variables, allowing dynamic configuration without modifying YAML files. The environment variable names are generated from field names, and type conversion is handled for int, float, and tuple fields.

* update chart version to 0.1.4

* Update README.md to enhance environment variable configuration details and improve overall clarity
											
										
										
											2026-01-23 17:34:23 +01:00
+								export KRAWL_DASHBOARD_SECRET_PATH="/my-secret-dashboard"
-												feat: add dashboard password configuration to README and docker-compose

											
										
										
											2026-03-06 22:40:16 +01:00
+								export KRAWL_DASHBOARD_PASSWORD="my-secret-password"
-												Configuration override from environment variable (#47)

* Add environment variable override for config fields

Introduces functions to override configuration fields from environment variables, allowing dynamic configuration without modifying YAML files. The environment variable names are generated from field names, and type conversion is handled for int, float, and tuple fields.

* update chart version to 0.1.4

* Update README.md to enhance environment variable configuration details and improve overall clarity
											
										
										
											2026-01-23 17:34:23 +01:00
+								```
-												Doc/updated documentation (#60)

* added documentation, updated repo pointer in the dashboard, added dashboard link highlighting and mionor fixes

* added doc

* added logo to dashboard

* Fixed dashboard attack chart

* Enhance fake data generation with varied request counts for better visualization

* Add automatic migrations and support for latitude/longitude in IP stats

* Update Helm chart version to 0.2.2 and add timezone configuration option

---------

Co-authored-by: BlessedRebuS <patrick.difa@gmail.com>
											
										
										
											2026-01-29 11:55:06 +01:00
+								Example of a Docker run with env variables:
-												Configuration override from environment variable (#47)

* Add environment variable override for config fields

Introduces functions to override configuration fields from environment variables, allowing dynamic configuration without modifying YAML files. The environment variable names are generated from field names, and type conversion is handled for int, float, and tuple fields.

* update chart version to 0.1.4

* Update README.md to enhance environment variable configuration details and improve overall clarity
											
										
										
											2026-01-23 17:34:23 +01:00
```bash
docker run -d \
  -p 5000:5000 \
  -e KRAWL_PORT=5000 \
  -e KRAWL_DELAY=100 \
  -e KRAWL_DASHBOARD_PASSWORD="my-secret-password" \
  -e KRAWL_CANARY_TOKEN_URL="http://your-canary-token-url" \
  --name krawl \
  ghcr.io/blessedrebus/krawl:latest
```
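
To illustrate how `KRAWL_*` variables like the ones above could map onto configuration fields, here is a minimal Python sketch. It is not Krawl's actual loader: the `Config` dataclass, its defaults, and `apply_env_overrides` are hypothetical names chosen for this example, and the only assumption taken from the command above is the `KRAWL_` prefix followed by the upper-cased field name.

```python
import os
from dataclasses import dataclass, fields


@dataclass
class Config:
    # Hypothetical subset of settings; defaults are placeholders, not Krawl's.
    port: int = 5000
    delay: int = 100
    dashboard_password: str = ""
    canary_token_url: str = ""


def apply_env_overrides(config, environ=None, prefix="KRAWL_"):
    """Override fields from variables named <prefix><FIELD_NAME_UPPERCASED>."""
    environ = os.environ if environ is None else environ
    for field in fields(config):
        raw = environ.get(prefix + field.name.upper())
        if raw is None:
            continue  # variable not set: keep the existing value
        current = getattr(config, field.name)
        # Convert the string from the environment to the type of the current
        # value (int or str in this sketch).
        setattr(config, field.name, type(current)(raw))
    return config


cfg = apply_env_overrides(Config(), {"KRAWL_PORT": "8080", "KRAWL_DELAY": "250"})
print(cfg.port, cfg.delay)
```

Passing a plain dictionary instead of `os.environ` keeps the sketch easy to try without touching the real environment.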
## Use Krawl to Ban Malicious IPs
Krawl uses a reputation-based system to classify attacker IP addresses. Every five minutes, Krawl exports the identified malicious IPs to a `malicious_ips.txt` file.

This file can either be mounted from the Docker container into another system or downloaded directly via `curl`:
```bash
curl https://your-krawl-instance/<DASHBOARD-PATH>/api/download/malicious_ips.txt
```
The file lets you block malicious traffic automatically across various platforms. You can use it to update firewall rules on:
* [OPNsense and pfSense](https://www.allthingstech.ch/using-opnsense-and-ip-blocklists-to-block-malicious-traffic)
* [RouterOS](https://rentry.co/krawl-routeros)
* [iptables](plugins/iptables/README.md) and [nftables](plugins/nftables/README.md)
* [Fail2Ban](plugins/fail2ban/README.md)
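
If you want to consume the downloaded list from your own tooling, a sketch along these lines may help. It assumes the file contains one IP address per line, which is not confirmed here; the function names `parse_blocklist` and `iptables_rules` are hypothetical. The bundled plugins linked above remain the supported integration path.

```python
import ipaddress


def parse_blocklist(text):
    """Return the unique, valid IP addresses in first-seen order.

    Assumes one address per line; blank lines, '#' comments, and malformed
    entries are skipped.
    """
    seen = set()
    result = []
    for line in text.splitlines():
        candidate = line.strip()
        if not candidate or candidate.startswith("#"):
            continue
        try:
            ip = str(ipaddress.ip_address(candidate))
        except ValueError:
            continue  # not a valid IPv4/IPv6 address
        if ip not in seen:
            seen.add(ip)
            result.append(ip)
    return result


def iptables_rules(ips):
    """Render one DROP rule per address (ip6tables for IPv6 addresses)."""
    return [
        f"{'ip6tables' if ':' in ip else 'iptables'} -I INPUT -s {ip} -j DROP"
        for ip in ips
    ]


sample = "203.0.113.7\n\nnot-an-ip\n203.0.113.7\n2001:db8::1\n"
for rule in iptables_rules(parse_blocklist(sample)):
    print(rule)
```

The sketch only prints the commands, so you can review them before applying anything to a live firewall.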
## IP Reputation
Krawl uses [tasks that analyze recent traffic](src/tasks/analyze_ips.py) to build and continuously update an IP reputation score. The analysis runs periodically and evaluates each active IP address against multiple behavioral indicators to classify it as an attacker, crawler, or regular user. Thresholds are fully customizable.

![ip reputation](img/ip-reputation.png)

The analysis includes:
- **Risky HTTP methods usage** (e.g. POST, PUT, DELETE ratios)
- **Robots.txt violations**
- **Request timing anomalies** (bursty or irregular patterns)
- **User-Agent consistency**
- **Attack URL detection** (e.g. SQL injection, XSS patterns)

Each signal contributes to a weighted scoring model that assigns a reputation category:
- `attacker`
- `bad_crawler`
- `good_crawler`
- `regular_user`
- `unknown` (for insufficient data)

The resulting scores and metrics are stored in the database and drive Krawl's dashboards, reputation tracking, and automated mitigation actions such as IP banning or firewall integration.
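
As a rough illustration of how a weighted scoring model of this kind can assign a category, consider the sketch below. It is not the logic in `src/tasks/analyze_ips.py`: the signal names, weights, and the `MIN_REQUESTS` threshold are all hypothetical values invented for the example. Only the category names come from the list above.

```python
# Hypothetical weights per category; each signal is expected in the range 0..1.
WEIGHTS = {
    "attacker": {"risky_methods": 0.3, "robots_violations": 0.2, "attack_urls": 0.5},
    "bad_crawler": {"robots_violations": 0.6, "burst_timing": 0.4},
    "good_crawler": {"robots_compliance": 0.7, "stable_user_agent": 0.3},
    "regular_user": {"stable_user_agent": 0.5, "human_timing": 0.5},
}

# Hypothetical minimum number of requests before an IP is classified at all.
MIN_REQUESTS = 5


def classify(signals, request_count):
    """Return the category with the highest weighted score.

    IPs with fewer than MIN_REQUESTS requests are reported as 'unknown'.
    Missing signals count as 0.
    """
    if request_count < MIN_REQUESTS:
        return "unknown"
    scores = {
        category: sum(weight * signals.get(name, 0.0) for name, weight in weights.items())
        for category, weights in WEIGHTS.items()
    }
    return max(scores, key=scores.get)


probe = {"risky_methods": 0.9, "robots_violations": 1.0, "attack_urls": 1.0}
print(classify(probe, request_count=50))
```

Keeping the weights in a single table makes it straightforward to tune thresholds without touching the classification logic.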
## Forward Server Header
If Krawl is deployed behind a proxy such as NGINX, the **Server header** should be forwarded using the following configuration in your proxy:
```nginx
location / {
    proxy_pass https://your-krawl-instance;
    proxy_pass_header Server;
}
```

## Additional Documentation
| Topic | Description |
|-------|-------------|
| [API](docs/api.md) | External APIs used by Krawl for IP data, reputation, and geolocation |
| [Honeypot](docs/honeypot.md) | Full overview of honeypot pages: fake logins, directory listings, credential files, SQLi/XSS/XXE/command injection traps, and more |
| [Reverse Proxy](docs/reverse-proxy.md) | How to deploy Krawl behind NGINX or use decoy subdomains |
| [Database Backups](docs/backups.md) | Enable and configure the automatic database dump job |
| [Canary Token](docs/canary-token.md) | Set up external alert triggers via canarytokens.org |
| [Wordlist](docs/wordlist.md) | Customize fake usernames, passwords, and directory listings |
| [Dashboard](docs/dashboard.md) | Access and explore the real-time monitoring dashboard |

## 🤝 Contributing
Contributions welcome! Please:
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Submit a pull request (explain the changes!)

## Disclaimer
> [!CAUTION]
> This is a deception/honeypot system. Deploy in isolated environments and monitor carefully for security events. Use responsibly and in compliance with applicable laws and regulations.

## Star History
<img src="https://api.star-history.com/svg?repos=BlessedRebuS/Krawl&type=Date" width="600" alt="Star History Chart" />