Configuration override from environment variable (#47)
* Add environment variable override for config fields

  Introduces functions to override configuration fields from environment variables, allowing dynamic configuration without modifying YAML files. The environment variable names are generated from field names, and type conversion is handled for int, float, and tuple fields.

* update chart version to 0.1.4

* Update README.md to enhance environment variable configuration details and improve overall clarity
Committed by GitHub
Parent: e1444e44ee
Commit: 223883a781
694
README.md
@@ -1,323 +1,371 @@
 <h1 align="center">🕷️ Krawl</h1>

 <h3 align="center">
 <a name="readme-top"></a>
 <img
 src="img/krawl-logo.jpg"
 height="200"
 >
 </h3>
 <div align="center">

 <p align="center">
 A modern, customizable zero-dependencies honeypot server designed to detect and track malicious activity through deceptive web pages, fake credentials, and canary tokens.
 </p>

 <div align="center">
 <a href="https://github.com/blessedrebus/krawl/blob/main/LICENSE">
 <img src="https://img.shields.io/github/license/blessedrebus/krawl" alt="License">
 </a>
 <a href="https://github.com/blessedrebus/krawl/releases">
 <img src="https://img.shields.io/github/v/release/blessedrebus/krawl" alt="Release">
 </a>
 </div>

 <div align="center">
 <a href="https://ghcr.io/blessedrebus/krawl">
 <img src="https://img.shields.io/badge/ghcr.io-krawl-blue" alt="GitHub Container Registry">
 </a>
 <a href="https://kubernetes.io/">
 <img src="https://img.shields.io/badge/kubernetes-ready-326CE5?logo=kubernetes&logoColor=white" alt="Kubernetes">
 </a>
 <a href="https://github.com/BlessedRebuS/Krawl/pkgs/container/krawl-chart">
 <img src="https://img.shields.io/badge/helm-chart-0F1689?logo=helm&logoColor=white" alt="Helm Chart">
 </a>
 </div>

 <br>

 <p align="center">
 <a href="#what-is-krawl">What is Krawl?</a> •
 <a href="#-quick-start">Quick Start</a> •
 <a href="#honeypot-pages">Honeypot Pages</a> •
 <a href="#dashboard">Dashboard</a> •
 <a href="./ToDo.md">Todo</a> •
 <a href="#-contributing">Contributing</a>
 </p>

 <br>
 </div>

 ## Demo
 Tip: crawl the `robots.txt` paths for additional fun
 ### Krawl URL: [http://demo.krawlme.com](http://demo.krawlme.com)
 ### View the dashboard [http://demo.krawlme.com/das_dashboard](http://demo.krawlme.com/das_dashboard)

 ## What is Krawl?

 **Krawl** is a cloud‑native deception server designed to detect, delay, and analyze malicious web crawlers and automated scanners.

 It creates realistic fake web applications filled with low‑hanging fruit such as admin panels, configuration files, and exposed fake credentials to attract and identify suspicious activity.

 By wasting attacker resources, Krawl helps clearly distinguish malicious behavior from legitimate crawlers.

 It features:

 - **Spider Trap Pages**: Infinite random links to waste crawler resources based on the [spidertrap project](https://github.com/adhdproject/spidertrap)
 - **Fake Login Pages**: WordPress, phpMyAdmin, admin panels
 - **Honeypot Paths**: Advertised in robots.txt to catch scanners
 - **Fake Credentials**: Realistic-looking usernames, passwords, API keys
 - **[Canary Token](#customizing-the-canary-token) Integration**: External alert triggering
 - **Real-time Dashboard**: Monitor suspicious activity
 - **Customizable Wordlists**: Easy JSON-based configuration
 - **Random Error Injection**: Mimic real server behavior

 ![asd](img/deception-page.png)

 ## 🚀 Quick Start
 ## Helm Chart

 Install with default values

 ```bash
 helm install krawl oci://ghcr.io/blessedrebus/krawl-chart \
 --namespace krawl-system \
 --create-namespace
 ```

 Install with custom [canary token](#customizing-the-canary-token)

 ```bash
 helm install krawl oci://ghcr.io/blessedrebus/krawl-chart \
 --namespace krawl-system \
 --create-namespace \
 --set config.canaryTokenUrl="http://your-canary-token-url"
 ```

 To access the deception server

 ```bash
 kubectl get svc krawl -n krawl-system
 ```

 Once the EXTERNAL-IP is assigned, access your deception server at:

 ```
 http://<EXTERNAL-IP>:5000
 ```

 ## Kubernetes / Kustomize
 Apply all manifests with

 ```bash
 kubectl apply -f https://raw.githubusercontent.com/BlessedRebuS/Krawl/refs/heads/main/manifests/krawl-all-in-one-deploy.yaml
 ```

 Retrieve dashboard path with
 ```bash
 kubectl get secret krawl-server -n krawl-system -o jsonpath='{.data.dashboard-path}' | base64 -d
 ```

 Or clone the repo and apply the `manifest` folder with

 ```bash
 kubectl apply -k manifests
 ```

 ## Docker
 Run Krawl as a docker container with

 ```bash
 docker run -d \
 -p 5000:5000 \
 -e CANARY_TOKEN_URL="http://your-canary-token-url" \
 --name krawl \
 ghcr.io/blessedrebus/krawl:latest
 ```

 ## Docker Compose
 Run Krawl with docker-compose in the project folder with

 ```bash
 docker-compose up -d
 ```

 Stop it with

 ```bash
 docker-compose down
 ```

 ## Python 3.11+

 Clone the repository

 ```bash
 git clone https://github.com/blessedrebus/krawl.git
 cd krawl/src
 ```
 Run the server
 ```bash
 python3 server.py
 ```

 Visit

 `http://localhost:5000`

 To access the dashboard

 `http://localhost:5000/<dashboard-secret-path>`

 ## Configuration via Environment Variables

-To customize the deception server installation several **environment variables** can be specified.
+To customize the deception server installation, environment variables can be specified using the naming convention: `KRAWL_<FIELD_NAME>` where `<FIELD_NAME>` is the configuration field name in uppercase with special characters converted:
+- `.` → `_`
+- `-` → `__` (double underscore)
+- ` ` (space) → `_`

+### Configuration Variables
+
-| Variable | Description | Default |
-|----------|-------------|---------|
-| `PORT` | Server listening port | `5000` |
-| `DELAY` | Response delay in milliseconds | `100` |
-| `LINKS_MIN_LENGTH` | Minimum random link length | `5` |
-| `LINKS_MAX_LENGTH` | Maximum random link length | `15` |
-| `LINKS_MIN_PER_PAGE` | Minimum links per page | `10` |
-| `LINKS_MAX_PER_PAGE` | Maximum links per page | `15` |
-| `MAX_COUNTER` | Initial counter value | `10` |
-| `CANARY_TOKEN_TRIES` | Requests before showing canary token | `10` |
-| `CANARY_TOKEN_URL` | External canary token URL | None |
-| `DASHBOARD_SECRET_PATH` | Custom dashboard path | Auto-generated |
-| `PROBABILITY_ERROR_CODES` | Error response probability (0-100%) | `0` |
-| `SERVER_HEADER` | HTTP Server header for deception | `Apache/2.2.22 (Ubuntu)` |
-| `TIMEZONE` | IANA timezone for logs and dashboard (e.g., `America/New_York`, `Europe/Rome`) | System timezone |
+| Configuration Field | Environment Variable | Description | Default |
+|-----------|-----------|-------------|---------|
+| `port` | `KRAWL_PORT` | Server listening port | `5000` |
+| `delay` | `KRAWL_DELAY` | Response delay in milliseconds | `100` |
+| `server_header` | `KRAWL_SERVER_HEADER` | HTTP Server header for deception | `""` |
+| `links_length_range` | `KRAWL_LINKS_LENGTH_RANGE` | Link length range as `min,max` | `5,15` |
+| `links_per_page_range` | `KRAWL_LINKS_PER_PAGE_RANGE` | Links per page as `min,max` | `10,15` |
+| `char_space` | `KRAWL_CHAR_SPACE` | Characters used for link generation | `abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789` |
+| `max_counter` | `KRAWL_MAX_COUNTER` | Initial counter value | `10` |
+| `canary_token_url` | `KRAWL_CANARY_TOKEN_URL` | External canary token URL | None |
+| `canary_token_tries` | `KRAWL_CANARY_TOKEN_TRIES` | Requests before showing canary token | `10` |
+| `dashboard_secret_path` | `KRAWL_DASHBOARD_SECRET_PATH` | Custom dashboard path | Auto-generated |
+| `api_server_url` | `KRAWL_API_SERVER_URL` | API server URL | None |
+| `api_server_port` | `KRAWL_API_SERVER_PORT` | API server port | `8080` |
+| `api_server_path` | `KRAWL_API_SERVER_PATH` | API server endpoint path | `/api/v2/users` |
+| `probability_error_codes` | `KRAWL_PROBABILITY_ERROR_CODES` | Error response probability (0-100%) | `0` |
+| `database_path` | `KRAWL_DATABASE_PATH` | Database file location | `data/krawl.db` |
+| `database_retention_days` | `KRAWL_DATABASE_RETENTION_DAYS` | Days to retain data in database | `30` |
+| `http_risky_methods_threshold` | `KRAWL_HTTP_RISKY_METHODS_THRESHOLD` | Threshold for risky HTTP methods detection | `0.1` |
+| `violated_robots_threshold` | `KRAWL_VIOLATED_ROBOTS_THRESHOLD` | Threshold for robots.txt violations | `0.1` |
+| `uneven_request_timing_threshold` | `KRAWL_UNEVEN_REQUEST_TIMING_THRESHOLD` | Coefficient of variation threshold for timing | `0.5` |
+| `uneven_request_timing_time_window_seconds` | `KRAWL_UNEVEN_REQUEST_TIMING_TIME_WINDOW_SECONDS` | Time window for request timing analysis in seconds | `300` |
+| `user_agents_used_threshold` | `KRAWL_USER_AGENTS_USED_THRESHOLD` | Threshold for detecting multiple user agents | `2` |
+| `attack_urls_threshold` | `KRAWL_ATTACK_URLS_THRESHOLD` | Threshold for attack URL detection | `1` |

+### Examples
+
+```bash
+# Set port and delay
+export KRAWL_PORT=8080
+export KRAWL_DELAY=200
+
+# Set canary token
+export KRAWL_CANARY_TOKEN_URL="http://your-canary-token-url"
+
+# Set tuple values (min,max format)
+export KRAWL_LINKS_LENGTH_RANGE="3,20"
+export KRAWL_LINKS_PER_PAGE_RANGE="5,25"
+
+# Set analyzer thresholds
+export KRAWL_HTTP_RISKY_METHODS_THRESHOLD="0.2"
+export KRAWL_VIOLATED_ROBOTS_THRESHOLD="0.15"
+
+# Set custom dashboard path
+export KRAWL_DASHBOARD_SECRET_PATH="/my-secret-dashboard"
+```
+
+Or in Docker:
+
+```bash
+docker run -d \
+-p 5000:5000 \
+-e KRAWL_PORT=5000 \
+-e KRAWL_DELAY=100 \
+-e KRAWL_CANARY_TOKEN_URL="http://your-canary-token-url" \
+--name krawl \
+ghcr.io/blessedrebus/krawl:latest
+```
+
 ## robots.txt
 The actual (juicy) robots.txt configuration is the following

 ```txt
 Disallow: /admin/
 Disallow: /api/
 Disallow: /backup/
 Disallow: /config/
 Disallow: /database/
 Disallow: /private/
 Disallow: /uploads/
 Disallow: /wp-admin/
 Disallow: /phpMyAdmin/
 Disallow: /admin/login.php
 Disallow: /api/v1/users
 Disallow: /api/v2/secrets
 Disallow: /.env
 Disallow: /credentials.txt
 Disallow: /passwords.txt
 Disallow: /.git/
 Disallow: /backup.sql
 Disallow: /db_backup.sql
 ```

 ## Honeypot pages
 Requests to common admin endpoints (`/admin/`, `/wp-admin/`, `/phpMyAdmin/`) return a fake login page. Any login attempt triggers a 1-second delay to simulate real processing and is fully logged in the dashboard (credentials, IP, headers, timing).

 <div align="center">
 <img src="img/admin-page.png" width="60%" />
 </div>

 Requests to paths like `/backup/`, `/config/`, `/database/`, `/private/`, or `/uploads/` return a fake directory listing populated with “interesting” files, each assigned a random file size to look realistic.

 ![directory-page](img/directory-page.png)

 The `.env` endpoint exposes fake database connection strings, **AWS API keys**, and **Stripe secrets**. It intentionally returns an error due to the `Content-Type` being `application/json` instead of plain text, mimicking a “juicy” misconfiguration that crawlers and scanners often flag as information leakage.

 ![env-page](img/env-page.png)

 The pages `/api/v1/users` and `/api/v2/secrets` show fake users and random secrets in JSON format

 <div align="center">
 <img src="img/api-users-page.png" width="45%" style="vertical-align: middle; margin: 0 10px;" />
 <img src="img/api-secrets-page.png" width="45%" style="vertical-align: middle; margin: 0 10px;" />
 </div>

 The pages `/credentials.txt` and `/passwords.txt` show fake users and random secrets

 <div align="center">
 <img src="img/credentials-page.png" width="35%" style="vertical-align: middle; margin: 0 10px;" />
 <img src="img/passwords-page.png" width="45%" style="vertical-align: middle; margin: 0 10px;" />
 </div>

 ## Customizing the Canary Token
 To create a custom canary token, visit https://canarytokens.org

 and generate a “Web bug” canary token.

 This optional token is triggered when a crawler fully traverses the webpage until it reaches 0. At that point, a URL is returned. When this URL is requested, it sends an alert to the user via email, including the visitor’s IP address and user agent.


 To enable this feature, set the canary token URL [using the environment variable](#configuration-via-environment-variables) `CANARY_TOKEN_URL`.

 ## Customizing the wordlist

 Edit `wordlists.json` to customize fake data for your use case

 ```json
 {
 "usernames": {
 "prefixes": ["admin", "root", "user"],
 "suffixes": ["_prod", "_dev", "123"]
 },
 "passwords": {
 "prefixes": ["P@ssw0rd", "Admin"],
 "simple": ["test", "password"]
 },
 "directory_listing": {
 "files": ["credentials.txt", "backup.sql"],
 "directories": ["admin/", "backup/"]
 }
 }
 ```

 or **values.yaml** in the case of helm chart installation

 ## Dashboard

 Access the dashboard at `http://<server-ip>:<port>/<dashboard-path>`

 The dashboard shows:
 - Total and unique accesses
 - Suspicious activity detection
 - Top IPs, paths, and user-agents
 - Real-time monitoring

 The attackers' triggered honeypot path and the suspicious activity (such as failed login attempts) are logged

 ![dashboard-1](img/dashboard-1.png)

 The top IP Addresses is shown along with top paths and User Agents

 ![dashboard-2](img/dashboard-2.png)

 ### Retrieving Dashboard Path

 Check server startup logs or get the secret with

 ```bash
 kubectl get secret krawl-server -n krawl-system \
 -o jsonpath='{.data.dashboard-path}' | base64 -d && echo
 ```

 ## 🤝 Contributing

 Contributions welcome! Please:
 1. Fork the repository
 2. Create a feature branch
 3. Make your changes
 4. Submit a pull request (explain the changes!)


 <div align="center">

 ## ⚠️ Disclaimer

 **This is a deception/honeypot system.**
 Deploy in isolated environments and monitor carefully for security events.
 Use responsibly and in compliance with applicable laws and regulations.

 ## Star History
 <img src="https://api.star-history.com/svg?repos=BlessedRebuS/Krawl&type=Date" width="600" alt="Star History Chart" />
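The naming convention in the README's "Configuration via Environment Variables" section can be sketched in a few lines of Python. This is a minimal illustration, assuming the conversion is a plain chain of string replacements applied in the order the README lists them. The helper name `env_name_for` is hypothetical and used only for this example.

```python
def env_name_for(field_name: str) -> str:
    """Map a configuration field name to its KRAWL_* environment variable name.

    Hypothetical helper: uppercases the name, then applies the README's rules
    ('.' -> '_', '-' -> '__', ' ' -> '_') and adds the 'KRAWL_' prefix.
    """
    env = (
        field_name.upper()
        .replace(".", "_")
        .replace("-", "__")
        .replace(" ", "_")
    )
    return f"KRAWL_{env}"


if __name__ == "__main__":
    # Field names taken from the README table.
    for name in ("port", "links_length_range", "canary_token_url"):
        print(f"{name} -> {env_name_for(name)}")
```

Because `-` maps to a double underscore while `.` and space map to a single one, names that differ only in those characters can still collide (for example `a.b` and `a b`), which is harmless for the field names listed in the table.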
@@ -2,7 +2,7 @@ apiVersion: v2
 name: krawl-chart
 description: A Helm chart for Krawl honeypot server
 type: application
-version: 0.1.3
+version: 0.1.4
 appVersion: 0.1.6
 keywords:
 - honeypot
@@ -111,13 +111,40 @@ class Config:
             attack_urls_threshold=analyzer.get('attack_urls_threshold', 1)
         )

+def __get_env_from_config(config: str) -> str:
+
+    env = config.upper().replace('.', '_').replace('-', '__').replace(' ', '_')
+
+    return f'KRAWL_{env}'
+
+def override_config_from_env(config: Config = None):
+    """Initialize configuration from environment variables"""
+
+    for field in config.__dataclass_fields__:
+
+        env_var = __get_env_from_config(field)
+        if env_var in os.environ:
+            field_type = config.__dataclass_fields__[field].type
+            env_value = os.environ[env_var]
+            if field_type == int:
+                setattr(config, field, int(env_value))
+            elif field_type == float:
+                setattr(config, field, float(env_value))
+            elif field_type == Tuple[int, int]:
+                parts = env_value.split(',')
+                if len(parts) == 2:
+                    setattr(config, field, (int(parts[0]), int(parts[1])))
+            else:
+                setattr(config, field, env_value)

 _config_instance = None


 def get_config() -> Config:
     """Get the singleton Config instance"""
     global _config_instance
     if _config_instance is None:
         _config_instance = Config.from_yaml()
-    return _config_instance
+
+        override_config_from_env(_config_instance)
+
+    return _config_instance
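The override logic in this hunk can be exercised in isolation. The sketch below is not the project's code. `DemoConfig`, `parse_int_pair`, and `override_from_env` are hypothetical names. The environment is passed in as a mapping so the example runs without touching `os.environ`. Field types are resolved with `typing.get_type_hints`, an assumption made here so the comparison also works when annotations are stored as strings. The variable name is built as `KRAWL_` plus the uppercased field name, since dataclass field names cannot contain `.`, `-`, or spaces. Unlike the committed code, this variant raises `ValueError` for a malformed `min,max` value instead of leaving the field unchanged.

```python
import os
from dataclasses import dataclass, fields
from typing import Mapping, Optional, Tuple, get_type_hints


@dataclass
class DemoConfig:
    # Hypothetical stand-in for the project's Config dataclass.
    port: int = 5000
    http_risky_methods_threshold: float = 0.1
    links_length_range: Tuple[int, int] = (5, 15)
    server_header: str = ""


def parse_int_pair(raw: str) -> Tuple[int, int]:
    """Parse a 'min,max' string; raise ValueError on malformed input."""
    parts = [part.strip() for part in raw.split(",")]
    if len(parts) != 2:
        raise ValueError(f"expected 'min,max', got {raw!r}")
    return int(parts[0]), int(parts[1])


def override_from_env(config, environ: Optional[Mapping[str, str]] = None):
    """Overwrite dataclass fields with values found in KRAWL_* variables."""
    environ = os.environ if environ is None else environ
    hints = get_type_hints(type(config))  # also resolves string annotations
    for f in fields(config):
        env_var = "KRAWL_" + f.name.upper()
        if env_var not in environ:
            continue  # keep the value loaded from YAML or the default
        raw = environ[env_var]
        field_type = hints[f.name]
        if field_type is int:
            value = int(raw)
        elif field_type is float:
            value = float(raw)
        elif field_type == Tuple[int, int]:
            value = parse_int_pair(raw)
        else:
            value = raw  # strings and any other type are stored as-is
        setattr(config, f.name, value)
    return config


if __name__ == "__main__":
    cfg = override_from_env(
        DemoConfig(),
        {"KRAWL_PORT": "8080", "KRAWL_LINKS_LENGTH_RANGE": "3,20"},
    )
    print(cfg.port, cfg.links_length_range)
```

Passing the environment as a parameter is a design choice for testability only; calling `override_from_env(DemoConfig())` falls back to the real process environment.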