diff --git a/README.md b/README.md
index 720b906..1f67b21 100644
--- a/README.md
+++ b/README.md
@@ -33,19 +33,6 @@
-
-
-
-
- What is Krawl? •
- Installation •
- Honeypot Pages •
- Dashboard •
- Todo •
- Contributing
-
-
-
## Table of Contents
@@ -62,15 +49,7 @@
- [Ban Malicious IPs](#use-krawl-to-ban-malicious-ips)
- [IP Reputation](#ip-reputation)
- [Forward Server Header](#forward-server-header)
-- [API](#api)
-- [Honeypot](#honeypot)
- - [robots.txt](#robotstxt)
- - [Honeypot Pages](#honeypot-pages)
-- [Reverse Proxy Usage](#example-usage-behind-reverse-proxy)
-- [Database Backups](#enable-database-dump-job-for-backups)
-- [Canary Token](#customizing-the-canary-token)
-- [Customizing the Wordlist](#customizing-the-wordlist)
-- [Dashboard](#dashboard)
+- [Additional Documentation](#additional-documentation)
- [Contributing](#-contributing)
## Demo
@@ -92,7 +71,7 @@ It features:
- **Fake Login Pages**: WordPress, phpMyAdmin, admin panels
- **Honeypot Paths**: Advertised in robots.txt to catch scanners
- **Fake Credentials**: Realistic-looking usernames, passwords, API keys
-- **[Canary Token](#customizing-the-canary-token) Integration**: External alert triggering
+- **[Canary Token](docs/canary-token.md) Integration**: External alert triggering
- **Random server headers**: Confuse attacks based on server header and version
- **Real-time Dashboard**: Monitor suspicious activity
- **Customizable Wordlists**: Easy JSON-based configuration
@@ -289,159 +268,17 @@ location / {
}
```
-## API
-Krawl uses the following APIs
-- http://ip-api.com (IP Data)
-- https://iprep.lcrawl.com (IP Reputation)
-- https://nominatim.openstreetmap.org/reverse (Reverse IP Lookup)
-- https://api.ipify.org (Public IP discovery)
-- http://ident.me (Public IP discovery)
-- https://ifconfig.me (Public IP discovery)
+## Additional Documentation
-# Honeypot
-Below is a complete overview of the Krawl honeypot’s capabilities
-
-## robots.txt
-The actual (juicy) robots.txt configuration [is the following](src/templates/html/robots.txt).
-
-## Honeypot pages
-
-### Common Login Attempts
-Requests to common admin endpoints (`/admin/`, `/wp-admin/`, `/phpMyAdmin/`) return a fake login page. Any login attempt triggers a 1-second delay to simulate real processing and is fully logged in the dashboard (credentials, IP, headers, timing).
-
-
-
-### Common Misconfiguration Paths
-Requests to paths like `/backup/`, `/config/`, `/database/`, `/private/`, or `/uploads/` return a fake directory listing populated with “interesting” files, each assigned a random file size to look realistic.
-
-
-
-### Environment File Leakage
-The `.env` endpoint exposes fake database connection strings, **AWS API keys**, and **Stripe secrets**. It intentionally returns an error due to the `Content-Type` being `application/json` instead of plain text, mimicking a "juicy" misconfiguration that crawlers and scanners often flag as information leakage.
-
-### Server Error Information
-The `/server` page displays randomly generated fake error information for each known server.
-
-
-
-### API Endpoints with Sensitive Data
-The pages `/api/v1/users` and `/api/v2/secrets` show fake users and random secrets in JSON format
-
-
-
-### Exposed Credential Files
-The pages `/credentials.txt` and `/passwords.txt` show fake users and random secrets
-
-
-
-### SQL Injection and XSS Detection
-Pages such as `/users`, `/search`, `/contact`, `/info`, `/input`, and `/feedback`, along with APIs like `/api/sql` and `/api/database`, are designed to lure attackers into performing attacks such as **SQL injection** or **XSS**.
-
-
-
-Automated tools like **SQLMap** will receive a different randomized database error on each request, increasing scan noise and confusing the attacker. All detected attacks are logged and displayed in the dashboard.
-
-### Path Traversal Detection
-Krawl detects and responds to **path traversal** attempts targeting common system files like `/etc/passwd`, `/etc/shadow`, or Windows system paths. When an attacker tries to access sensitive files using patterns like `../../../etc/passwd` or encoded variants (`%2e%2e/`, `%252e`), Krawl returns convincing fake file contents with realistic system users, UIDs, GIDs, and shell configurations. This wastes attacker time while logging the full attack pattern.
-
-### XXE (XML External Entity) Injection
-The `/api/xml` and `/api/parser` endpoints accept XML input and are designed to detect **XXE injection** attempts. When attackers try to exploit external entity declarations (`:/`
-
-The dashboard shows:
-- Total and unique accesses
-- Suspicious activity and attack detection
-- Top IPs, paths, user-agents and GeoIP localization
-- Real-time monitoring
-
-The attackers’ access to the honeypot endpoint and related suspicious activities (such as failed login attempts) are logged.
-
-Krawl also implements a scoring system designed to distinguish between malicious and legitimate behavior on the website.
-
-
-
-The top IP Addresses is shown along with top paths and User Agents
-
-
-
-
+| Topic | Description |
+|-------|-------------|
+| [API](docs/api.md) | External APIs used by Krawl for IP data, reputation, and geolocation |
+| [Honeypot](docs/honeypot.md) | Full overview of honeypot pages: fake logins, directory listings, credential files, SQLi/XSS/XXE/command injection traps, and more |
+| [Reverse Proxy](docs/reverse-proxy.md) | How to deploy Krawl behind NGINX or use decoy subdomains |
+| [Database Backups](docs/backups.md) | Enable and configure the automatic database dump job |
+| [Canary Token](docs/canary-token.md) | Set up external alert triggers via canarytokens.org |
+| [Wordlist](docs/wordlist.md) | Customize fake usernames, passwords, and directory listings |
+| [Dashboard](docs/dashboard.md) | Access and explore the real-time monitoring dashboard |
## 🤝 Contributing
diff --git a/docs/api.md b/docs/api.md
new file mode 100644
index 0000000..8d4ab18
--- /dev/null
+++ b/docs/api.md
@@ -0,0 +1,9 @@
+# API
+
+Krawl uses the following external APIs:
+- http://ip-api.com (IP Data)
+- https://iprep.lcrawl.com (IP Reputation)
+- https://nominatim.openstreetmap.org/reverse (Reverse IP Lookup)
+- https://api.ipify.org (Public IP discovery)
+- http://ident.me (Public IP discovery)
+- https://ifconfig.me (Public IP discovery)
diff --git a/docs/backups.md b/docs/backups.md
new file mode 100644
index 0000000..84bf5db
--- /dev/null
+++ b/docs/backups.md
@@ -0,0 +1,10 @@
+# Enable Database Dump Job for Backups
+
+To enable the database dump job, set the following variables (*config file example*):
+
+```yaml
+backups:
+ path: "backups" # directory where backups are saved
+ cron: "*/30 * * * *" # cron schedule for the dump job (here: every 30 minutes)
+ enabled: true
+```
diff --git a/docs/canary-token.md b/docs/canary-token.md
new file mode 100644
index 0000000..6e6c314
--- /dev/null
+++ b/docs/canary-token.md
@@ -0,0 +1,10 @@
+# Customizing the Canary Token
+
+To create a custom canary token, visit https://canarytokens.org and generate a "Web bug" canary token.
+
+This optional token is triggered when a crawler fully traverses the webpage until it reaches 0. At that point, a URL is returned.
+
+Requesting this URL sends an email alert to the user that includes the visitor's IP address and user agent.
+
+
+To enable this feature, set the canary token URL [using the environment variable](../README.md#configuration-via-enviromental-variables) `KRAWL_CANARY_TOKEN_URL`.
diff --git a/docs/dashboard.md b/docs/dashboard.md
new file mode 100644
index 0000000..ace7955
--- /dev/null
+++ b/docs/dashboard.md
@@ -0,0 +1,21 @@
+# Dashboard
+
+Access the dashboard at `http://:/`
+
+The dashboard shows:
+- Total and unique accesses
+- Suspicious activity and attack detection
+- Top IPs, paths, user-agents and GeoIP localization
+- Real-time monitoring
+
+Attacker access to honeypot endpoints and related suspicious activity (such as failed login attempts) is logged.
+
+Krawl also implements a scoring system designed to distinguish between malicious and legitimate behavior on the website.
+
+
+
+The top IP addresses are shown along with the top paths and user agents.
+
+
+
+
diff --git a/docs/honeypot.md b/docs/honeypot.md
new file mode 100644
index 0000000..6baffab
--- /dev/null
+++ b/docs/honeypot.md
@@ -0,0 +1,52 @@
+# Honeypot
+
+Below is a complete overview of the Krawl honeypot's capabilities.
+
+## robots.txt
+The actual (juicy) robots.txt configuration [is the following](../src/templates/html/robots.txt).
+
+## Honeypot pages
+
+### Common Login Attempts
+Requests to common admin endpoints (`/admin/`, `/wp-admin/`, `/phpMyAdmin/`) return a fake login page. Any login attempt triggers a 1-second delay to simulate real processing and is fully logged in the dashboard (credentials, IP, headers, timing).
+
+
+
+### Common Misconfiguration Paths
+Requests to paths like `/backup/`, `/config/`, `/database/`, `/private/`, or `/uploads/` return a fake directory listing populated with "interesting" files, each assigned a random file size to look realistic.
+
+
+
+### Environment File Leakage
+The `.env` endpoint exposes fake database connection strings, **AWS API keys**, and **Stripe secrets**. It intentionally returns an error due to the `Content-Type` being `application/json` instead of plain text, mimicking a "juicy" misconfiguration that crawlers and scanners often flag as information leakage.
+
+### Server Error Information
+The `/server` page displays randomly generated fake error information for each known server.
+
+
+
+### API Endpoints with Sensitive Data
+The pages `/api/v1/users` and `/api/v2/secrets` return fake users and random secrets in JSON format.
+
+
+
+### Exposed Credential Files
+The pages `/credentials.txt` and `/passwords.txt` show fake users and random secrets.
+
+
+
+### SQL Injection and XSS Detection
+Pages such as `/users`, `/search`, `/contact`, `/info`, `/input`, and `/feedback`, along with APIs like `/api/sql` and `/api/database`, are designed to lure attackers into performing attacks such as **SQL injection** or **XSS**.
+
+
+
+Automated tools like **SQLMap** will receive a different randomized database error on each request, increasing scan noise and confusing the attacker. All detected attacks are logged and displayed in the dashboard.
+
+### Path Traversal Detection
+Krawl detects and responds to **path traversal** attempts targeting common system files like `/etc/passwd`, `/etc/shadow`, or Windows system paths. When an attacker tries to access sensitive files using patterns like `../../../etc/passwd` or encoded variants (`%2e%2e/`, `%252e`), Krawl returns convincing fake file contents with realistic system users, UIDs, GIDs, and shell configurations. This wastes attacker time while logging the full attack pattern.
+
+### XXE (XML External Entity) Injection
+The `/api/xml` and `/api/parser` endpoints accept XML input and are designed to detect **XXE injection** attempts. When attackers try to exploit external entity declarations (`