feat: initial Dockerized domain intelligence dashboard

- FastAPI backend with DuckDB pushdown queries on the 72M-domain parquet file
- Async enrichment worker: HTTP, SSL, DNS MX, CMS fingerprint, ip-api.com
- Resumable parquet download with HTTP Range support
- Lead scoring engine (max 100 pts, target countries ES,GB,DE,FR,RO,PT,AD,IT)
- Single-file Alpine.js + Chart.js dashboard on port 6677
- SQLite enrichment DB with job queue and scores tables
- Dockerized with persistent /data volume

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 16:22:30 +02:00
commit b2e7a2f2db
11 changed files with 1467 additions and 0 deletions

README.md (new file, 54 lines)

@@ -0,0 +1,54 @@
# DomGod — Domain Intelligence Dashboard
Dockerized dashboard for filtering, enriching, scoring, and exporting leads from a 72M-domain dataset.
## Quick start
```bash
docker compose up --build
```
Open **http://localhost:6677**

On first boot, the container downloads `domains.parquet` (~GB) and caches it in `./data/`. Subsequent restarts skip the download.
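
The commit message describes this download as resumable via HTTP Range. The sketch below illustrates that idea only and is not the project's actual downloader: the `.part` suffix and the helper names `resume_headers` and `download` are assumptions made for the example.

```python
import os
import urllib.request


def resume_headers(part_path: str) -> dict:
    """Build a Range header asking only for the bytes not yet on disk."""
    have = os.path.getsize(part_path) if os.path.exists(part_path) else 0
    return {"Range": f"bytes={have}-"} if have else {}


def download(url: str, dest: str, chunk_size: int = 1 << 20) -> None:
    """Download url to dest, resuming from dest + '.part' when it exists."""
    if os.path.exists(dest):
        return  # already cached, so skip the download
    part = dest + ".part"  # hypothetical temp-file convention
    req = urllib.request.Request(url, headers=resume_headers(part))
    with urllib.request.urlopen(req) as resp:
        # 206 means the server honoured the Range header;
        # 200 means it sent the whole file, so start over.
        mode = "ab" if resp.status == 206 else "wb"
        with open(part, mode) as out:
            while chunk := resp.read(chunk_size):
                out.write(chunk)
    os.replace(part, dest)  # rename only once the file is complete
```

Writing to a temporary file and renaming at the end means an interrupted run never leaves a truncated `domains.parquet` behind. A fuller version would also handle a `416 Range Not Satisfiable` reply, which a server may send when the partial file is already complete.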
## Environment variables (docker-compose.yml)
| Variable | Default | Description |
|---|---|---|
| `DATA_DIR` | `/data` | Where parquet + sqlite live |
| `PARQUET_URL` | GitHub raw URL | Source parquet |
| `CONCURRENCY_LIMIT` | `50` | Parallel enrichment workers |
| `SCORE_THRESHOLD` | `60` | "Hot lead" threshold |
| `TARGET_TLDS` | `es,com,net` | TLDs to prioritise |
| `TARGET_COUNTRIES` | `ES,GB,DE,FR,RO,PT,AD,IT` | Countries for scoring bonus |
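
As an illustration of how a backend might read these variables, here is a minimal sketch using the defaults from the table. The `Settings` class and `load_settings` function are hypothetical names, not taken from the repository. `PARQUET_URL` has no default here because the table does not spell out the URL, and the lower/upper-casing of the lists is an assumption.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional


def _csv(value: str, upper: bool = False) -> List[str]:
    """Split a comma-separated value, dropping blanks and normalising case."""
    items = [item.strip() for item in value.split(",") if item.strip()]
    return [i.upper() if upper else i.lower() for i in items]


@dataclass(frozen=True)
class Settings:
    data_dir: str
    parquet_url: Optional[str]
    concurrency_limit: int
    score_threshold: int
    target_tlds: List[str]
    target_countries: List[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from env, falling back to the README defaults."""
    return Settings(
        data_dir=env.get("DATA_DIR", "/data"),
        parquet_url=env.get("PARQUET_URL"),  # default URL not given in the README
        concurrency_limit=int(env.get("CONCURRENCY_LIMIT", "50")),
        score_threshold=int(env.get("SCORE_THRESHOLD", "60")),
        target_tlds=_csv(env.get("TARGET_TLDS", "es,com,net")),
        target_countries=_csv(
            env.get("TARGET_COUNTRIES", "ES,GB,DE,FR,RO,PT,AD,IT"), upper=True
        ),
    )
```

Passing the environment in as a mapping keeps the function easy to test, for example `load_settings({"CONCURRENCY_LIMIT": "8"})`.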
## Scoring
| Signal | Points |
|---|---|
| Domain is live | +20 |
| SSL expiry < 30 days | +15 |
| No valid SSL | +15 |
| Known CMS detected | +15 |
| No MX record | +10 |
| IP in target country | +10 |
| Shared hosting server | +10 |
| Local business keywords in title | +5 |
Max score: 100. Hot ≥ 80, Warm 50-79, Cold < 50.
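
The table can be read as a simple additive score. The sketch below shows one way to compute it, reading the Warm band as 50 to 79. The signal keys (`is_live`, `no_mx`, and so on) and the function names are hypothetical labels for the rows above, not identifiers from the codebase.

```python
# Points per signal, copied from the scoring table.
# The dictionary keys are illustrative names, not the project's field names.
POINTS = {
    "is_live": 20,
    "ssl_expiring_soon": 15,   # SSL expiry < 30 days
    "no_valid_ssl": 15,
    "cms_detected": 15,
    "no_mx": 10,
    "ip_in_target_country": 10,
    "shared_hosting": 10,
    "local_business_title": 5,
}

MAX_SCORE = 100


def score(signals: dict) -> int:
    """Sum the points of every signal that is present and truthy."""
    total = sum(pts for name, pts in POINTS.items() if signals.get(name))
    return min(total, MAX_SCORE)


def tier(points: int) -> str:
    """Map a score to a tier: hot >= 80, warm 50 to 79, cold < 50."""
    if points >= 80:
        return "hot"
    if points >= 50:
        return "warm"
    return "cold"
```

For example, a live domain with no MX record and nothing else scores 20 + 10 = 30, which falls in the cold tier.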
## API
```
GET /api/stats
GET /api/domains?tld=es&page=1&limit=100&live_only=false
POST /api/enrich/batch { "domains": ["example.com"] }
GET /api/enrich/status
POST /api/enrich/pause
POST /api/enrich/resume
POST /api/enrich/retry
GET /api/enriched?min_score=60&cms=wordpress&country=ES
GET /api/export?tier=hot (streams CSV)
POST /api/score/run
```
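
A small client sketch for calling these endpoints from Python, using only the standard library. It assumes the service is reachable at `http://localhost:6677` and that the endpoints other than `/api/export` return JSON; the README does not document response shapes. The helper names `build_url`, `get_json`, and `enrich_batch` are made up for the example.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:6677"  # port taken from the Quick start section


def build_url(path: str, **params) -> str:
    """Join the base URL, an API path, and optional query parameters."""
    query = urllib.parse.urlencode(params)
    return f"{BASE_URL}{path}" + (f"?{query}" if query else "")


def get_json(path: str, **params):
    """GET an endpoint and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_url(path, **params)) as resp:
        return json.load(resp)


def enrich_batch(domains: list):
    """POST a list of domains to /api/enrich/batch."""
    body = json.dumps({"domains": domains}).encode("utf-8")
    req = urllib.request.Request(
        build_url("/api/enrich/batch"),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With the container running, `get_json("/api/domains", tld="es", page=1, limit=100)` would request the first page of `.es` domains, and `enrich_batch(["example.com"])` would queue one domain for enrichment.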