60c9b495ae715b97064c3bae59f5ca70e853b654
AI worker fixes (root cause of "nothing reaches Replicate"): - Worker task died silently — no exception handler around while loop - Added try/except around entire loop body with exc_info logging - Added watchdog task that restarts dead workers every 10 seconds - ensure_workers_alive() called on every /api/ai/assess/batch POST - _assess_one() is now a top-level function (not closure) — avoids subtle scoping bugs with async inner functions in while loops - /api/ai/debug endpoint: shows worker alive status, task exception, last 10 queue entries — browse to /api/ai/debug to diagnose - /api/ai/worker/restart endpoint + UI button - "Restart AI worker" button + "Debug AI queue" link in enrichment tab site_analyzer.py — new signals: - IP resolution + ip-api.com for ASN, org, ISP, host country - EU hosting detection (27 EU + EEA + adequacy countries) - GDPR: detects Cookiebot, OneTrust, CookiePro, Osano, Iubenda, Borlabs, CookieYes, Complianz, Usercentrics + text signals - Privacy policy and GDPR text presence - Accessibility: html lang missing, images without alt count, skip nav link, empty links, inputs without labels Gemini prompt additions: - Hosting section: IP, ASN, org/ISP, EU vs non-EU flag - GDPR section: cookie tool, notice, privacy policy - Accessibility section: all quick-scan results - New output fields: hosting_notes, gdpr_compliance, accessibility_issues[] Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
DomGod — Domain Intelligence Dashboard
Dockerized dashboard for filtering, enriching, scoring, and exporting leads from a 72M-domain dataset.
Quick start
docker compose up --build
On first boot, the container downloads domains.parquet (~GB) and caches it in ./data/. Subsequent restarts skip the download.
Environment variables (docker-compose.yml)
| Variable | Default | Description |
|---|---|---|
DATA_DIR |
/data |
Where parquet + sqlite live |
PARQUET_URL |
GitHub raw URL | Source parquet |
CONCURRENCY_LIMIT |
50 |
Parallel enrichment workers |
SCORE_THRESHOLD |
60 |
"Hot lead" threshold |
TARGET_TLDS |
es,com,net |
TLDs to prioritise |
TARGET_COUNTRIES |
ES,GB,DE,FR,RO,PT,AD,IT |
Countries for scoring bonus |
Scoring
| Signal | Points |
|---|---|
| Domain is live | +20 |
| SSL expiry < 30 days | +15 |
| No valid SSL | +15 |
| Known CMS detected | +15 |
| No MX record | +10 |
| IP in target country | +10 |
| Shared hosting server | +10 |
| Local business keywords in title | +5 |
Max score: 100. Hot ≥ 80, Warm 50–79, Cold < 50.
API
GET /api/stats
GET /api/domains?tld=es&page=1&limit=100&live_only=false
POST /api/enrich/batch { "domains": ["example.com"] }
GET /api/enrich/status
POST /api/enrich/pause
POST /api/enrich/resume
POST /api/enrich/retry
GET /api/enriched?min_score=60&cms=wordpress&country=ES
GET /api/export?tier=hot (streams CSV)
POST /api/score/run
Description
Languages
Python
60.6%
HTML
39.3%
Dockerfile
0.1%