6 Commits

Author SHA1 Message Date
6657e6ea1f fix: rotate UA + treat any HTTP response as live (not just 200/203)
- Rotate across 7 real browser UAs to avoid bot detection
- Any 2xx/3xx/4xx/5xx response = server is UP = live (only no-response = dead)
- Parking signals still checked on 200/203 body content
- Previous 403/404 responses were incorrectly marking live servers as dead

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 18:36:32 +02:00
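
The rule in 6657e6ea1f can be sketched roughly as follows. This is an illustration under assumptions, not the repository's code: the function name `classify`, the status labels "live"/"parked"/"dead", the parking phrase, and the placeholder user-agent strings are all hypothetical (the commit does not list the 7 UAs or the parking signals it uses).

```python
import itertools
from typing import Iterator, Optional, Sequence

# Placeholder strings only; the real browser UAs are not given in the commit.
PLACEHOLDER_USER_AGENTS = [f"placeholder-browser-ua-{i}" for i in range(1, 8)]


def ua_rotation(agents: Sequence[str]) -> Iterator[str]:
    """Yield user agents round-robin, one per outgoing request."""
    return itertools.cycle(agents)


def classify(
    status_code: Optional[int],
    body: str = "",
    parking_signals: Sequence[str] = ("domain is for sale",),  # assumed signal
) -> str:
    """Map an HTTP outcome to a liveness label.

    status_code is None when no HTTP response was received at all.
    """
    if status_code is None:
        return "dead"  # only "no response" counts as dead
    if status_code in (200, 203):
        # Parking signals are only looked for in 200/203 bodies.
        lowered = body.lower()
        if any(signal in lowered for signal in parking_signals):
            return "parked"
    return "live"  # any other response, including 403/404/5xx, means the server is up
```

With this shape, a 403 or 404 yields "live", which matches the behaviour the commit says was previously wrong.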
8a4ec88d73 fix: always fallback to https on any http failure (fixes HTTPS-only sites marked dead)
Previous fix only retried on ConnectError. Servers that accept TCP on port 80
but hang, return protocol errors, or timeout also need the https fallback.
Now any exception on http triggers https retry. Shorter http timeout (4s)
avoids wasting time on non-responsive port 80.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 17:43:44 +02:00
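
A minimal sketch of the fallback described in 8a4ec88d73, assuming a caller-supplied `fetch(url, timeout)` callable that returns a status code or raises. The function name, the return shape, and the 10-second https timeout are assumptions; only the 4-second http timeout comes from the commit message.

```python
from typing import Callable, Optional, Tuple


def check_with_fallback(
    domain: str,
    fetch: Callable[[str, float], int],
    http_timeout: float = 4.0,    # short, per the commit message
    https_timeout: float = 10.0,  # assumed value, not stated in the commit
) -> Tuple[Optional[str], Optional[int]]:
    """Try http first; on ANY exception, retry over https.

    Returns (scheme, status_code), or (None, None) if both attempts fail.
    """
    try:
        return "http", fetch(f"http://{domain}", http_timeout)
    except Exception:
        # Connection refused, hang, protocol error, timeout: all fall through.
        pass
    try:
        return "https", fetch(f"https://{domain}", https_timeout)
    except Exception:
        return None, None
```

Catching the broad `Exception` on the http attempt is the point of the change: the earlier fix (ae2fad0152) retried only on ConnectError, so a port 80 that accepted TCP and then hung never reached the https attempt.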
f8ab910eca feat: add rescan dead domains checkbox to validator
Adds rescan_dead flag that causes _filter_unvalidated to treat
previously-dead domains as needing a fresh check. Useful after
fixing the http/https detection bug.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 20:12:59 +02:00
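
One way the rescan_dead flag from f8ab910eca might behave, as a sketch only. The real `_filter_unvalidated` signature and row layout are not shown in the log; here rows are assumed to be dicts with a `prescreen_status` key, where `None` means "never checked" and "dead" is the assumed label for failed domains.

```python
from typing import Dict, Iterable, List, Optional

Row = Dict[str, Optional[str]]


def filter_unvalidated(rows: Iterable[Row], rescan_dead: bool = False) -> List[Row]:
    """Return rows that still need an HTTP check.

    Unchecked rows are always included; rows previously marked dead are
    included only when rescan_dead is True.
    """
    pending = []
    for row in rows:
        status = row.get("prescreen_status")
        if status is None or (rescan_dead and status == "dead"):
            pending.append(row)
    return pending
```

Leaving the flag off by default keeps normal runs incremental, while switching it on once re-examines domains that the earlier http-only check may have misclassified.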
ae2fad0152 fix: try https fallback when http port 80 is closed (fixes HTTPS-only domains marked as dead)
Many modern servers refuse HTTP connections entirely. The validator was
only trying http://, causing HTTPS-only sites to be wrongly marked dead.
Now falls back to https:// on ConnectError. Also increased timeouts slightly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 20:11:00 +02:00
3f042196d3 fix: always reset validator offset on start (fixes wrong TLD resuming previous offset)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 17:29:07 +02:00
8f387cada2 feat: bulk validator tab + status/niche/type browse filters
- New app/validator.py: background HTTP checker for entire dataset
  - 50 concurrent checks, skips already-validated domains
  - Extracts prescreen_status, server, IP, load_time_ms
  - start/stop/status API at /api/validator/start|stop|status

- New dedicated "Validator 🔬" tab with stats grid, TLD filter,
  Start/Stop controls, live progress indicator

- Browse tab: "Live" column replaced with "Status" dot (color-coded
  ● from prescreen_status, falls back to is_live)
- Browse tab: new Status / Niche / Type filter dropdowns

- db.py: added ip TEXT + load_time_ms INTEGER columns + migrations;
  get_enriched() supports prescreen_status/niche/site_type filters

- main.py: /api/enriched extended with prescreen_status/niche/site_type

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 08:27:24 +02:00
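
The "50 concurrent checks, skips already-validated domains" behaviour from 8f387cada2 could be approximated with a semaphore-bounded asyncio loop. This is a sketch under assumptions: `validate_all`, its parameters, and the injected `check` coroutine are hypothetical names, and the actual app/validator.py may be structured differently.

```python
import asyncio
from typing import AbstractSet, Awaitable, Callable, Dict, Iterable


async def validate_all(
    domains: Iterable[str],
    check: Callable[[str], Awaitable[str]],
    already_validated: AbstractSet[str] = frozenset(),
    limit: int = 50,  # concurrency cap mentioned in the commit
) -> Dict[str, str]:
    """Run check() over every domain not yet validated, at most `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)
    results: Dict[str, str] = {}

    async def worker(domain: str) -> None:
        async with semaphore:  # blocks while `limit` checks are in flight
            results[domain] = await check(domain)

    pending = [d for d in domains if d not in already_validated]
    await asyncio.gather(*(worker(d) for d in pending))
    return results
```

A caller would run it with something like `asyncio.run(validate_all(domains, check))`, where `check` performs the actual HTTP request and returns a status label.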