10 Commits

Author SHA1 Message Date
db95876db2 fix: SQLite database locked errors + add error status for 4xx/5xx
SQLite locking:
- Enable WAL journal mode in init_db (readers don't block writers)
- Set busy_timeout=30000ms in init_db
- Add timeout=30 to every aiosqlite.connect() across db.py, validator.py,
  enricher.py, main.py so connections wait up to 30s instead of crashing
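
The two locking settings above can be sketched with the standard-library sqlite3 module. This is a minimal illustration, not the project's actual init_db: the function body and the demo path are assumptions. aiosqlite.connect() forwards keyword arguments such as timeout to sqlite3.connect(), so the same values should carry over to the async code.

```python
import os
import sqlite3
import tempfile

BUSY_TIMEOUT_MS = 30_000  # mirrors busy_timeout=30000ms from the commit message


def init_db(path: str) -> sqlite3.Connection:
    """Open a database with WAL journaling and a 30 s busy timeout (sketch)."""
    # timeout=30 lets the connection wait up to 30 s on a locked database
    # instead of failing immediately with "database is locked".
    conn = sqlite3.connect(path, timeout=30)
    # WAL is stored in the database file, so it only has to be enabled once.
    conn.execute("PRAGMA journal_mode=WAL")
    # busy_timeout is a per-connection setting; each new connection needs it.
    conn.execute(f"PRAGMA busy_timeout={BUSY_TIMEOUT_MS}")
    return conn


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        conn = init_db(os.path.join(tmp, "demo.db"))  # hypothetical path
        print(conn.execute("PRAGMA journal_mode").fetchone()[0])
        print(conn.execute("PRAGMA busy_timeout").fetchone()[0])
        conn.close()
```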

Error status:
- 4xx/5xx HTTP responses are now prescreen_status='error' (server alive
  but broken/blocking) instead of 'live'
- Added 'error' counter to validator stats and orange Error stat box in UI
- Added ps-error CSS class (orange) and filter option in Browse tab
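
The status mapping described above might look roughly like the sketch below. The function name, the None convention for "no response", and the Counter-based stats are assumptions for illustration; parking-signal checks are left out.

```python
from collections import Counter
from typing import Optional


def prescreen_status(status_code: Optional[int]) -> str:
    """Map an HTTP status code (or None for no response) to a status label."""
    if status_code is None:
        return "dead"  # nothing answered on either scheme
    if 400 <= status_code <= 599:
        return "error"  # server is alive but broken or blocking
    return "live"


if __name__ == "__main__":
    # Tally statuses the way a validator stats counter might.
    stats = Counter(prescreen_status(c) for c in (200, 301, 403, 503, None))
    print(dict(stats))
```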

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 07:10:45 +02:00
989717e479 fix: always retry https on any http failure, unify timeouts
- Any http failure (ConnectError, ConnectTimeout, ReadTimeout, TLS error
  inside a redirect chain) now falls through to an https retry instead
  of immediately marking dead.
- Use one unified 8s connect / 12s read timeout for both schemes so that
  http→https redirects followed inside the same client get a full TLS
  handshake window (previously 4s http timeout was too short for the
  https redirect hop).
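
A rough sketch of the retry flow described above, with the network call injected so the example stays self-contained. The names check_domain and fetch are hypothetical; in the real validator the fetch step would be an HTTP client call configured with the 8s connect / 12s read timeout.

```python
from typing import Callable, Optional

CONNECT_TIMEOUT_S = 8.0  # unified connect timeout from the commit message
READ_TIMEOUT_S = 12.0    # unified read timeout from the commit message

# fetch(url, connect_timeout, read_timeout) -> HTTP status code, or raises.
Fetch = Callable[[str, float, float], int]


def check_domain(domain: str, fetch: Fetch) -> Optional[int]:
    """Try http first, then https; return a status code or None if both fail."""
    for scheme in ("http", "https"):
        url = f"{scheme}://{domain}"
        try:
            return fetch(url, CONNECT_TIMEOUT_S, READ_TIMEOUT_S)
        except Exception:
            # Any http failure falls through to the https attempt;
            # an https failure ends the loop and the domain counts as dead.
            continue
    return None


if __name__ == "__main__":
    def https_only(url: str, connect: float, read: float) -> int:
        # Simulates a host whose port 80 silently drops packets.
        if url.startswith("http://"):
            raise TimeoutError("connect timed out")
        return 200

    print(check_domain("example.com", https_only))
```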

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 19:50:56 +02:00
54c781773d fix: retry https on ConnectTimeout, not just ConnectError
Port 80 is often firewalled (drops packets → ConnectTimeout) rather than
refused (ConnectError). Previously ConnectTimeout hit the generic except
branch and broke without trying https, marking everything dead.

Now ConnectError + RemoteProtocolError + ConnectTimeout all trigger an
https retry. ReadTimeout still marks the domain dead (the server accepted
the connection but was too slow to respond).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 19:42:10 +02:00
b53545b7dd fix: bind exception variable in ConnectError handler to prevent NameError
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 18:48:04 +02:00
6657e6ea1f fix: rotate UA + treat any HTTP response as live (not just 200/203)
- Rotate across 7 real browser UAs to avoid bot detection
- Any 2xx/3xx/4xx/5xx response = server is UP = live (only no-response = dead)
- Parking signals still checked on 200/203 body content
- Previous 403/404 responses were incorrectly marking live servers as dead
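
User-agent rotation could be as simple as the sketch below. The commit does not say whether rotation is round-robin or random, and the seven UA strings are not shown, so both the cycling strategy and the placeholder strings here are assumptions.

```python
import itertools
from typing import Dict, Iterator

# Placeholder values only; the real browser UA strings are not in the commit.
USER_AGENTS = [f"ExampleBrowser/{n}.0 (placeholder)" for n in range(1, 8)]


def header_cycle() -> Iterator[Dict[str, str]]:
    """Yield request headers, rotating through the UA list round-robin."""
    for user_agent in itertools.cycle(USER_AGENTS):
        yield {"User-Agent": user_agent}


if __name__ == "__main__":
    headers = header_cycle()
    for _ in range(3):
        print(next(headers))
```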

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 18:36:32 +02:00
8a4ec88d73 fix: always fallback to https on any http failure (fixes HTTPS-only sites marked dead)
Previous fix only retried on ConnectError. Servers that accept TCP on port 80
but hang, return protocol errors, or time out also need the https fallback.
Now any exception on http triggers an https retry. A shorter http timeout (4s)
avoids wasting time on a non-responsive port 80.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 17:43:44 +02:00
f8ab910eca feat: add rescan dead domains checkbox to validator
Adds a rescan_dead flag that causes _filter_unvalidated to treat
previously-dead domains as needing a fresh check. Useful after
fixing the http/https detection bug.
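
One way the flag could work is sketched below. The row shape (dicts with domain and prescreen_status keys) and the function signature are assumptions; the real _filter_unvalidated may query the database directly.

```python
from typing import Any, Iterable, List, Mapping


def filter_unvalidated(
    rows: Iterable[Mapping[str, Any]], rescan_dead: bool = False
) -> List[str]:
    """Return domains that still need a check (sketch of the rescan_dead idea)."""
    pending = []
    for row in rows:
        status = row.get("prescreen_status")
        # Never-validated rows always qualify; dead rows only when requested.
        if status is None or (rescan_dead and status == "dead"):
            pending.append(row["domain"])
    return pending


if __name__ == "__main__":
    rows = [
        {"domain": "a.example", "prescreen_status": None},
        {"domain": "b.example", "prescreen_status": "dead"},
        {"domain": "c.example", "prescreen_status": "live"},
    ]
    print(filter_unvalidated(rows))
    print(filter_unvalidated(rows, rescan_dead=True))
```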

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 20:12:59 +02:00
ae2fad0152 fix: try https fallback when http port 80 is closed (fixes HTTPS-only domains marked as dead)
Many modern servers refuse HTTP connections entirely. The validator was
only trying http://, causing HTTPS-only sites to be wrongly marked dead.
Now falls back to https:// on ConnectError. Also increased timeouts slightly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 20:11:00 +02:00
3f042196d3 fix: always reset validator offset on start (fixes wrong TLD resuming previous offset)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 17:29:07 +02:00
8f387cada2 feat: bulk validator tab + status/niche/type browse filters
- New app/validator.py: background HTTP checker for entire dataset
  - 50 concurrent checks, skips already-validated domains
  - Extracts prescreen_status, server, IP, load_time_ms
  - start/stop/status API at /api/validator/start|stop|status

- New dedicated "Validator 🔬" tab with stats grid, TLD filter,
  Start/Stop controls, live progress indicator

- Browse tab: "Live" column replaced with "Status" dot (color-coded
  ● from prescreen_status, falls back to is_live)
- Browse tab: new Status / Niche / Type filter dropdowns

- db.py: added ip TEXT + load_time_ms INTEGER columns + migrations;
  get_enriched() supports prescreen_status/niche/site_type filters

- main.py: /api/enriched extended with prescreen_status/niche/site_type
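
The "50 concurrent checks" limit above is commonly implemented with a semaphore. The sketch below assumes an asyncio-based checker; validate_all and the injected check coroutine are hypothetical names, not the contents of app/validator.py.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable

MAX_CONCURRENCY = 50  # matches the "50 concurrent checks" in the commit message


async def validate_all(
    domains: Iterable[str], check: Callable[[str], Awaitable[str]]
) -> Dict[str, str]:
    """Run check() for every domain, at most MAX_CONCURRENCY at a time."""
    semaphore = asyncio.Semaphore(MAX_CONCURRENCY)
    results: Dict[str, str] = {}

    async def worker(domain: str) -> None:
        async with semaphore:  # blocks while 50 checks are already in flight
            results[domain] = await check(domain)

    await asyncio.gather(*(worker(d) for d in domains))
    return results


if __name__ == "__main__":
    async def fake_check(domain: str) -> str:
        await asyncio.sleep(0.01)  # stands in for the real HTTP request
        return "live"

    out = asyncio.run(validate_all([f"site{i}.example" for i in range(120)], fake_check))
    print(len(out))
```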

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 08:27:24 +02:00