8 Commits

Author SHA1 Message Date
d9ece58e12 fix: search race condition + brand detection + contacts + reassess
- loadDomains():
  - add generation counter so stale auto-advance fetches cannot
    overwrite a newer user-triggered search result
  - snapshot filter state before the first await so URL reflects what
    was requested
  - add HTTP status check so backend errors surface as toasts rather
    than silent empty results
  - auto-advance now calls loadDomains() without await so the counter
    increments correctly per page advance

- beauty_ai:
  - word-boundary regex for short brands (≤5 chars) to stop 'ref'
    matching 'reference'/'refresh'/'prefer' etc.
  - merge phones, whatsapp and social_links from site_analyzer directly
    into result (more reliable than AI extraction)
  - add contact_whatsapp and contact_social fields to AI JSON schema

- db: add requeue_beauty() for re-assessing already-assessed domains

- beauty_main: /api/beauty/reassess/batch endpoint using requeue_beauty

- index.html: Re-assess Selected bulk button, per-row ↺ button in
  Browse and Pipeline, WhatsApp + social links in Pipeline contact panel
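
The generation-counter guard described for loadDomains() can be sketched as follows. The real code lives in the index.html JavaScript; this is only a Python asyncio analogue, and the DomainLoader class, the load() method and the delay values are hypothetical names chosen for illustration.

```python
import asyncio


class DomainLoader:
    """Hypothetical stand-in for the page state around loadDomains()."""

    def __init__(self):
        self.generation = 0   # bumped once per load request
        self.rows = None      # what the table currently shows

    async def load(self, query, delay):
        # Claim a generation number before the first await, so any later
        # call always holds a strictly larger number than this one.
        self.generation += 1
        my_generation = self.generation
        requested = query     # snapshot of the filter state at call time

        await asyncio.sleep(delay)          # stands in for the HTTP fetch
        result = [f"{requested}-row"]

        # A newer request started while this one was waiting: drop the result.
        if my_generation != self.generation:
            return False
        self.rows = result
        return True


async def demo():
    loader = DomainLoader()
    # The slow "auto-advance" fetch starts first, the fast user search second.
    stale, fresh = await asyncio.gather(
        loader.load("auto-advance", delay=0.05),
        loader.load("user-search", delay=0.01),
    )
    return stale, fresh, loader.rows
```

Running asyncio.run(demo()) leaves the table showing the user search result even though the auto-advance fetch finishes last, because the older call sees that its generation number is no longer current.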

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-07 11:06:58 +02:00
be0fbb502c fix: raise /api/domains limit cap from 500 to 5000
FastAPI was returning 422 for limit=5000, causing d.results to be
undefined and the table to show 0 results. The new cap matches the
/api/enriched cap.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 09:26:50 +02:00
788252e14f feat: assessed filter, 5000 per-page limit, auto-advance on empty Not-checked page
Assessed/Not assessed filter:
- 'yes' → beauty_lead_quality IS NOT NULL (has been B2B assessed)
- 'no'  → beauty_lead_quality IS NULL (never assessed)
- wired through /api/enriched → get_enriched(beauty_assessed=)
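
A minimal sketch of how the 'yes'/'no' value might map onto the SQL condition, assuming a SQLite table named enriched_domains with a beauty_lead_quality column (both named in this log). The helper names assessed_clause and get_enriched_sketch, and the sample rows, are hypothetical; the real get_enriched() may be structured differently.

```python
import sqlite3


def assessed_clause(beauty_assessed):
    """Map the filter value to a SQL condition; None means no filter."""
    if beauty_assessed == "yes":
        return "beauty_lead_quality IS NOT NULL"
    if beauty_assessed == "no":
        return "beauty_lead_quality IS NULL"
    return None


def get_enriched_sketch(conn, beauty_assessed=None):
    sql = "SELECT domain FROM enriched_domains"
    clause = assessed_clause(beauty_assessed)
    if clause:
        # The clause is one of two fixed strings, never user input.
        sql += " WHERE " + clause
    sql += " ORDER BY domain"
    return [row[0] for row in conn.execute(sql)]


# Illustrative in-memory data; the quality values are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE enriched_domains (domain TEXT, beauty_lead_quality TEXT)"
)
conn.executemany(
    "INSERT INTO enriched_domains VALUES (?, ?)",
    [("a.example", "high"), ("b.example", None), ("c.example", "low")],
)
```

Because the filter value only selects between fixed SQL fragments, nothing from the request is interpolated into the query text.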

Per-page limit:
- options: 100 / 500 / 1000 / 2000 / 5000
- backend cap raised from le=1000 to le=5000

Auto-advance on empty Not-checked page:
- after bulk validate/prescreen, loadDomains reloads the same DuckDB page
- if every domain on that page is now processed (client-side filter → 0 rows)
  but the page still returned results, automatically increment page and retry
- prevents "No domains found" after successfully processing a batch
- capped at page 500 to avoid infinite loop
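
The auto-advance rule above can be sketched as a loop. This is a Python illustration only (the real logic is in the frontend's loadDomains()); fetch_page and is_unprocessed are hypothetical callables standing in for the backend request and the client-side filter.

```python
MAX_PAGE = 500  # cap taken from the commit message


def load_with_auto_advance(fetch_page, is_unprocessed, page=1):
    """Return (page, visible_rows) for the first page with visible rows."""
    while True:
        results = fetch_page(page)
        visible = [d for d in results if is_unprocessed(d)]
        # Advance only when the backend returned rows but the client-side
        # filter removed all of them, and the page cap is not yet reached.
        if visible or not results or page >= MAX_PAGE:
            return page, visible
        page += 1


# Illustrative data: pages 1 and 2 exist, and a, b, c are already processed.
pages = {1: ["a", "b"], 2: ["c", "d"]}
processed = {"a", "b", "c"}
found = load_with_auto_advance(
    lambda p: pages.get(p, []),
    lambda d: d not in processed,
)
```

Here page 1 is skipped because every domain on it is processed, and the call stops at page 2. If every page were fully processed, the loop would stop at page 500 and return an empty list.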

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 09:19:51 +02:00
2f0959b8e8 fix: smart routing in Browse — enrichment filters use /api/enriched, discovery uses /api/domains
Root cause: loadDomains() always hit /api/domains (DuckDB 72M rows) and filtered
niche/site_type/prescreen_status client-side on a random page of 100 domains —
virtually none had been classified, so Live+Beauty+Ecommerce always returned 0.

- loadDomains() now routes to /api/enriched when any enrichment filter is active
  (prescreen_status, niche, site_type, country) — all filters are server-side SQLite
- Falls back to /api/domains only when no enrichment filters are set (discovery mode)
- alpha_only and no_sld supported in both modes:
  - DuckDB: existing regex support
  - SQLite: LIKE patterns (no hyphens/digits) + dot-count (no SLD)
- Add alpha_only/no_sld params to /api/enriched endpoint and get_enriched()
- Fix stale d.classified reference in prescreenOne toast
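
The routing decision can be sketched as a small pure function. This is a Python illustration of logic that actually lives in the frontend; build_url and the way boolean flags are serialized are assumptions, while the endpoint paths and filter names come from this commit.

```python
from urllib.parse import urlencode

# Filters that are only available on the enriched (SQLite) side.
ENRICHMENT_FILTERS = ("prescreen_status", "niche", "site_type", "country")


def build_url(filters):
    """Pick the endpoint and serialize the active filters (hypothetical)."""
    # Drop filters the user left unset.
    active = {
        k: v for k, v in filters.items()
        if v is not None and v != "" and v is not False
    }
    # Any active enrichment filter routes the request to /api/enriched.
    use_enriched = any(k in active for k in ENRICHMENT_FILTERS)
    endpoint = "/api/enriched" if use_enriched else "/api/domains"
    # Checkbox flags such as alpha_only and no_sld are sent as "true".
    params = {k: ("true" if v is True else v) for k, v in active.items()}
    return endpoint + "?" + urlencode(params)
```

For example, build_url({"niche": "beauty", "alpha_only": True}) targets /api/enriched, while a request with only no_sld set stays on /api/domains.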

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 08:53:54 +02:00
daccb99a0c fix: prescreen returns immediately after HTTP check, DeepSeek runs in background
Previously /api/prescreen/batch blocked for 4-10 minutes waiting for Replicate/
DeepSeek, causing browser connection timeout and zero results saved.

- Phase 1 (HTTP check) runs synchronously and saves results immediately
- Phase 2 (DeepSeek classify) fires as asyncio.create_task and runs in background
- Response is returned to client as soon as phase 1 completes (~30-90s)
- Frontend toast shows "classifying N in background" so user knows niche/type
  will appear shortly without waiting
- Each DeepSeek sub-batch saves independently so partial results are preserved

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 08:28:26 +02:00
7ec0304dea feat: add Validate Selected button, Alpha only and No SLD filters to beauty Browse
- /api/validate/batch endpoint: HTTP-check only (no DeepSeek), accepts up to 500 domains
- Validate Selected bulk button: runs validate in 500-domain chunks, shows live/dead summary
- Alpha only checkbox: passes alpha_only=true to /api/domains to exclude domains with hyphens/digits
- No SLD checkbox: passes no_sld=true to /api/domains to skip com.es / co.uk style domains
- Both flags wired into loadDomains() and resetFilters()
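
The 500-domain chunking behind Validate Selected can be sketched like this. It is a Python illustration of frontend behaviour; validate_batch is a hypothetical callable standing in for a POST to /api/validate/batch, and the {"live": n, "dead": n} response shape is an assumption.

```python
def chunks(items, size=500):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def validate_selected(domains, validate_batch, size=500):
    """Send the selection in chunks and add up the live/dead counts."""
    summary = {"live": 0, "dead": 0}
    for chunk in chunks(domains, size):
        result = validate_batch(chunk)   # one request per chunk
        summary["live"] += result["live"]
        summary["dead"] += result["dead"]
    return summary


# Illustrative stub: treat .es domains as live and record chunk sizes.
seen_sizes = []


def fake_validate_batch(chunk):
    seen_sizes.append(len(chunk))
    live = sum(1 for d in chunk if d.endswith(".es"))
    return {"live": live, "dead": len(chunk) - live}


selection = [f"shop{i}.es" for i in range(700)] + [f"old{i}.com" for i in range(500)]
totals = validate_selected(selection, fake_validate_batch)
```

With 1200 selected domains the stub is called three times (500, 500 and 200 domains), which keeps every request within the endpoint's 500-domain limit.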

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 07:59:32 +02:00
ad03107f0d fix: beauty frontend server-side filtering and bulk actions
- add keyword and tld params to get_enriched() in db.py (LIKE on domain + page_title)
- forward keyword/tld through /api/enriched in beauty_main.py
- rewrite beauty/index.html loadDomains() to pass all filters server-side via URLSearchParams
- track domainsTotal from API response for correct pagination display
- add Pre-screen Selected and B2B Assess Selected bulk action buttons
- add per-row Screen and Assess buttons
- goSearch() resets to page 1 before fetching
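
A rough sketch of server-side keyword and tld filtering with parameterized LIKE patterns. The keyword match on domain and page_title follows the commit message; search_enriched, the suffix-based tld match and the sample rows are assumptions, not the actual get_enriched() code.

```python
import sqlite3


def search_enriched(conn, keyword=None, tld=None):
    where, params = [], []
    if keyword:
        # The keyword may appear in either the domain or the page title.
        where.append("(domain LIKE ? OR page_title LIKE ?)")
        params += [f"%{keyword}%", f"%{keyword}%"]
    if tld:
        # Assumed behaviour: match the TLD as a suffix of the domain.
        where.append("domain LIKE ?")
        params.append("%." + tld.lstrip("."))
    sql = "SELECT domain FROM enriched_domains"
    if where:
        sql += " WHERE " + " AND ".join(where)
    sql += " ORDER BY domain"
    return [row[0] for row in conn.execute(sql, params)]


# Illustrative in-memory data.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE enriched_domains (domain TEXT, page_title TEXT)")
db.executemany(
    "INSERT INTO enriched_domains VALUES (?, ?)",
    [
        ("glowshop.es", "Tienda"),
        ("beauty.com", "Glow cosmetics"),
        ("tools.es", "Hardware"),
    ],
)
```

User input only ever reaches the query through bound parameters. SQLite's LIKE is case-insensitive for ASCII by default, so the keyword "glow" also matches the title "Glow cosmetics".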

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 19:44:34 +02:00
a7dd7927b9 feat: BeautyLeads B2B cosmetics frontend on port 7788
New service (app/beauty_main.py) sharing the same /data volume:
- Separate FastAPI app running on port 7788
- beauty_ai.py: brand universe scan (~650 brands), portfolio match
  detection against OUR_BRANDS, Gemini B2B assessment prompt in Spanish
  returning quality/categories/dist_matches/outreach_email
- beauty_queue table + beauty_lead_quality/beauty_assessment columns
  in enriched_domains (with migrations)
- Endpoints: /api/beauty/assess/batch, /api/beauty/leads,
  /api/beauty/status, /api/beauty/export, /api/beauty/reset
- Static frontend: Browse (beauty/ecommerce pre-filtered, no CMS/SSL/KD
  columns), Validator, B2B Pipeline (brand chips, expandable outreach),
  Pre-screen, Export CSV
- docker-compose: second 'beauty' service with shared data volume
- Dockerfile: expose 7788 alongside 6677

Also: add 'error' prescreen_status handling + UI (orange stat box,
filter option) for 4xx/5xx HTTP responses

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 19:31:10 +02:00