DomGod/app/site_analyzer.py at e5abd22d34863b87de0bdb6ad9b50cf8eb5fac02

Malin 3a7ef19746 feat: Cloudflare JS challenge bypass via playwright fallback

When httpx gets a CF challenge page (detected by title + small page size),
site_analyzer retries the fetch with headless Chromium via playwright,
waits 3s for the challenge to resolve, then proceeds with normal extraction.
Tested on productospeluqueriabellezaaura.com.es: it now extracts the real title,
email, phone, and Instagram/Facebook/TikTok links that were previously blocked.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
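
The fallback described in the commit message above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code in site_analyzer.py: the names `looks_like_cf_challenge` and `fetch_html`, the title markers, and the 20 kB size threshold are all hypothetical, since the commit message only says detection uses the title plus a small page size and that the browser waits 3s.

```python
import re

# Hypothetical values: the commit message does not state the real markers or threshold.
CF_TITLE_MARKERS = ("just a moment", "attention required")
CF_MAX_CHALLENGE_BYTES = 20_000

_TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def looks_like_cf_challenge(html: str) -> bool:
    """Heuristic: a challenge page is small and has a telltale title."""
    match = _TITLE_RE.search(html)
    title = match.group(1).strip().lower() if match else ""
    is_small = len(html.encode("utf-8")) < CF_MAX_CHALLENGE_BYTES
    return is_small and any(marker in title for marker in CF_TITLE_MARKERS)


def fetch_html(url: str, wait_ms: int = 3000) -> str:
    """Fetch with httpx first; retry in headless Chromium if challenged."""
    # Imported lazily so the detection helper works without these packages.
    import httpx

    response = httpx.get(url, follow_redirects=True, timeout=15.0)
    html = response.text
    if not looks_like_cf_challenge(html):
        return html

    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(url, wait_until="domcontentloaded")
            # Fixed pause (3s by default) to let the JS challenge resolve.
            page.wait_for_timeout(wait_ms)
            return page.content()
        finally:
            browser.close()
```

A fixed wait is the simplest option and matches the 3s described in the commit. The returned HTML can then go through the same extraction path as a normal httpx response.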

2026-05-13 10:49:45 +02:00

27 KiB
