When httpx receives a Cloudflare challenge page (detected by its title plus a small page size), site_analyzer retries the fetch with headless Chromium via Playwright, waits 3s for the challenge to resolve, then proceeds with normal extraction.

Tested on productospeluqueriabellezaaura.com.es: extraction now returns the real title, email, phone, and Instagram/Facebook/TikTok links that were previously blocked.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
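
The detect-and-retry flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual site_analyzer code: the names `looks_like_cf_challenge`, `fetch_html`, `CHALLENGE_TITLES`, and `MAX_CHALLENGE_BYTES` are hypothetical, and the title strings and the 20,000-byte cutoff are guesses, since the commit does not state them. Only the 3s wait and the httpx-then-Playwright order come from the message.

```python
import re

# Assumed values: the commit does not give the exact titles or size cutoff.
CHALLENGE_TITLES = ("just a moment", "attention required")
MAX_CHALLENGE_BYTES = 20_000

_TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def looks_like_cf_challenge(html: str) -> bool:
    """Heuristic check: a challenge-style <title> on a small page."""
    match = _TITLE_RE.search(html)
    if match is None:
        return False
    title = match.group(1).strip().lower()
    is_small = len(html.encode("utf-8")) <= MAX_CHALLENGE_BYTES
    return is_small and any(marker in title for marker in CHALLENGE_TITLES)


def fetch_html(url: str, wait_ms: int = 3000) -> str:
    """Fetch with httpx first; retry in headless Chromium if challenged."""
    # Imported lazily so the heuristic above works without these packages.
    import httpx

    html = httpx.get(url, follow_redirects=True, timeout=10.0).text
    if not looks_like_cf_challenge(html):
        return html

    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(url)
            # Fixed wait (3s by default) to let the challenge resolve.
            page.wait_for_timeout(wait_ms)
            return page.content()
        finally:
            browser.close()
```

Requiring both signals keeps the browser fallback from firing on a full-size page that merely has a similar title. The fixed wait is the simplest option; a slow challenge could still be unresolved after 3s.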