fix: DeepSeek niche/type not saving to DB
Two bugs:

1. _parse_classify_output stripped the <think> block before searching for
   JSON. DeepSeek-R1 often puts the JSON array inside the think block
   (especially when it "decides" mid-reasoning), so stripping it first
   destroyed the data.
   Fix: search the full output first, then inside <think>, then the
   stripped output: three fallback strategies with info logging at each step.

2. Phase 2 save used a bare UPDATE WHERE domain=?, which silently does
   nothing if the domain row doesn't exist yet in enriched_domains.
   Fix: replace with INSERT ... ON CONFLICT DO UPDATE (true upsert).

Also adds logger.info lines so container logs show the raw DeepSeek output
and the parse result count for easy debugging.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
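The parser change is not part of the diff shown here, so the following is only a rough sketch of the three-fallback idea described in the message, not the actual _parse_classify_output. The names parse_classify_output_sketch and _try_json_array are hypothetical, and the greedy "first [ to last ]" match is an assumption about how a candidate array is located.

```python
import json
import logging
import re

logger = logging.getLogger(__name__)

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
ARRAY_RE = re.compile(r"\[.*\]", re.DOTALL)  # greedy: first "[" to last "]"


def _try_json_array(text):
    """Return a list of dicts parsed from text, or None if that fails."""
    match = ARRAY_RE.search(text)
    if match is None:
        return None
    try:
        value = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    if isinstance(value, list) and all(isinstance(v, dict) for v in value):
        return value
    return None


def parse_classify_output_sketch(raw):
    """Try the full output, then the <think> body, then the stripped output."""
    think = THINK_RE.search(raw)
    candidates = [
        ("full output", raw),
        ("think block", think.group(1) if think else ""),
        ("stripped output", THINK_RE.sub("", raw)),
    ]
    for label, text in candidates:
        items = _try_json_array(text)
        if items is not None:
            logger.info("parsed %d items from %s", len(items), label)
            return items
        logger.info("no usable JSON array in %s", label)
    return []


# JSON only inside the think block: the first strategy already finds it.
in_think = '<think>I decide: [{"domain": "a.com", "niche": "tech", "type": "blog"}]</think>Done.'
# Stray brackets in the think block: only the stripped output parses cleanly.
after_think = '<think>see note [1] first</think>[{"domain": "b.com", "niche": "news", "type": "portal"}]'

result_in_think = parse_classify_output_sketch(in_think)
result_after_think = parse_classify_output_sketch(after_think)
result_none = parse_classify_output_sketch("no json here")
```

The order matters in this sketch because each candidate is parsed as a single greedy span: brackets elsewhere in the text can make the wider span invalid, and a narrower candidate may still parse.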
10
app/db.py
@@ -436,10 +436,14 @@ async def save_prescreen_results(results: list[dict]):
 niche = r.get("niche")
 site_type = r.get("type")  # DeepSeek returns "type" key
 if niche or site_type:
-    # Classification-only update (domain row must already exist)
+    # Upsert niche/type — works even if the row was never enriched
     await db.execute(
-        "UPDATE enriched_domains SET niche=?, site_type=? WHERE domain=?",
-        (niche, site_type, domain),
+        """INSERT INTO enriched_domains (domain, niche, site_type)
+           VALUES (?, ?, ?)
+           ON CONFLICT(domain) DO UPDATE SET
+               niche=excluded.niche,
+               site_type=excluded.site_type""",
+        (domain, niche, site_type),
     )
 else:
     # Prescreen status upsert — create row if it doesn't exist yet
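To illustrate why the bare UPDATE lost data and the upsert does not, here is a small standalone sketch using the standard-library sqlite3 module and an in-memory database. The three-column schema is an assumption made for the demo (the real enriched_domains table presumably has more columns), and the upsert relies on domain having a PRIMARY KEY or UNIQUE constraint and on SQLite 3.24 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed minimal schema; ON CONFLICT(domain) needs a unique key on domain.
conn.execute(
    "CREATE TABLE enriched_domains ("
    "domain TEXT PRIMARY KEY, niche TEXT, site_type TEXT)"
)

# Old behaviour: UPDATE on a missing row matches nothing and raises no error.
cur = conn.execute(
    "UPDATE enriched_domains SET niche=?, site_type=? WHERE domain=?",
    ("tech", "blog", "example.com"),
)
update_rowcount = cur.rowcount

UPSERT = """INSERT INTO enriched_domains (domain, niche, site_type)
            VALUES (?, ?, ?)
            ON CONFLICT(domain) DO UPDATE SET
                niche=excluded.niche,
                site_type=excluded.site_type"""

# New behaviour: the first call inserts the row, the second updates it in place.
conn.execute(UPSERT, ("example.com", "tech", "blog"))
conn.execute(UPSERT, ("example.com", "news", "portal"))

rows = conn.execute(
    "SELECT domain, niche, site_type FROM enriched_domains"
).fetchall()
conn.close()
```

In this demo the UPDATE reports zero affected rows, while the two upsert calls leave exactly one row holding the most recent niche and site_type.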