MTZ changed their Excel format: some rows now have fewer than 10 columns,
causing an IndexError on row[9] (brand, col J). Use conditional indexing for
cols I (8), J (9), K (10) so that short rows parse with empty
brand/type/min_box instead of crashing on startup.
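
A minimal sketch of the conditional indexing, not the actual patch. The
helper names (cell, parse_optional_fields) are hypothetical, and only brand
is mapped to a named column here because col J is the only mapping stated
above; cols I and K are returned under placeholder keys.

```python
# Zero-based column indexes for Excel columns I, J and K.
COL_I, COL_J, COL_K = 8, 9, 10


def cell(row, index, default=""):
    """Return row[index] as a stripped string, or default if the row is
    too short or the cell is empty (None)."""
    if index < len(row) and row[index] is not None:
        return str(row[index]).strip()
    return default


def parse_optional_fields(row):
    """Read the optional trailing columns without raising IndexError."""
    return {
        "col_i": cell(row, COL_I),   # placeholder key (assumption)
        "brand": cell(row, COL_J),   # brand is col J per the message above
        "col_k": cell(row, COL_K),   # placeholder key (assumption)
    }
```

A row with only eight cells then yields empty strings for all three fields
rather than failing.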
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Skip products where description contains '*' after the (NNNNN) code
(e.g. "France*") — MTZ's exclusion marker
- Add 'type' field (col G, numeric prefix stripped) to API response
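
A rough sketch of both changes, under stated assumptions: the code is taken
to be a run of digits in parentheses, and the numeric prefix on the type
value is taken to be leading digits followed by optional separators. The
function names is_excluded and strip_numeric_prefix are hypothetical.

```python
import re

# "(NNNNN)" product code: digits inside parentheses (assumed shape).
CODE_RE = re.compile(r"\(\d+\)")
# Leading digits plus optional separators, e.g. "03 ", "12-", "7." (assumed).
NUMERIC_PREFIX_RE = re.compile(r"^\s*\d+[\s.\-_]*")


def is_excluded(description):
    """True if a '*' appears anywhere after the (NNNNN) code."""
    match = CODE_RE.search(description)
    return match is not None and "*" in description[match.end():]


def strip_numeric_prefix(value):
    """Drop a leading numeric prefix from the type cell."""
    return NUMERIC_PREFIX_RE.sub("", str(value)).strip()
```

An asterisk that appears before the code, or in a description without a
code, does not count as an exclusion marker in this sketch.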
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Removes trailing " (NNNNN) - Country - Npcs ByBox" pattern from
product descriptions before serving to BC.
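
One way the suffix removal could look, as a sketch rather than the code in
this commit. The regular expression is an assumption derived from the
pattern quoted above, and clean_description is a hypothetical name.

```python
import re

# Trailing " (NNNNN) - Country - Npcs ByBox", anchored at the end of the string.
SUFFIX_RE = re.compile(
    r"\s*\(\d+\)\s*-\s*.+?\s*-\s*\d+\s*pcs\s+ByBox\s*$",
    re.IGNORECASE,
)


def clean_description(description):
    """Return the description without the trailing code/country/box suffix."""
    return SUFFIX_RE.sub("", description).strip()
```

Descriptions that do not end with the pattern pass through unchanged apart
from whitespace trimming.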
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The health and refresh responses now include product_count, rows_processed,
and rows_skipped. This helps diagnose whether the product cap comes from EAN
filtering or from a downstream limit.
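
A sketch of how the three counters could be collected while parsing. The
skip rule (missing or blank EAN), the EAN column index, and the name
collect_products are assumptions for illustration; rows_processed is assumed
to count every row examined, including skipped ones.

```python
def collect_products(rows, ean_index=0):
    """Build the product list and the counters reported alongside it."""
    products = []
    rows_processed = 0
    rows_skipped = 0
    for row in rows:
        rows_processed += 1
        ean = row[ean_index] if ean_index < len(row) else None
        if ean is None or not str(ean).strip():
            rows_skipped += 1  # filtered out: no usable EAN (assumed rule)
            continue
        products.append({"ean": str(ean).strip()})
    stats = {
        "product_count": len(products),
        "rows_processed": rows_processed,
        "rows_skipped": rows_skipped,
    }
    return products, stats
```

If rows_processed itself stays near the cap, the limit is upstream of the
EAN filter; if rows_skipped is large, the filter is the cause.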
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
In read_only mode, openpyxl stops iterating at the sheet's cached
<dimension ref> attribute in the XML. If MTZ extended the Excel file beyond
the original row range, the extra rows were silently ignored, which is why the
count was always ~4000 products regardless of the real number. Removing
read_only=True forces openpyxl to read all actual data rows. The file is
already in a BytesIO buffer, so there is no extra I/O cost.
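
A minimal sketch of loading the workbook from memory without read_only,
assuming the downloaded file is available as bytes. read_all_rows is a
hypothetical helper name; load_workbook and iter_rows are the openpyxl calls.

```python
from io import BytesIO

from openpyxl import load_workbook


def read_all_rows(xlsx_bytes):
    """Load the workbook fully (no read_only=True) and return every row
    of the active sheet as a tuple of cell values."""
    workbook = load_workbook(BytesIO(xlsx_bytes))
    sheet = workbook.active
    return list(sheet.iter_rows(values_only=True))
```

The full load keeps the whole sheet in memory, which is the trade-off for
not depending on the cached dimension.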
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>