Port TEA commands into workflows and preload Murat knowledge (#660)

* Port TEA commands into workflows and preload Murat knowledge

* Broke the giant knowledge dump into curated fragments under src/modules/bmm/testarch/knowledge/

* Updated the web bundles for TEA, and made spot updates for analyst and SM

* Replaced the old TEA brief with an indexed knowledge system: the agent now loads topic-specific
  docs from knowledge/ via tea-index.csv, workflows reference those fragments, and risk/level/
  priority guidance lives in the new fragment files

---------

Co-authored-by: Murat Ozcan <murat@mac.lan>
Murat K Ozcan
2025-09-30 15:19:55 -05:00
committed by GitHub
parent 30fb0e67e1
commit df0c3e4bae
51 changed files with 1139 additions and 901 deletions

View File

@@ -18,10 +18,8 @@ last-redoc-date: 2025-09-30
- Architect `*solution-architecture`
2. Confirm `bmad/bmm/config.yaml` defines `project_name`, `output_folder`, `dev_story_location`, and language settings.
3. Ensure a test framework setup exists; if not, use the `*framework` command to create one prior to development.
4. Skim supporting references under `./testarch/`:
- `tea-knowledge.md`
- `test-levels-framework.md`
- `test-priorities-matrix.md`
4. Skim supporting references (knowledge under `testarch/`, command workflows under `workflows/testarch/`).
- `tea-index.csv` + `knowledge/*.md`
## High-Level Cheat Sheets
@@ -125,31 +123,40 @@ last-redoc-date: 2025-09-30
## Command Catalog
| Command | Task File | Primary Outputs | Notes |
| -------------- | -------------------------------- | -------------------------------------------------------------------- | ------------------------------------------------ |
| `*framework` | `testarch/framework.md` | Playwright/Cypress scaffold, `.env.example`, `.nvmrc`, sample specs | Use when no production-ready harness exists |
| `*atdd` | `testarch/atdd.md` | Failing Acceptance-Test Driven Development, implementation checklist | Requires approved story + harness |
| `*automate` | `testarch/automate.md` | Prioritized specs, fixtures, README/script updates, DoD summary | Avoid duplicate coverage (see priority matrix) |
| `*ci` | `testarch/ci.md` | CI workflow, selective test scripts, secrets checklist | Platform-aware (GitHub Actions default) |
| `*test-design` | `testarch/test-design.md` | Combined risk assessment, mitigation plan, and coverage strategy | Handles risk scoring and test design in one pass |
| `*trace` | `testarch/trace-requirements.md` | Coverage matrix, recommendations, gate snippet | Requires access to story/tests repositories |
| `*nfr-assess` | `testarch/nfr-assess.md` | NFR assessment report with actions | Focus on security/performance/reliability |
| `*gate` | `testarch/gate.md` | Gate YAML + summary (PASS/CONCERNS/FAIL/WAIVED) | Deterministic decision rules + rationale |
| Command | Task File | Primary Outputs | Notes |
| -------------- | ------------------------------------------------ | ------------------------------------------------------------------- | ------------------------------------------------ |
| `*framework` | `workflows/testarch/framework/instructions.md` | Playwright/Cypress scaffold, `.env.example`, `.nvmrc`, sample specs | Use when no production-ready harness exists |
| `*atdd` | `workflows/testarch/atdd/instructions.md` | Failing acceptance tests + implementation checklist | Requires approved story + harness |
| `*automate` | `workflows/testarch/automate/instructions.md` | Prioritized specs, fixtures, README/script updates, DoD summary | Avoid duplicate coverage (see priority matrix) |
| `*ci` | `workflows/testarch/ci/instructions.md` | CI workflow, selective test scripts, secrets checklist | Platform-aware (GitHub Actions default) |
| `*test-design` | `workflows/testarch/test-design/instructions.md` | Combined risk assessment, mitigation plan, and coverage strategy | Handles risk scoring and test design in one pass |
| `*trace` | `workflows/testarch/trace/instructions.md` | Coverage matrix, recommendations, gate snippet | Requires access to story/tests repositories |
| `*nfr-assess` | `workflows/testarch/nfr-assess/instructions.md` | NFR assessment report with actions | Focus on security/performance/reliability |
| `*gate` | `workflows/testarch/gate/instructions.md` | Gate YAML + summary (PASS/CONCERNS/FAIL/WAIVED) | Deterministic decision rules + rationale |
<details>
<summary>Command Guidance and Context Loading</summary>
- Each task reads one row from `tea-commands.csv` via `command_key`, expanding pipe-delimited (`|`) values into checklists.
- Keep CSV rows lightweight; place in-depth heuristics in `tea-knowledge.md` and reference via `knowledge_tags`.
- If the CSV grows substantially, consider splitting into scoped registries (e.g., planning vs execution) or upgrading to Markdown tables for humans.
- `tea-knowledge.md` encapsulates Murat's philosophy—update both CSV and knowledge file together to avoid drift.
- Each task now carries its own preflight/flow/deliverable guidance inline.
- `tea-index.csv` maps workflow needs to knowledge fragments; keep tags accurate as you add guidance.
- Consider future modularization into orchestrated workflows if additional automation is needed.
- Update the fragment markdown files alongside workflow edits so guidance and outputs stay in sync.
</details>
## Workflow Placement
The TEA stack has three tightly linked layers:
1. **Agent spec (`agents/tea.md`)** declares the persona, critical actions, and the `run-workflow` entries for every TEA command. Critical actions instruct the agent to load `tea-index.csv` and then fetch only the fragments it needs from `knowledge/` before giving guidance.
2. **Knowledge index (`tea-index.csv`)** catalogues each fragment with tags and file paths. Workflows call out the IDs they need (e.g., `risk-governance`, `fixture-architecture`) so the agent loads targeted guidance instead of a monolithic brief.
3. **Workflows (`workflows/testarch/*`)** contain the task flows and reference `tea-index.csv` in their `<flow>`/`<notes>` sections to request specific fragments. Keeping all workflows in this directory ensures consistent discovery during planning (`*framework`), implementation (`*atdd`, `*automate`, `*trace`), and release (`*nfr-assess`, `*gate`).
This separation lets us expand the knowledge base without touching agent wiring and keeps every command remote-controllable via the standard BMAD workflow runner. As navigation improves, we can add lightweight entrypoints or tags in the index without changing where workflows live.
## Appendix
- **Supporting Knowledge:**
- `tea-knowledge.md` – Murat's testing philosophy, heuristics, and risk scales.
- `test-levels-framework.md` – Decision matrix for unit/integration/E2E selection.
- `test-priorities-matrix.md` – Priority (P0–P3) criteria and target coverage percentages.
- `tea-index.csv` – Catalog of knowledge fragments with tags and file paths under `knowledge/` for task-specific loading.
- `knowledge/*.md` – Focused summaries (fixtures, network, CI, levels, priorities, etc.) distilled from Murat's external resources.
- `test-resources-for-ai-flat.txt` – Raw 347KB archive retained for manual deep dives when a fragment needs source validation.

View File

@@ -1,40 +0,0 @@
<!-- Powered by BMAD-CORE™ -->
# Acceptance TDD v2.0 (Slim)
```xml
<task id="bmad/bmm/testarch/tdd" name="Acceptance Test Driven Development">
<llm critical="true">
<i>Set command_key="*tdd"</i>
<i>Load {project-root}/bmad/bmm/testarch/tea-commands.csv and parse the row where command equals command_key</i>
<i>Load {project-root}/bmad/bmm/testarch/tea-knowledge.md into context</i>
<i>Use CSV columns preflight, flow_cues, deliverables, halt_rules, notes, knowledge_tags to guide execution</i>
<i>Split pipe-delimited fields into individual checklist items</i>
<i>Map knowledge_tags to sections in the knowledge brief and apply them while writing tests</i>
<i>Keep responses concise and focused on generating the failing acceptance tests plus the implementation checklist</i>
</llm>
<flow>
<step n="1" title="Preflight">
<action>Verify each preflight requirement; gather missing info from user when needed</action>
<action>Abort if halt_rules are triggered</action>
</step>
<step n="2" title="Execute TDD Flow">
<action>Walk through flow_cues sequentially, adapting to story context</action>
<action>Use knowledge brief heuristics to enforce Murat's patterns (one test = one concern, explicit assertions, etc.)</action>
</step>
<step n="3" title="Deliverables">
<action>Produce artifacts described in deliverables</action>
<action>Summarize failing tests and checklist items for the developer</action>
</step>
</flow>
<halt>
<i>Apply halt_rules from the CSV row exactly</i>
</halt>
<notes>
<i>Use the notes column for additional constraints or reminders</i>
</notes>
<output>
<i>Failing acceptance test files + implementation checklist summary</i>
</output>
</task>
```

View File

@@ -1,38 +0,0 @@
<!-- Powered by BMAD-CORE™ -->
# Automation Expansion v2.0 (Slim)
```xml
<task id="bmad/bmm/testarch/automate" name="Automation Expansion">
<llm critical="true">
<i>Set command_key="*automate"</i>
<i>Load {project-root}/bmad/bmm/testarch/tea-commands.csv and read the row where command equals command_key</i>
<i>Load {project-root}/bmad/bmm/testarch/tea-knowledge.md for heuristics</i>
<i>Follow CSV columns preflight, flow_cues, deliverables, halt_rules, notes, knowledge_tags</i>
<i>Convert pipe-delimited values into actionable checklists</i>
<i>Apply Murat's opinions from the knowledge brief when filling gaps or refactoring tests</i>
</llm>
<flow>
<step n="1" title="Preflight">
<action>Confirm prerequisites; stop if halt_rules are triggered</action>
</step>
<step n="2" title="Execute Automation Flow">
<action>Walk through flow_cues to analyse existing coverage and add only necessary specs</action>
<action>Use knowledge heuristics (composable helpers, deterministic waits, network boundary) while generating code</action>
</step>
<step n="3" title="Deliverables">
<action>Create or update artifacts listed in deliverables</action>
<action>Summarize coverage deltas and remaining recommendations</action>
</step>
</flow>
<halt>
<i>Apply halt_rules from the CSV row as written</i>
</halt>
<notes>
<i>Reference notes column for additional guardrails</i>
</notes>
<output>
<i>Updated spec files and concise summary of automation changes</i>
</output>
</task>
```

View File

@@ -1,39 +0,0 @@
<!-- Powered by BMAD-CORE™ -->
# CI/CD Enablement v2.0 (Slim)
```xml
<task id="bmad/bmm/testarch/ci" name="CI/CD Enablement">
<llm critical="true">
<i>Set command_key="*ci"</i>
<i>Load {project-root}/bmad/bmm/testarch/tea-commands.csv and read the row where command equals command_key</i>
<i>Load {project-root}/bmad/bmm/testarch/tea-knowledge.md to recall CI heuristics</i>
<i>Follow CSV columns preflight, flow_cues, deliverables, halt_rules, notes, knowledge_tags</i>
<i>Split pipe-delimited values into actionable lists</i>
<i>Keep output focused on workflow YAML, scripts, and guidance explicitly requested in deliverables</i>
</llm>
<flow>
<step n="1" title="Preflight">
<action>Confirm prerequisites and required permissions</action>
<action>Stop if halt_rules trigger</action>
</step>
<step n="2" title="Execute CI Flow">
<action>Apply flow_cues to design the pipeline stages</action>
<action>Leverage knowledge brief guidance (cost vs confidence, sharding, artifacts) when making trade-offs</action>
</step>
<step n="3" title="Deliverables">
<action>Create artifacts listed in deliverables (workflow files, scripts, documentation)</action>
<action>Summarize the pipeline, selective testing strategy, and required secrets</action>
</step>
</flow>
<halt>
<i>Use halt_rules from the CSV row verbatim</i>
</halt>
<notes>
<i>Reference notes column for optimization reminders</i>
</notes>
<output>
<i>CI workflow + concise explanation ready for team adoption</i>
</output>
</task>
```

View File

@@ -1,41 +0,0 @@
<!-- Powered by BMAD-CORE™ -->
# Test Framework Setup v2.0 (Slim)
```xml
<task id="bmad/bmm/testarch/framework" name="Test Framework Setup">
<llm critical="true">
<i>Set command_key="*framework"</i>
<i>Load {project-root}/bmad/bmm/testarch/tea-commands.csv and parse the row where command equals command_key</i>
<i>Load {project-root}/bmad/bmm/testarch/tea-knowledge.md into internal memory</i>
<i>Use the CSV columns preflight, flow_cues, deliverables, halt_rules, notes, knowledge_tags to guide behaviour</i>
<i>Split pipe-delimited values (|) into individual checklist items</i>
<i>Map knowledge_tags to matching sections in the knowledge brief and apply those heuristics throughout execution</i>
<i>DO NOT expand beyond the guidance unless the user supplies extra context; keep instructions lean and adaptive</i>
</llm>
<flow>
<step n="1" title="Run Preflight Checks">
<action>Evaluate each item in preflight; confirm or collect missing information</action>
<action>If any preflight requirement fails, follow halt_rules and stop</action>
</step>
<step n="2" title="Execute Framework Flow">
<action>Follow flow_cues sequence, adapting to the project's stack</action>
<action>When deciding frameworks or patterns, apply relevant heuristics from tea-knowledge.md via knowledge_tags</action>
<action>Keep generated assets minimal—only what the CSV specifies</action>
</step>
<step n="3" title="Finalize Deliverables">
<action>Create artifacts listed in deliverables</action>
<action>Capture a concise summary for the user explaining what was scaffolded</action>
</step>
</flow>
<halt>
<i>Follow halt_rules from the CSV row verbatim</i>
</halt>
<notes>
<i>Use notes column for additional guardrails while executing</i>
</notes>
<output>
<i>Deliverables and summary specified in the CSV row</i>
</output>
</task>
```

View File

@@ -1,38 +0,0 @@
<!-- Powered by BMAD-CORE™ -->
# Quality Gate v2.0 (Slim)
```xml
<task id="bmad/bmm/testarch/tea-gate" name="Quality Gate">
<llm critical="true">
<i>Set command_key="*gate"</i>
<i>Load {project-root}/bmad/bmm/testarch/tea-commands.csv and read the matching row</i>
<i>Load {project-root}/bmad/bmm/testarch/tea-knowledge.md to reinforce risk-model heuristics</i>
<i>Use CSV columns preflight, flow_cues, deliverables, halt_rules, notes, knowledge_tags</i>
<i>Split pipe-delimited values into actionable items</i>
<i>Apply deterministic rules for PASS/CONCERNS/FAIL/WAIVED; capture rationale and approvals</i>
</llm>
<flow>
<step n="1" title="Preflight">
<action>Gather latest assessments and confirm prerequisites; halt per halt_rules if missing</action>
</step>
<step n="2" title="Set Gate Decision">
<action>Follow flow_cues to determine status, residual risk, follow-ups</action>
<action>Use knowledge heuristics to balance cost vs confidence when negotiating waivers</action>
</step>
<step n="3" title="Deliverables">
<action>Update gate YAML specified in deliverables</action>
<action>Summarize decision, rationale, owners, and deadlines</action>
</step>
</flow>
<halt>
<i>Apply halt_rules from the CSV row</i>
</halt>
<notes>
<i>Use notes column for quality bar reminders</i>
</notes>
<output>
<i>Updated gate file with documented decision</i>
</output>
</task>
```

View File

@@ -0,0 +1,9 @@
# CI Pipeline and Burn-In Strategy
- Stage jobs: install/caching once, run `test-changed` for quick feedback, then shard full suites with `fail-fast: false` so evidence isn't lost.
- Re-run changed specs 5–10x (burn-in) before merging to flush flakes; fail the pipeline on the first inconsistent run.
- Upload artifacts on failure (videos, traces, HAR) and keep retry counts explicit—hidden retries hide instability.
- Use `wait-on` for app startup, enforce time budgets (<10 min per job), and document required secrets alongside workflows.
- Mirror CI scripts locally (`npm run test:ci`, `scripts/burn-in-changed.sh`) so devs reproduce pipeline behaviour exactly.
_Source: Murat CI/CD strategy blog, Playwright/Cypress workflow examples._
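
A minimal burn-in sketch as a Node script, assuming Playwright specs and an `origin/main` comparison branch (the script name and repeat count are illustrative):

```typescript
// scripts/burn-in-changed.ts (hypothetical): rerun changed specs to flush flakes
import { execSync } from 'node:child_process';

const changed = execSync('git diff --name-only origin/main...HEAD')
  .toString()
  .split('\n')
  .filter((file) => /\.spec\.ts$/.test(file));

if (changed.length === 0) process.exit(0);

for (let run = 1; run <= 5; run++) {
  console.log(`burn-in run ${run}/5`);
  // execSync throws on a non-zero exit code, so the pipeline
  // fails on the first inconsistent run
  execSync(`npx playwright test ${changed.join(' ')}`, { stdio: 'inherit' });
}
```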

View File

@@ -0,0 +1,9 @@
# Component Test-Driven Development Loop
- Start every UI change with a failing component spec (`cy.mount` or RTL `render`); ship only after red → green → refactor passes.
- Recreate providers/stores per spec to prevent state bleed and keep parallel runs deterministic.
- Use factories to exercise prop/state permutations; cover accessibility by asserting against roles, labels, and keyboard flows.
- Keep component specs under ~100 lines: split by intent (rendering, state transitions, error messaging) to preserve clarity.
- Pair component tests with visual debugging (Cypress runner, Storybook, Playwright trace viewer) to accelerate diagnosis.
_Source: CCTDD repository, Murat component testing talks._
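
A sketch of the failing-first loop with React Testing Library (component name and copy are hypothetical; Jest-style globals and `@testing-library/jest-dom` matchers assumed):

```typescript
// LoginForm.spec.tsx (hypothetical): written before the component handles validation
import { render, screen } from '@testing-library/react';
import userEvent from '@testing-library/user-event';
import { LoginForm } from './LoginForm';

test('shows a validation error when email is empty', async () => {
  render(<LoginForm />); // fresh render per spec, no state bleed
  await userEvent.click(screen.getByRole('button', { name: /sign in/i }));
  // assert via role so accessibility is covered alongside behaviour
  expect(screen.getByRole('alert')).toHaveTextContent(/email is required/i);
});
```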

View File

@@ -0,0 +1,9 @@
# Contract Testing Essentials (Pact)
- Store consumer contracts beside the integration specs that generate them; version contracts semantically and publish on every CI run.
- Require provider verification before merge; failed verification blocks release and surfaces breaking changes immediately.
- Capture fallback behaviour inside interactions (timeouts, retries, error payloads) so resilience guarantees remain explicit.
- Automate broker housekeeping: tag releases, archive superseded contracts, and expire unused pacts to reduce noise.
- Pair contract suites with API smoke or component tests to validate data mapping and UI rendering in tandem.
_Source: Pact consumer/provider sample repos, Murat contract testing blog._
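
A consumer-side sketch with pact-js's `PactV3` API (consumer/provider names, the provider state, and the endpoint are illustrative; Jest-style `it`/`expect` globals assumed):

```typescript
import { PactV3, MatchersV3 } from '@pact-foundation/pact';

const provider = new PactV3({ consumer: 'web-app', provider: 'users-api' });

it('fetches a user by id', () => {
  provider
    .given('user 1 exists') // provider state, verified on the provider side
    .uponReceiving('a request for user 1')
    .withRequest({ method: 'GET', path: '/users/1' })
    .willRespondWith({
      status: 200,
      body: MatchersV3.like({ id: 1, email: 'user@example.com' }),
    });

  // runs the interaction against a local mock server and records the contract
  return provider.executeTest(async (mockServer) => {
    const res = await fetch(`${mockServer.url}/users/1`);
    expect(res.status).toBe(200);
  });
});
```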

View File

@@ -0,0 +1,9 @@
# Data Factories and API-First Setup
- Prefer factory functions that accept overrides and return complete objects (`createUser(overrides)`)—never rely on static fixtures.
- Seed state through APIs, tasks, or direct DB helpers before visiting the UI; UI-based setup is for validation only.
- Ensure factories generate parallel-safe identifiers (UUIDs, timestamps) and perform cleanup after each test.
- Centralize factory exports to avoid duplication; version them alongside schema changes to catch drift in reviews.
- When working with shared environments, layer feature toggles or targeted cleanup so factories do not clobber concurrent runs.
_Source: Murat Testing Philosophy, blog posts on functional helpers and API-first testing._
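
A sketch of the override pattern (the `User` shape is illustrative):

```typescript
import { faker } from '@faker-js/faker';

type User = { id: string; email: string; role: 'admin' | 'member' };

export const createUser = (overrides: Partial<User> = {}): User => ({
  id: faker.string.uuid(),       // parallel-safe: unique per call
  email: faker.internet.email(),
  role: 'member',
  ...overrides,                  // tests override only what they assert on
});

// usage: const admin = createUser({ role: 'admin' });
```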

View File

@@ -0,0 +1,9 @@
# Email-Based Authentication Testing
- Use services like Mailosaur or in-house SMTP capture; extract magic links via regex or HTML parsing helpers.
- Preserve browser storage (local/session) when processing links—restore state before visiting the authenticated page.
- Cache email payloads with `cypress-data-session` or equivalent so retries don't exhaust inbox quotas.
- Cover negative cases: expired links, reused links, and multiple requests in rapid succession.
- Ensure the workflow logs the email ID and link for troubleshooting, but scrub PII before committing artifacts.
_Source: Email authentication blog, Murat testing toolkit._
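
A sketch using the Mailosaur Node client inside an async test body (server id, inbox address, and link position are assumptions):

```typescript
import MailosaurClient from 'mailosaur';

const client = new MailosaurClient(process.env.MAILOSAUR_API_KEY!);
const serverId = process.env.MAILOSAUR_SERVER_ID!;
const inbox = `magic-link-test@${serverId}.mailosaur.net`; // hypothetical inbox

// messages.get waits for a matching message to arrive, then returns it
const message = await client.messages.get(serverId, { sentTo: inbox });
const magicLink = message.html?.links?.[0]?.href; // assumes the first link is the magic link
```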

View File

@@ -0,0 +1,9 @@
# Error Handling and Resilience Checks
- Treat expected failures explicitly: intercept network errors and assert UI fallbacks (`error-message` visible, retries triggered).
- In Cypress, use scoped `Cypress.on('uncaught:exception')` to ignore known errors; rethrow anything else so regressions fail.
- In Playwright, hook `page.on('pageerror')` and only swallow the specific, documented error messages.
- Test retry/backoff logic by forcing sequential failures (e.g., 500, timeout, success) and asserting telemetry gets recorded.
- Log captured errors with context (request payload, user/session) but redact secrets to keep artifacts safe for sharing.
_Source: Murat error-handling patterns, Pact resilience guidance._
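
A Cypress-side sketch of the scoped handler (the ignored message is only an example of a known, documented error):

```typescript
// in the affected spec file, not global support code: keep the scope narrow
Cypress.on('uncaught:exception', (err) => {
  if (err.message.includes('ResizeObserver loop limit exceeded')) {
    return false; // documented noise: returning false keeps the test green
  }
  // any other error falls through and fails the test, as it should
});
```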

View File

@@ -0,0 +1,9 @@
# Feature Flag Governance
- Centralize flag definitions in a frozen enum; expose helpers to set, clear, and target specific audiences.
- Test both enabled and disabled states in CI; clean up targeting after each spec to keep shared environments stable.
- For LaunchDarkly-style systems, script API helpers to seed variations instead of mutating via UI.
- Maintain a checklist for new flags: default state, owners, expiry date, telemetry, rollback plan.
- Document flag dependencies in story/PR templates so QA and release reviews know which toggles must flip before launch.
_Source: LaunchDarkly strategy blog, Murat test architecture notes._
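
A sketch of the frozen enum plus a targeting helper (the flag key, endpoint, and `setFlagForUser` helper are hypothetical; a real implementation would call the flag service's management API):

```typescript
export const FLAGS = Object.freeze({
  NEW_CHECKOUT: 'new-checkout', // hypothetical flag key
} as const);

type FlagKey = (typeof FLAGS)[keyof typeof FLAGS];

// seed a variation through the management API instead of mutating via UI
export async function setFlagForUser(flag: FlagKey, userKey: string, enabled: boolean): Promise<void> {
  await fetch(`https://flags.example.com/api/flags/${flag}/targeting`, {
    method: 'PATCH',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ userKey, enabled }),
  });
}

// cleanup after each spec keeps shared environments stable, e.g.:
// afterEach(() => setFlagForUser(FLAGS.NEW_CHECKOUT, userKey, false));
```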

View File

@@ -0,0 +1,9 @@
# Fixture Architecture Playbook
- Build helpers as pure functions first, then expose them via Playwright `extend` or Cypress commands so logic stays testable in isolation.
- Compose capabilities with `mergeTests` (Playwright) or layered Cypress commands instead of inheritance; each fixture should solve one concern (auth, api, logs, network).
- Keep HTTP helpers framework agnostic—accept all required params explicitly and return results so unit tests and runtime fixtures can share them.
- Export fixtures through package subpaths (`"./api-request"`, `"./api-request/fixtures"`) to make reuse trivial across suites and projects.
- Treat fixture files as infrastructure: document dependencies, enforce deterministic timeouts, and ban hidden retries that mask flakiness.
_Source: Murat Testing Philosophy, cy-vs-pw comparison, SEON production patterns._

View File

@@ -0,0 +1,9 @@
# Network-First Safeguards
- Register interceptions before any navigation or user action; store the promise and await it immediately after the triggering step.
- Assert on structured responses (status, body schema, headers) instead of generic waits so failures surface with actionable context.
- Capture HAR files or Playwright traces on successful runs—reuse them for deterministic CI playback when upstream services flake.
- Prefer edge mocking: stub at service boundaries, never deep within the stack unless risk analysis demands it.
- Replace implicit waits with deterministic signals like `waitForResponse`, disappearance of spinners, or event hooks.
_Source: Murat Testing Philosophy, Playwright patterns book, blog on network interception._
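
A Playwright sketch of intercept-before-navigate (the route and response shape are illustrative):

```typescript
import { test, expect } from '@playwright/test';

test('dashboard loads its users', async ({ page }) => {
  // register BEFORE navigation so the response cannot be missed
  const usersCall = page.waitForResponse(
    (res) => res.url().includes('/api/users') && res.status() === 200,
  );
  await page.goto('/dashboard');
  const response = await usersCall; // deterministic wait, no fixed sleeps

  const body = await response.json();
  expect(Array.isArray(body.users)).toBe(true); // assert on structure, not timing
});
```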

View File

@@ -0,0 +1,21 @@
# Non-Functional Review Criteria
- **Security**
  - PASS: auth/authz, secret handling, and threat mitigations in place.
  - CONCERNS: minor gaps with clear owners.
  - FAIL: critical exposure or missing controls.
- **Performance**
  - PASS: metrics meet targets with profiling evidence.
  - CONCERNS: trending toward limits or missing baselines.
  - FAIL: breaches SLO/SLA or introduces resource leaks.
- **Reliability**
  - PASS: error handling, retries, health checks verified.
  - CONCERNS: partial coverage or missing telemetry.
  - FAIL: no recovery path or crash scenarios unresolved.
- **Maintainability**
  - PASS: clean code, tests, and documentation shipped together.
  - CONCERNS: duplication, low coverage, or unclear ownership.
  - FAIL: absent tests, tangled implementations, or no observability.
- Default to CONCERNS when targets or evidence are undefined—force the team to clarify before sign-off.
_Source: Murat NFR assessment guidance._

View File

@@ -0,0 +1,9 @@
# Playwright Configuration Guardrails
- Load environment configs via a central map (`envConfigMap`) and fail fast when `TEST_ENV` is missing or unsupported.
- Standardize timeouts: action 15s, navigation 30s, expect 10s, test 60s; expose overrides through fixtures rather than inline literals.
- Emit HTML + JUnit reporters, disable auto-open, and store artifacts under `test-results/` for CI upload.
- Keep `.env.example`, `.nvmrc`, and browser dependencies versioned so local and CI runs stay aligned.
- Use global setup for shared auth tokens or seeding, but prefer per-test fixtures for anything mutable to avoid cross-test leakage.
_Source: Playwright book repo, SEON configuration example._
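
The timeout and reporter standards above, as a `playwright.config.ts` sketch (values copied from the guardrails; the `envConfigMap` lookup is elided):

```typescript
import { defineConfig } from '@playwright/test';

export default defineConfig({
  timeout: 60_000,              // test timeout
  expect: { timeout: 10_000 },  // expect timeout
  use: {
    actionTimeout: 15_000,
    navigationTimeout: 30_000,
    screenshot: 'only-on-failure',
    video: 'retain-on-failure',
  },
  reporter: [
    ['html', { open: 'never' }], // never auto-open
    ['junit', { outputFile: 'test-results/junit.xml' }],
  ],
  outputDir: 'test-results',
});
```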

View File

@@ -0,0 +1,17 @@
# Probability and Impact Scale
- **Probability**
  - 1 – Unlikely: standard implementation, low uncertainty.
  - 2 – Possible: edge cases or partial unknowns worth investigation.
  - 3 – Likely: known issues, new integrations, or high ambiguity.
- **Impact**
  - 1 – Minor: cosmetic issues or easy workarounds.
  - 2 – Degraded: partial feature loss or manual workaround required.
  - 3 – Critical: blockers, data/security/regulatory exposure.
- Multiply probability × impact to derive the risk score.
  - 1–3: document for awareness.
  - 4–5: monitor closely, plan mitigations.
  - 6–8: CONCERNS at the gate until mitigations are implemented.
  - 9: automatic gate FAIL until resolved or formally waived.
_Source: Murat risk model summary._
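
The scale reduces to a small deterministic function; a sketch:

```typescript
type Score = 1 | 2 | 3;

// returns the posture implied by the thresholds above
function riskPosture(probability: Score, impact: Score): string {
  const score = probability * impact;
  if (score === 9) return 'FAIL until resolved or formally waived';
  if (score >= 6) return 'CONCERNS until mitigations are implemented';
  if (score >= 4) return 'monitor closely, plan mitigations';
  return 'document for awareness';
}

// riskPosture(2, 3) === 'CONCERNS until mitigations are implemented'
```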

View File

@@ -0,0 +1,14 @@
# Risk Governance and Gatekeeping
- Score risk as probability (1–3) × impact (1–3); totals ≥6 demand mitigation before approval, 9 mandates a gate failure.
- Classify risks across TECH, SEC, PERF, DATA, BUS, OPS. Document owners, mitigation plans, and deadlines for any score above 4.
- Trace every acceptance criterion to implemented tests; missing coverage must be resolved or explicitly waived before release.
- Gate decisions:
  - **PASS** – no critical issues remain and evidence is current.
  - **CONCERNS** – residual risk exists but has owners, actions, and timelines.
  - **FAIL** – critical issues unresolved or evidence missing.
  - **WAIVED** – risk accepted with documented approver, rationale, and expiry.
- Maintain a gate history log capturing updates so auditors can follow the decision trail.
- Use the probability/impact scale fragment for shared definitions when scoring teams run the matrix.
_Source: Murat risk governance notes, gate schema guidance._

View File

@@ -0,0 +1,9 @@
# Selective and Targeted Test Execution
- Use tags/grep (`--grep "@smoke"`, `--grep "@critical"`) to slice suites by risk, not directory.
- Filter by spec patterns (`--spec "**/*checkout*"`) or git diff (`npm run test:changed`) to focus on impacted areas.
- Combine priority metadata (P0–P3) with change detection to decide which levels to run pre-commit vs. in CI.
- Record burn-in history for newly added specs; promote to main suite only after consistent green runs.
- Document the selection strategy in README/CI so the team understands when full regression is mandatory.
_Source: 32+ selective testing strategies blog, Murat testing philosophy._
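
Tag-based slicing in practice, assuming tags live in test titles (the spec and tag names are illustrative):

```typescript
import { test } from '@playwright/test';

// `npx playwright test --grep "@smoke"` selects this spec by risk tag,
// regardless of which directory it lives in
test('checkout happy path @smoke @p0', async ({ page }) => {
  await page.goto('/checkout');
  // ...
});
```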

View File

@@ -0,0 +1,10 @@
# Test Quality Definition of Done
- No hard waits (`waitForTimeout`, `cy.wait(ms)`); rely on deterministic waits or event hooks.
- Each spec <300 lines and executes in ≤1.5 minutes.
- Tests are isolated, parallel-safe, and self-cleaning (seed via API/tasks, teardown after run).
- Assertions stay visible in test bodies; avoid conditional logic controlling test flow.
- Suites must pass locally and in CI with the same commands.
- Promote new tests only after they have failed for the intended reason at least once.
_Source: Murat quality checklist._

View File

@@ -0,0 +1,9 @@
# Visual Debugging and Developer Ergonomics
- Keep Playwright trace viewer, Cypress runner, and Storybook accessible in CI artifacts to speed up reproduction.
- Record short screen captures only-on-failure; pair them with HAR or console logs to avoid guesswork.
- Document common trace navigation steps (network tab, action timeline) so new contributors diagnose issues quickly.
- Encourage live-debug sessions with component harnesses to validate behaviour before writing full E2E specs.
- Integrate accessibility tooling (axe, Playwright audits) into the same debug workflow to catch regressions early.
_Source: Murat DX blog posts, Playwright book appendix on debugging._

View File

@@ -1,38 +0,0 @@
<!-- Powered by BMAD-CORE™ -->
# NFR Assessment v2.0 (Slim)
```xml
<task id="bmad/bmm/testarch/nfr-assess" name="NFR Assessment">
<llm critical="true">
<i>Set command_key="*nfr-assess"</i>
<i>Load {project-root}/bmad/bmm/testarch/tea-commands.csv and parse the matching row</i>
<i>Load {project-root}/bmad/bmm/testarch/tea-knowledge.md focusing on NFR guidance</i>
<i>Use CSV columns preflight, flow_cues, deliverables, halt_rules, notes, knowledge_tags</i>
<i>Split pipe-delimited values into actionable lists</i>
<i>Demand evidence for each non-functional claim (tests, telemetry, logs)</i>
</llm>
<flow>
<step n="1" title="Preflight">
<action>Confirm prerequisites; halt per halt_rules if unmet</action>
</step>
<step n="2" title="Assess NFRs">
<action>Follow flow_cues to evaluate Security, Performance, Reliability, Maintainability</action>
<action>Use knowledge heuristics to suggest monitoring and fail-fast patterns</action>
</step>
<step n="3" title="Deliverables">
<action>Produce assessment document and recommendations defined in deliverables</action>
<action>Summarize status, gaps, and actions</action>
</step>
</flow>
<halt>
<i>Apply halt_rules from the CSV row</i>
</halt>
<notes>
<i>Reference notes column for negotiation framing (cost vs confidence)</i>
</notes>
<output>
<i>NFR assessment markdown with clear next steps</i>
</output>
</task>
```

View File

@@ -1,9 +0,0 @@
command,title,when_to_use,preflight,flow_cues,deliverables,halt_rules,notes,knowledge_tags
*automate,Automation expansion,After implementation or when reforging coverage,all acceptance criteria satisfied|code builds locally|framework configured,"Review story source/diff to confirm automation target; ensure fixture architecture exists (mergeTests for Playwright, commands for Cypress) and implement apiRequest/network/auth/log fixtures if missing; map acceptance criteria with test-levels-framework.md guidance and avoid duplicate coverage; assign priorities using test-priorities-matrix.md; generate unit/integration/E2E specs with naming convention feature-name.spec.ts, covering happy, negative, and edge paths; enforce deterministic waits, self-cleaning factories, and <=1.5 minute execution per test; run suite and capture Definition of Done results; update package.json scripts and README instructions",New or enhanced spec files grouped by level; fixture modules under support/; data factory utilities; updated package.json scripts and README notes; DoD summary with remaining gaps; gate-ready coverage summary,"If automation target unclear or framework missing, halt and request clarification",Never create page objects; keep tests <300 lines and stateless; forbid hard waits and conditional flow in tests; co-locate tests near source; flag flaky patterns immediately,philosophy/core|patterns/helpers|patterns/waits|patterns/dod
*ci,CI/CD quality pipeline,Once automation suite exists or needs optimization,git repository initialized|tests pass locally|team agrees on target environments|access to CI platform settings,"Detect CI platform (default GitHub Actions, ask if GitLab/CircleCI/etc); scaffold workflow (.github/workflows/test.yml or platform equivalent) with triggers; set Node.js version from .nvmrc and cache node_modules + browsers; stage jobs: lint -> unit -> component -> e2e with matrix parallelization (shard by file not test); add selective execution script for affected tests; create burn-in job that reruns changed specs 3x to catch flakiness; attach artifacts on failure (traces/videos/HAR); configure retries/backoff and concurrency controls; document required secrets and environment variables; add Slack/email notifications and local script mirroring CI",.github/workflows/test.yml (or platform equivalent); scripts/test-changed.sh; scripts/burn-in-changed.sh; updated README/ci.md instructions; secrets checklist; dashboard or badge configuration,"If git repo absent, test framework missing, or CI platform unspecified, halt and request setup",Target 20x speedups via parallel shards + caching; shard by file; keep jobs under 10 minutes; wait-on-timeout 120s for app startup; ensure npm test locally matches CI run; mention alternative platform paths when not on GitHub,philosophy/core|ci-strategy
*framework,Initialize test architecture,Run once per repo or when no production-ready harness exists,package.json present|no existing E2E framework detected|architectural context available,"Identify stack from package.json (React/Vue/Angular/Next.js); detect bundler (Vite/Webpack/Rollup/esbuild); match test language to source (JS/TS frontend -> JS/TS tests); choose Playwright for large or performance-critical repos, Cypress for small DX-first teams; create {framework}/tests/ and {framework}/support/fixtures/ and {framework}/support/helpers/; configure config files with timeouts (action 15s, navigation 30s, test 60s) and reporters (HTML + JUnit); create .env.example with TEST_ENV, BASE_URL, API_URL; implement pure function->fixture->mergeTests pattern and faker-based data factories; enable failure-only screenshots/videos and ensure .nvmrc recorded",playwright/ or cypress/ folder with config + support tree; .env.example; .nvmrc; example tests; README with setup instructions,"If package.json missing OR framework already configured, halt and instruct manual review","Playwright: worker parallelism, trace viewer, multi-language support; Cypress: avoid if many dependent API calls; Component testing: Vitest (large) or Cypress CT (small); Contract testing: Pact for microservices; always use data-cy/data-testid selectors",philosophy/core|patterns/fixtures|patterns/selectors
*gate,Quality gate decision,After review or mitigation updates,latest assessments gathered|team consensus on fixes,"Assemble story metadata (id, title); choose gate status using deterministic rules (PASS all critical issues resolved, CONCERNS minor residual risk, FAIL critical blockers, WAIVED approved by business); update YAML schema with sections: metadata, waiver status, top_issues, risk_summary totals, recommendations (must_fix, monitor), nfr_validation statuses, history; capture rationale, owners, due dates, and summary comment back to story","docs/qa/gates/{story}.yml updated with schema fields (schema, story, story_title, gate, status_reason, reviewer, updated, waiver, top_issues, risk_summary, recommendations, nfr_validation, history); summary message for team","If review incomplete or risk data outdated, halt and request rerun","FAIL whenever unresolved P0 risks/tests or security holes remain; CONCERNS when mitigations planned but residual risk exists; WAIVED requires reason, approver, and expiry; maintain audit trail in history",philosophy/core|risk-model
*nfr-assess,NFR validation,Late development or pre-review for critical stories,implementation deployed locally|non-functional goals defined or discoverable,"Ask which NFRs to assess; default to core four (security, performance, reliability, maintainability); gather thresholds from story/architecture/technical-preferences and mark unknown targets; inspect evidence (tests, telemetry, logs) for each NFR; classify status using deterministic pass/concerns/fail rules and list quick wins; produce gate block and assessment doc with recommended actions",NFR assessment markdown with findings; gate YAML block capturing statuses and notes; checklist of evidence gaps and follow-up owners,"If NFR targets undefined and no guidance available, request definition and halt","Unknown thresholds -> CONCERNS, never guess; ensure each NFR has evidence or call it out; suggest monitoring hooks and fail-fast mechanisms when gaps exist",philosophy/core|nfr
*tdd,Acceptance Test Driven Development,Before implementation when team commits to TDD,story approved with acceptance criteria|dev sandbox ready|framework scaffolding in place,Clarify acceptance criteria and affected systems; pick appropriate test level (E2E/API/Component); write failing acceptance tests using Given-When-Then with network interception first then navigation; create data factories and fixture stubs for required entities; outline mocks/fixtures infrastructure the dev team must supply; generate component tests for critical UI logic; compile implementation checklist mapping each test to source work; share failing tests with dev agent and maintain red -> green -> refactor loop,Failing acceptance test files; component test stubs; fixture/mocks skeleton; implementation checklist with test-to-code mapping; documented data-testid requirements,"If criteria ambiguous or framework missing, halt for clarification",Start red; one assertion per test; use beforeEach for visible setup (no shared state); remind devs to run tests before writing production code; update checklist as each test goes green,philosophy/core|patterns/test-structure
*test-design,Risk and test design planning,"After story approval, before development",story markdown present|acceptance criteria clear|architecture/PRD accessible,"Filter requirements so only genuine risks remain; review PRD/architecture/story for unresolved gaps; classify risks across TECH, SEC, PERF, DATA, BUS, OPS using category definitions; request clarification when evidence missing; score probability (1 unlikely, 2 possible, 3 likely) and impact (1 minor, 2 degraded, 3 critical) then compute totals; highlight risks >=6 and plan mitigations with owners and timelines; break acceptance criteria into atomic scenarios mapped to mitigations; reference test-levels-framework.md to pick unit/integration/E2E/component levels; avoid duplicate coverage, prefer lower levels when possible; assign priorities using test-priorities-matrix.md; outline data/tooling prerequisites and execution order",Risk assessment markdown in docs/qa/assessments; table of category/probability/impact/score; mitigation matrix with owners and due dates; coverage matrix with requirement/level/priority/mitigation; gate YAML snippet summarizing risk totals and scenario counts; recommended execution order,"If story missing or criteria unclear, halt for clarification","Category definitions: TECH=architecture flaws; SEC=missing controls/vulnerabilities; PERF=SLA risk; DATA=loss/corruption; BUS=user/business harm; OPS=deployment/run failures; rely on evidence, not speculation; tie scenarios back to risk mitigations; keep scenarios independent and maintainable",philosophy/core|risk-model|patterns/test-structure
*trace,Requirements traceability,Mid-development checkpoint or before review,tests exist for story|access to source + specs,"Gather acceptance criteria and implemented tests; map each criterion to concrete tests (file + describe/it) using Given-When-Then narrative; classify coverage status as FULL, PARTIAL, NONE, UNIT-ONLY, INTEGRATION-ONLY; flag severity based on priority (P0 gaps critical); recommend additional tests or refactors; generate gate YAML coverage summary",Traceability report saved under docs/qa/assessments; coverage matrix with status per criterion; gate YAML snippet for coverage totals and gaps,"If story lacks implemented tests, pause and advise running *tdd or writing tests","Definitions: FULL=all scenarios validated, PARTIAL=some coverage exists, NONE=no validation, UNIT-ONLY=missing higher level, INTEGRATION-ONLY=lacks lower confidence; ensure assertions explicit and avoid duplicate coverage",philosophy/core|patterns/assertions

View File

@@ -0,0 +1,19 @@
id,name,description,tags,fragment_file
fixture-architecture,Fixture Architecture,"Composable fixture patterns (pure function → fixture → merge) and reuse rules","fixtures,architecture,playwright,cypress",knowledge/fixture-architecture.md
network-first,Network-First Safeguards,"Intercept-before-navigate workflow, HAR capture, deterministic waits, edge mocking","network,stability,playwright,cypress",knowledge/network-first.md
data-factories,Data Factories and API Setup,"Factories with overrides, API seeding, cleanup discipline","data,factories,setup,api",knowledge/data-factories.md
component-tdd,Component TDD Loop,"Red→green→refactor workflow, provider isolation, accessibility assertions","component-testing,tdd,ui",knowledge/component-tdd.md
playwright-config,Playwright Config Guardrails,"Environment switching, timeout standards, artifact outputs","playwright,config,env",knowledge/playwright-config.md
ci-burn-in,CI and Burn-In Strategy,"Staged jobs, shard orchestration, burn-in loops, artifact policy","ci,automation,flakiness",knowledge/ci-burn-in.md
selective-testing,Selective Test Execution,"Tag/grep usage, spec filters, diff-based runs, promotion rules","risk-based,selection,strategy",knowledge/selective-testing.md
feature-flags,Feature Flag Governance,"Enum management, targeting helpers, cleanup, release checklists","feature-flags,governance,launchdarkly",knowledge/feature-flags.md
contract-testing,Contract Testing Essentials,"Pact publishing, provider verification, resilience coverage","contract-testing,pact,api",knowledge/contract-testing.md
email-auth,Email Authentication Testing,"Magic link extraction, state preservation, caching, negative flows","email-authentication,security,workflow",knowledge/email-auth.md
error-handling,Error Handling Checks,"Scoped exception handling, retry validation, telemetry logging","resilience,error-handling,stability",knowledge/error-handling.md
visual-debugging,Visual Debugging Toolkit,"Trace viewer usage, artifact expectations, accessibility integration","debugging,dx,tooling",knowledge/visual-debugging.md
risk-governance,Risk Governance,"Scoring matrix, category ownership, gate decision rules","risk,governance,gates",knowledge/risk-governance.md
probability-impact,Probability and Impact Scale,"Shared definitions for scoring matrix and gate thresholds","risk,scoring,scale",knowledge/probability-impact.md
test-quality,Test Quality Definition of Done,"Execution limits, isolation rules, green criteria","quality,definition-of-done,tests",knowledge/test-quality.md
nfr-criteria,NFR Review Criteria,"Security, performance, reliability, maintainability status definitions","nfr,assessment,quality",knowledge/nfr-criteria.md
test-levels,Test Levels Framework,"Guidelines for choosing unit, integration, or end-to-end coverage","testing,levels,selection",knowledge/test-levels-framework.md
test-priorities,Test Priorities Matrix,"P0–P3 criteria, coverage targets, execution ordering","testing,prioritization,risk",knowledge/test-priorities-matrix.md

View File

@@ -1,275 +0,0 @@
<!-- Powered by BMAD-CORE™ -->
# Murat Test Architecture Foundations (Slim Brief)
This brief distills Murat Ozcan's testing philosophy used by the Test Architect agent. Use it as the north star after loading `tea-commands.csv`.
## Core Principles
- Cost vs confidence: cost = creation + execution + maintenance. Push confidence where impact is highest and skip redundant checks.
- Engineering assumes failure: predict what breaks, defend with tests, learn from every failure. A single failing test means the software is not ready.
- Quality is team work. Story estimates include testing, documentation, and deployment work required to ship safely.
- Missing test coverage is feature debt (hurts customers), not mere tech debt—treat it with the same urgency as functionality gaps.
- Shared mutable state is the source of all evil: design fixtures and helpers so each test owns its data.
- Composition over inheritance: prefer functional helpers and fixtures that compose behaviour; page objects and deep class trees hide duplication.
- Setup via API, assert via UI. Keep tests user-centric while priming state through fast interfaces.
- One test = one concern. Explicit assertions live in the test body, not buried in helpers.
## Patterns and Heuristics
- Selector order: `data-cy` / `data-testid` -> ARIA -> text. Avoid brittle CSS, IDs, or index based locators.
- Network boundary is the mock boundary. Stub at the edge, never mid-service unless risk demands.
- **Network-first pattern**: ALWAYS intercept before navigation: `const call = interceptNetwork(); await page.goto(); await call;`
- Deterministic waits only: await specific network responses, elements disappearing, or event hooks. Ban fixed sleeps.
- **Fixture architecture (The Murat Way)**:
```typescript
import { test as base, mergeTests } from '@playwright/test';
// authFixture and networkFixture are assumed to be defined in sibling fixture modules

// 1. Pure function first (testable independently)
export async function apiRequest({ request, method, url, data }) {
  /* implementation */
}

// 2. Fixture wrapper
export const apiRequestFixture = base.extend({
  apiRequest: async ({ request }, use) => {
    await use((params) => apiRequest({ request, ...params }));
  },
});

// 3. Compose via mergeTests
export const test = mergeTests(base, apiRequestFixture, authFixture, networkFixture);
```
- **Data factories pattern**:
```typescript
import { faker } from '@faker-js/faker';

export const createUser = (overrides = {}) => ({
  id: faker.string.uuid(),
  email: faker.internet.email(),
  ...overrides,
});
```
- Visual debugging: keep component/test runner UIs available (Playwright trace viewer, Cypress runner) to accelerate feedback.
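A minimal Playwright sketch of the network-first pattern above (route, page, and test-id names are illustrative assumptions):
```typescript
import { test, expect } from '@playwright/test';

test('network-first: register the wait before navigating', async ({ page }) => {
  // Intercept BEFORE navigation so the response cannot race past the listener.
  const usersCall = page.waitForResponse(
    (res) => res.url().includes('/api/users') && res.ok(),
  );
  await page.goto('/dashboard');
  await usersCall; // deterministic wait — no fixed sleeps
  await expect(page.getByTestId('user-list')).toBeVisible();
});
```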
## Risk and Coverage
- Risk score = probability (1-3) × impact (1-3). Score 9 => gate FAIL, 6-8 => CONCERNS. Most stories have 0-1 high risks.
- Test level ratio: heavy unit/component coverage, but always include E2E for critical journeys and integration seams.
- Traceability reflects reality: map each acceptance criterion to concrete tests, flagging missing coverage and duplicated value.
- NFR focus areas: Security, Performance, Reliability, Maintainability. Demand evidence (tests, telemetry, alerts) before approving.
## Test Configuration
- **Timeouts**: actionTimeout 15s, navigationTimeout 30s, testTimeout 60s, expectTimeout 10s
- **Reporters**: HTML (never auto-open) + JUnit XML for CI integration
- **Media**: screenshot only-on-failure, video retain-on-failure
- **Language Matching**: Tests should match source code language (JS/TS frontend -> JS/TS tests)
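As a sketch, those defaults map onto a Playwright config roughly like this (output paths are assumptions):
```typescript
import { defineConfig } from '@playwright/test';

export default defineConfig({
  timeout: 60_000,             // testTimeout
  expect: { timeout: 10_000 }, // expectTimeout
  use: {
    actionTimeout: 15_000,
    navigationTimeout: 30_000,
    screenshot: 'only-on-failure',
    video: 'retain-on-failure',
  },
  reporter: [
    ['html', { open: 'never' }],                    // never auto-open
    ['junit', { outputFile: 'results/junit.xml' }], // CI integration
  ],
});
```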
## Automation and CI
- Prefer Playwright for multi-language teams, worker parallelism, rich debugging; Cypress suits smaller DX-first repos or component-heavy spikes.
- **Framework Selection**: Large repo + performance = Playwright, Small repo + DX = Cypress
- **Component Testing**: Large repos = Vitest (has UI, easy RTL conversion), Small repos = Cypress CT
- CI pipelines run lint -> unit -> component -> e2e, with selective reruns for flakes and artifacts (videos, traces) on failure.
- Shard suites to keep feedback tight; treat CI as shared safety net, not a bottleneck.
- Test selection ideas (32+ strategies): filter by tags/grep (`npm run test -- --grep "@smoke"`), file patterns (`--spec "**/*checkout*"`), changed files (`npm run test:changed`), or test level (`npm run test:unit` / `npm run test:e2e`); a tag-based sketch follows this list.
- Burn-in testing: run new or changed specs multiple times (e.g., 3-10x) to flush flakes before they land in main.
- Keep helper scripts handy (`scripts/test-changed.sh`, `scripts/burn-in-changed.sh`) so CI and local workflows stay in sync.
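For tag-based selection, one minimal pattern is to tag test titles and filter with `--grep` (tag names here are illustrative):
```typescript
import { test, expect } from '@playwright/test';

// Run only this subset with: npx playwright test --grep "@smoke"
test('checkout happy path @smoke @checkout', async ({ page }) => {
  await page.goto('/checkout');
  await expect(page.getByRole('heading', { name: 'Checkout' })).toBeVisible();
});
```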
## Project Structure and Config
- **Directory structure**:
```
project/
├── playwright.config.ts # Environment-based config loading
├── playwright/
│ ├── tests/ # All specs (group by domain: auth/, network/, feature-flags/…)
│ ├── support/ # Frequently touched helpers (global-setup, merged-fixtures, ui helpers, factories)
│ ├── config/ # Environment configs (base, local, staging, production)
│ └── scripts/ # Expert utilities (burn-in, record/playback, maintenance)
```
- **Environment config pattern**:
```javascript
import localConfig from './config/local.config';
import stagingConfig from './config/staging.config';
import prodConfig from './config/prod.config';

const configs = {
  local: localConfig,
  staging: stagingConfig,
  prod: prodConfig,
};

export default configs[process.env.TEST_ENV || 'local'];
```
## Test Hygiene and Independence
- Tests must be independent and stateless; never rely on execution order.
- Cleanup all data created during tests (afterEach or API cleanup).
- Ensure idempotency: same results every run.
- No shared mutable state; prefer factory functions per test.
- Tests must run in parallel safely; never commit `.only`.
- Prefer co-location: component tests next to components, integration in `tests/integration`, etc.
- Feature flags: centralise enum definitions (e.g., `export const FLAGS = Object.freeze({ NEW_FEATURE: 'new-feature' })`), provide helpers to set/clear targeting, and write dedicated flag tests that clean up targeting after each run.
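A sketch of that flag hygiene, assuming hypothetical targeting helpers wired to your flag provider:
```typescript
import { test } from '@playwright/test';

// Centralised enum: one source of truth for flag keys.
export const FLAGS = Object.freeze({ NEW_CHECKOUT: 'new-checkout' } as const);

// Hypothetical helpers — adapt to your provider's targeting API.
async function setFlagTargeting(flag: string, userId: string, on: boolean): Promise<void> {
  /* e.g. POST to the flag service's targeting endpoint */
}
async function clearFlagTargeting(flag: string, userId: string): Promise<void> {
  /* e.g. DELETE the rule created in setup */
}

const userId = 'flag-test-user';

test.beforeEach(async () => {
  await setFlagTargeting(FLAGS.NEW_CHECKOUT, userId, true);
});

test.afterEach(async () => {
  await clearFlagTargeting(FLAGS.NEW_CHECKOUT, userId); // never leak targeting between runs
});
```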
## CCTDD (Component Test-Driven Development)
- Start with failing component test -> implement minimal component -> refactor.
- Component tests catch ~70% of bugs before integration.
- Use `cy.mount()` or `render()` to test components in isolation; focus on user interactions.
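A minimal red-step sketch, assuming React Testing Library with Vitest (the component and copy are placeholders):
```tsx
import { test, expect } from 'vitest';
import { render, screen } from '@testing-library/react';
import userEvent from '@testing-library/user-event';
import '@testing-library/jest-dom/vitest';
import { Counter } from './Counter'; // does not exist yet — the test fails first

test('Counter increments on click', async () => {
  render(<Counter />);
  await userEvent.click(screen.getByRole('button', { name: /increment/i }));
  expect(screen.getByText('Count: 1')).toBeInTheDocument();
});
```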
## CI Optimization Strategies
- **Parallel execution**: Split by test file, not test case.
- **Smart selection**: Run only tests affected by changes (dependency graphs, git diff).
- **Burn-in testing**: Run new/modified tests 3x to catch flakiness early.
- **HAR recording**: Record network traffic for offline playback in CI.
- **Selective reruns**: Only rerun failed specs, not entire suite.
- **Network recording**: capture HAR files during stable runs so CI can replay network traffic when external systems are flaky.
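A Playwright sketch of HAR playback (the HAR path, URL glob, and env variable are assumptions):
```typescript
import { test } from '@playwright/test';

test('dashboard works against recorded traffic', async ({ page }) => {
  // Record on a stable run with UPDATE_HAR=true, then replay offline in CI.
  await page.routeFromHAR('hars/dashboard.har', {
    url: '**/api/**',
    update: process.env.UPDATE_HAR === 'true',
  });
  await page.goto('/dashboard');
});
```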
## Package Scripts
- **Essential npm scripts**:
```json
"test:e2e": "playwright test",
"test:unit": "vitest run",
"test:component": "cypress run --component",
"test:contract": "jest --testMatch='**/pact/*.spec.ts'",
"test:debug": "playwright test --headed",
"test:ci": "npm run test:unit andand npm run test:e2e",
"contract:publish": "pact-broker publish"
```
## Contract Testing (Pact)
- Use for microservices with integration points.
- Consumer generates contracts, provider verifies.
- Structure: `pact/` directory at root, `pact/config.ts` for broker settings.
- Reference repos: pact-js-example-consumer, pact-js-example-provider, pact-js-example-react-consumer.
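A consumer-side sketch using the pact-js V3 API (service names and the endpoint are illustrative; `it`/`expect` come from the Jest runner configured in `test:contract`):
```typescript
import path from 'node:path';
import { PactV3, MatchersV3 } from '@pact-foundation/pact';

const provider = new PactV3({
  dir: path.resolve('pact/pacts'),
  consumer: 'web-app',
  provider: 'user-service',
});

it('fetches a user', async () => {
  provider
    .given('a user with id 1 exists')
    .uponReceiving('a request for user 1')
    .withRequest({ method: 'GET', path: '/users/1' })
    .willRespondWith({
      status: 200,
      body: MatchersV3.like({ id: 1, email: 'user@example.com' }),
    });

  await provider.executeTest(async (mockServer) => {
    const res = await fetch(`${mockServer.url}/users/1`);
    expect(res.status).toBe(200);
  });
});
```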
## Online Resources and Examples
- Fixture architecture: https://github.com/muratkeremozcan/cy-vs-pw-murats-version
- Playwright patterns: https://github.com/muratkeremozcan/pw-book
- Component testing (CCTDD): https://github.com/muratkeremozcan/cctdd
- Contract testing: https://github.com/muratkeremozcan/pact-js-example-consumer
- Full app example: https://github.com/muratkeremozcan/tour-of-heroes-react-vite-cypress-ts
- Blog posts: https://dev.to/muratkeremozcan
## Risk Model Details
- TECH: Unmitigated architecture flaws, experimental patterns without fallbacks.
- SEC: Missing security controls, potential vulnerabilities, unsafe data handling.
- PERF: SLA-breaking slowdowns, resource exhaustion, lack of caching.
- DATA: Loss or corruption scenarios, migrations without rollback, inconsistent schemas.
- BUS: Business or user harm, revenue-impacting failures, compliance gaps.
- OPS: Deployment, infrastructure, or observability gaps that block releases.
## Probability and Impact Scale
- Probability 1 = Unlikely (standard implementation, low risk).
- Probability 2 = Possible (edge cases, needs attention).
- Probability 3 = Likely (known issues, high uncertainty).
- Impact 1 = Minor (cosmetic, easy workaround).
- Impact 2 = Degraded (partial feature loss, manual workaround needed).
- Impact 3 = Critical (blocker, data/security/regulatory impact).
- Scores: 9 => FAIL, 6-8 => CONCERNS, 4 => monitor, 1-3 => note only.
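The scoring rules above reduce to a tiny lookup; a sketch:
```typescript
type Level = 1 | 2 | 3;

function gateForRisk(probability: Level, impact: Level): 'FAIL' | 'CONCERNS' | 'monitor' | 'note only' {
  const score = probability * impact; // possible scores: 1, 2, 3, 4, 6, 9
  if (score === 9) return 'FAIL';
  if (score >= 6) return 'CONCERNS';
  if (score >= 4) return 'monitor';
  return 'note only';
}

// e.g. probability 2 (possible) × impact 3 (critical) = 6 → CONCERNS
```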
## Test Design Frameworks
- Use `docs/docs-v6/v6-bmm/test-levels-framework.md` for level selection and anti-patterns.
- Use `docs/docs-v6/v6-bmm/test-priorities-matrix.md` for P0-P3 priority criteria.
- Naming convention: `{epic}.{story}-{LEVEL}-{sequence}` (e.g., `2.4-E2E-01`).
- Tie each scenario to risk mitigations or acceptance criteria.
## Test Quality Definition of Done
- No hard waits (`page.waitForTimeout`, `cy.wait(ms)`)—use deterministic waits.
- Each test < 300 lines and executes in <= 1.5 minutes.
- Tests are stateless, parallel-safe, and self-cleaning.
- No conditional logic in tests (`if/else`, `try/catch` controlling flow).
- Explicit assertions live in tests, not hidden in helpers.
- Tests must run green locally and in CI with identical commands.
- A test delivers value only when it has failed at least once—design suites so they regularly catch regressions during development.
## NFR Status Criteria
- **Security**: PASS (auth, authz, secrets handled), CONCERNS (minor gaps), FAIL (critical exposure).
- **Performance**: PASS (meets targets, profiling evidence), CONCERNS (approaching limits), FAIL (breaches limits, leaks).
- **Reliability**: PASS (error handling, retries, health checks), CONCERNS (partial coverage), FAIL (no recovery, crashes).
- **Maintainability**: PASS (tests + docs + clean code), CONCERNS (duplication, low coverage), FAIL (no tests, tangled code).
- Unknown targets => CONCERNS until defined.
## Quality Gate Schema
```yaml
schema: 1
story: '{epic}.{story}'
story_title: '{title}'
gate: PASS|CONCERNS|FAIL|WAIVED
status_reason: 'Single sentence summary'
reviewer: 'Murat (Master Test Architect)'
updated: '2024-09-20T12:34:56Z'
waiver:
  active: false
  reason: ''
  approved_by: ''
  expires: ''
top_issues:
  - id: SEC-001
    severity: high
    finding: 'Issue description'
    suggested_action: 'Action to resolve'
risk_summary:
  totals:
    critical: 0
    high: 0
    medium: 0
    low: 0
  recommendations:
    must_fix: []
    monitor: []
nfr_validation:
  security: { status: PASS, notes: '' }
  performance: { status: CONCERNS, notes: 'Add caching' }
  reliability: { status: PASS, notes: '' }
  maintainability: { status: PASS, notes: '' }
history:
  - at: '2024-09-20T12:34:56Z'
    gate: CONCERNS
    note: 'Initial review'
```
- Optional sections: `quality_score` block for extended metrics, and `evidence` block (tests_reviewed, risks_identified, trace.ac_covered/ac_gaps) when teams track them.
## Collaborative TDD Loop
- Share failing acceptance tests with the developer or AI agent.
- Track red -> green -> refactor progress alongside the implementation checklist.
- Update checklist items as each test passes; add new tests for discovered edge cases.
- Keep conversation focused on observable behavior, not implementation detail.
## Traceability Coverage Definitions
- FULL: All scenarios for the criterion validated across appropriate levels.
- PARTIAL: Some coverage exists but gaps remain.
- NONE: No tests currently validate the criterion.
- UNIT-ONLY: Only low-level tests exist; add integration/E2E.
- INTEGRATION-ONLY: Missing unit/component coverage for fast feedback.
- Avoid naive UI E2E until service-level confidence exists; use API or contract tests to harden backends first, then add minimal UI coverage to fill the gaps.
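One way to sketch a coverage-matrix entry using those statuses (field names are assumptions):
```typescript
type Coverage = 'FULL' | 'PARTIAL' | 'NONE' | 'UNIT-ONLY' | 'INTEGRATION-ONLY';

interface TraceEntry {
  criterion: string; // acceptance criterion id, e.g. 'AC-2'
  tests: string[];   // scenario ids that validate it, e.g. '2.4-E2E-01'
  coverage: Coverage;
  nextStep?: string; // recommendation when coverage is not FULL
}

const matrix: TraceEntry[] = [
  { criterion: 'AC-1', tests: ['2.4-UNIT-03', '2.4-E2E-01'], coverage: 'FULL' },
  { criterion: 'AC-2', tests: ['2.4-UNIT-04'], coverage: 'UNIT-ONLY', nextStep: 'add integration coverage' },
];
```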
## CI Platform Guidance
- Default to GitHub Actions if no preference is given; otherwise ask for GitLab, CircleCI, etc.
- Ensure local script mirrors CI pipeline (npm test vs CI workflow).
- Use concurrency controls to prevent duplicate runs (`concurrency` block in GitHub Actions).
- Keep job runtime under 10 minutes; split further if necessary.
## Testing Tool Preferences
- Component testing: Large repositories prioritize Vitest with UI (fast, component-native). Smaller DX-first teams with existing Cypress stacks can keep Cypress Component Testing for consistency.
- E2E testing: Favor Playwright for large or performance-sensitive repos; reserve Cypress for smaller DX-first teams where developer experience outweighs scale.
- API testing: Prefer Playwright's API testing or contract suites over ad-hoc REST clients.
- Contract testing: Pact.js for consumer-driven contracts; keep `pact/` config in repo.
- Visual testing: Percy, Chromatic, or Playwright snapshots when UX must be audited.
## Naming Conventions
- File names: `ComponentName.cy.tsx` for Cypress component tests, `component-name.spec.ts` for Playwright, `ComponentName.test.tsx` for unit/RTL.
- Describe blocks: `describe('Feature/Component Name', () => { context('when condition', ...) })`.
- Data attributes: always kebab-case (`data-cy="submit-button"`, `data-testid="user-email"`).
## Reference Materials
If deeper context is needed, consult Murat's testing philosophy notes, blog posts, and sample repositories in https://github.com/muratkeremozcan/test-resources-for-ai/blob/main/gitingest-full-repo-text-version.txt.

View File

@@ -1,43 +0,0 @@
<!-- Powered by BMAD-CORE™ -->
# Risk and Test Design v3.0 (Slim)
```xml
<task id="bmad/bmm/testarch/test-design" name="Risk andamp; Test Design">
<llm critical="true">
<i>Set command_key="*test-design"</i>
<i>Load {project-root}/bmad/bmm/testarch/tea-commands.csv and parse the matching row</i>
<i>Load {project-root}/bmad/bmm/testarch/tea-knowledge.md for risk-model and coverage heuristics</i>
<i>Use CSV columns preflight, flow_cues, deliverables, halt_rules, notes, knowledge_tags as the execution blueprint</i>
<i>Split pipe-delimited values into actionable checklists</i>
<i>Stay evidence-based—link risks and scenarios directly to PRD/architecture/story artifacts</i>
</llm>
<flow>
<step n="1" title="Preflight">
<action>Confirm story markdown, acceptance criteria, and architecture/PRD access.</action>
<action>Stop immediately if halt_rules trigger (missing inputs or unclear requirements).</action>
</step>
<step n="2" title="Assess Risks">
<action>Follow flow_cues to filter genuine risks, classify them (TECH/SEC/PERF/DATA/BUS/OPS), and score probability × impact.</action>
<action>Document mitigations with owners, timelines, and residual risk expectations.</action>
</step>
<step n="3" title="Design Coverage">
<action>Break acceptance criteria into atomic scenarios mapped to mitigations.</action>
<action>Choose test levels using test-levels-framework.md, assign priorities via test-priorities-matrix.md, and note tooling/data prerequisites.</action>
</step>
<step n="4" title="Deliverables">
<action>Generate the combined risk report and test design artifacts described in deliverables.</action>
<action>Summarize key risks, mitigations, coverage plan, and recommended execution order.</action>
</step>
</flow>
<halt>
<i>Apply halt_rules from the CSV row verbatim.</i>
</halt>
<notes>
<i>Use notes column for calibration reminders and coverage heuristics.</i>
</notes>
<output>
<i>Unified risk assessment plus coverage strategy ready for implementation.</i>
</output>
</task>
```

View File

@@ -1,38 +0,0 @@
<!-- Powered by BMAD-CORE™ -->
# Requirements Traceability v2.0 (Slim)
```xml
<task id="bmad/bmm/testarch/trace" name="Requirements Traceability">
<llm critical="true">
<i>Set command_key="*trace"</i>
<i>Load {project-root}/bmad/bmm/testarch/tea-commands.csv and read the matching row</i>
<i>Load {project-root}/bmad/bmm/testarch/tea-knowledge.md emphasising assertions guidance</i>
<i>Use CSV columns preflight, flow_cues, deliverables, halt_rules, notes, knowledge_tags</i>
<i>Split pipe-delimited values into actionable lists</i>
<i>Focus on mapping reality: reference actual files, describe coverage gaps, recommend next steps</i>
</llm>
<flow>
<step n="1" title="Preflight">
<action>Validate prerequisites; halt per halt_rules if unmet</action>
</step>
<step n="2" title="Traceability Analysis">
<action>Follow flow_cues to map acceptance criteria to implemented tests</action>
<action>Leverage knowledge heuristics to highlight assertion quality and duplication risks</action>
</step>
<step n="3" title="Deliverables">
<action>Create traceability report described in deliverables</action>
<action>Summarize critical gaps and recommendations</action>
</step>
</flow>
<halt>
<i>Apply halt_rules from the CSV row</i>
</halt>
<notes>
<i>Reference notes column for additional emphasis</i>
</notes>
<output>
<i>Coverage matrix and narrative summary</i>
</output>
</task>
```