feat: migrate test architect entirely to v6

2025-12-29 16:14:59 +00:00 · 2025-10-14 16:10:20 -05:00
parent 4b6f34dff8
commit 93e290bf73
48 changed files with 16592 additions and 389 deletions
--- a/src/modules/bmm/workflows/testarch/test-design/README.md
+++ b/src/modules/bmm/workflows/testarch/test-design/README.md
@@ -0,0 +1,378 @@
+# Test Design and Risk Assessment Workflow
+
+Plans a comprehensive test coverage strategy with risk assessment (probability × impact scoring), priority classification (P0-P3), and resource estimation. This workflow generates a test design document that identifies high-risk areas, maps requirements to appropriate test levels, and orders execution so that the fastest, most critical tests give feedback first.
+
+## Usage
+
+```bash
+bmad tea *test-design
+```
+
+The TEA agent runs this workflow when:
+
+- Planning test coverage before development starts
+- Assessing risks for an epic or story
+- Prioritizing test scenarios by business impact
+- Estimating testing effort and resources
+
+## Inputs
+
+**Required Context Files:**
+
+- **Story markdown**: Acceptance criteria and requirements
+- **PRD or epics.md**: High-level product context
+- **Architecture docs** (optional): Technical constraints and integration points
+
+**Workflow Variables:**
+
+- `epic_num`: Epic number for scoped design
+- `story_path`: Specific story for design (optional)
+- `design_level`: full/targeted/minimal (default: full)
+- `risk_threshold`: Minimum score that flags a risk as high-priority (default: 6)
+- `risk_categories`: TECH,SEC,PERF,DATA,BUS,OPS (all enabled)
+- `priority_levels`: P0,P1,P2,P3 (all enabled)
+
+## Outputs
+
+**Primary Deliverable:**
+
+**Test Design Document** (`test-design-epic-{N}.md`):
+
+1. **Risk Assessment Matrix**
+   - Risk ID, category, description
+   - Probability (1-3) × Impact (1-3) = Score
+   - Scores ≥6 flagged as high-priority
+   - Mitigation plans with owners and timelines
+
+2. **Coverage Matrix**
+   - Requirement → Test Level (E2E/API/Component/Unit)
+   - Priority assignment (P0-P3)
+   - Risk linkage
+   - Test count estimates
+
+3. **Execution Order**
+   - Smoke tests (P0 subset, <5 min)
+   - P0 tests (critical paths, <10 min)
+   - P1 tests (important features, <30 min)
+   - P2/P3 tests (full regression, <60 min)
+
+4. **Resource Estimates**
+   - Hours per priority level
+   - Total effort in days
+   - Tooling and data prerequisites
+
+5. **Quality Gate Criteria**
+   - P0 pass rate: 100%
+   - P1 pass rate: ≥95%
+   - High-risk mitigations: 100%
+   - Coverage target: ≥80% (critical paths)
+
+## Key Features
+
+### Risk Scoring Framework
+
+**Probability × Impact = Risk Score**
+
+**Probability** (1-3):
+
+- 1 (Unlikely): <10% chance
+- 2 (Possible): 10-50% chance
+- 3 (Likely): >50% chance
+
+**Impact** (1-3):
+
+- 1 (Minor): Cosmetic, workaround exists
+- 2 (Degraded): Feature impaired, difficult workaround
+- 3 (Critical): System failure, no workaround
+
+**Scores**:
+
+- 1-2: Low risk (monitor)
+- 3-4: Medium risk (plan mitigation)
+- **6-9: High risk** (immediate mitigation required)
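+
+The scoring rules above can be sketched in a few lines of Python. This is a minimal illustration, not the workflow's actual implementation; the function names `risk_score` and `risk_level` are hypothetical. Because both factors range from 1 to 3, the only possible products are 1, 2, 3, 4, 6, and 9, so the three bands cover every case.
+
+```python
+def risk_score(probability: int, impact: int) -> int:
+    """Return probability × impact, with both factors on the 1-3 scale."""
+    for name, value in (("probability", probability), ("impact", impact)):
+        if value not in (1, 2, 3):
+            raise ValueError(f"{name} must be 1, 2, or 3, got {value!r}")
+    return probability * impact
+
+
+def risk_level(score: int, threshold: int = 6) -> str:
+    """Map a score to a band; `threshold` mirrors the `risk_threshold` default of 6."""
+    if score >= threshold:
+        return "high"    # immediate mitigation required
+    if score >= 3:
+        return "medium"  # plan mitigation
+    return "low"         # monitor
+
+
+print(risk_level(risk_score(2, 3)))  # high
+print(risk_level(risk_score(2, 2)))  # medium
+print(risk_level(risk_score(1, 2)))  # low
+```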
+
+### Risk Categories (6 types)
+
+**TECH** (Technical/Architecture):
+
+- Architecture flaws, integration failures
+- Scalability issues, technical debt
+
+**SEC** (Security):
+
+- Missing access controls, auth bypass
+- Data exposure, injection vulnerabilities
+
+**PERF** (Performance):
+
+- SLA violations, response time degradation
+- Resource exhaustion, scalability limits
+
+**DATA** (Data Integrity):
+
+- Data loss/corruption, inconsistent state
+- Migration failures
+
+**BUS** (Business Impact):
+
+- UX degradation, business logic errors
+- Revenue impact, compliance violations
+
+**OPS** (Operations):
+
+- Deployment failures, configuration errors
+- Monitoring gaps, rollback issues
+
+### Priority Classification (P0-P3)
+
+**P0 (Critical)** - Run on every commit:
+
+- Blocks core user journey
+- High-risk (score ≥6)
+- Revenue-impacting or security-critical
+
+**P1 (High)** - Run on PR to main:
+
+- Important user features
+- Medium-risk (score 3-4)
+- Common workflows
+
+**P2 (Medium)** - Run nightly/weekly:
+
+- Secondary features
+- Low-risk (score 1-2)
+- Edge cases
+
+**P3 (Low)** - Run on-demand:
+
+- Nice-to-have, exploratory
+- Performance benchmarks
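+
+As a rough sketch, the risk-score criterion and the run cadence of each priority could be expressed as below. It only covers the score bands; the other criteria (blocking a core journey, revenue or security impact) need human judgment and are not modeled. The names `RUN_ON` and `suggest_priority` and the `exploratory` flag are assumptions made for illustration.
+
+```python
+# Cadence per priority, copied from the headings above.
+RUN_ON = {
+    "P0": "every commit",
+    "P1": "PR to main",
+    "P2": "nightly/weekly",
+    "P3": "on-demand",
+}
+
+
+def suggest_priority(score: int, exploratory: bool = False) -> str:
+    """Suggest a starting priority from the risk score alone."""
+    if exploratory:
+        return "P3"  # nice-to-have checks and benchmarks have no score band
+    if score >= 6:
+        return "P0"  # high risk
+    if score >= 3:
+        return "P1"  # medium risk
+    return "P2"      # low risk
+
+
+for score in (6, 4, 2):
+    priority = suggest_priority(score)
+    print(f"score {score}: {priority}, run on {RUN_ON[priority]}")
+```
+
+Treat the result as a first suggestion that a reviewer can raise or lower, not as a final classification.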
+
+### Test Level Selection
+
+**E2E (End-to-End)**:
+
+- Critical user journeys
+- Multi-system integration
+- Highest confidence, slowest
+
+**API (Integration)**:
+
+- Service contracts
+- Business logic validation
+- Fast feedback, stable
+
+**Component**:
+
+- UI component behavior
+- Visual regression
+- Fast, isolated
+
+**Unit**:
+
+- Business logic, edge cases
+- Error handling
+- Fastest, most granular
+
+**Key principle**: Avoid duplicate coverage: don't test the same behavior at multiple levels.
+
+### Knowledge Base Integration
+
+Automatically consults the TEA knowledge base:
+
+- `risk-governance.md` - Risk classification framework
+- `probability-impact.md` - Risk scoring methodology
+- `test-levels-framework.md` - Test level selection
+- `test-priorities-matrix.md` - P0-P3 prioritization
+
+## Integration with Other Workflows
+
+**Before test-design:**
+
+- **plan-project** (Phase 2): Creates PRD and epics
+- **solution-architecture** (Phase 3): Defines technical approach
+- **tech-spec** (Phase 3): Implementation details
+
+**After test-design:**
+
+- **atdd**: Generate failing tests for P0 scenarios
+- **automate**: Expand coverage for P1/P2 scenarios
+- **gate**: Use quality gate criteria for release decisions
+
+**Coordinates with:**
+
+- **framework**: Test infrastructure must exist
+- **ci**: Execution order maps to CI stages
+
+**Updates:**
+
+- `bmm-workflow-status.md`: Adds test design to Quality & Testing Progress
+
+## Important Notes
+
+### Evidence-Based Assessment
+
+**Critical principle**: Base risk assessment on **evidence**, not speculation.
+
+**Evidence sources:**
+
+- PRD and user research
+- Architecture documentation
+- Historical bug data
+- User feedback
+- Security audit results
+
+**When uncertain**: Document assumptions and request user clarification.
+
+**Avoid**:
+
+- Guessing business impact
+- Assuming user behavior
+- Inventing requirements
+
+### Resource Estimation Formula
+
+```
+P0: 2 hours per test (setup + complex scenarios)
+P1: 1 hour per test (standard coverage)
+P2: 0.5 hours per test (simple scenarios)
+P3: 0.25 hours per test (exploratory)
+
+Total Days = Total Hours / 8
+```
+
+Example:
+
+- 15 P0 × 2h = 30h
+- 25 P1 × 1h = 25h
+- 40 P2 × 0.5h = 20h
+- **Total: 75 hours (75 / 8 ≈ 9.4, so ~10 days when rounded up)**
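+
+The formula is simple enough to check by script. The following is a minimal sketch under the hourly rates listed above; `HOURS_PER_TEST`, `HOURS_PER_DAY`, and `estimate_effort` are hypothetical names, not part of the workflow.
+
+```python
+HOURS_PER_TEST = {"P0": 2.0, "P1": 1.0, "P2": 0.5, "P3": 0.25}
+HOURS_PER_DAY = 8
+
+
+def estimate_effort(test_counts: dict[str, int]) -> tuple[float, float]:
+    """Return (total_hours, total_days) for a mapping of priority -> test count."""
+    total_hours = sum(HOURS_PER_TEST[p] * n for p, n in test_counts.items())
+    return total_hours, total_hours / HOURS_PER_DAY
+
+
+hours, days = estimate_effort({"P0": 15, "P1": 25, "P2": 40})
+print(f"{hours:g} hours, {days:.1f} days")  # 75 hours, 9.4 days
+```
+
+Rounding the day count up to a whole number gives a planning figure with a small buffer.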
+
+### Execution Order Strategy
+
+**Smoke tests** (subset of P0, <5 min):
+
+- Login successful
+- Dashboard loads
+- Core API responds
+
+**Purpose**: Fast feedback, catch build-breaking issues immediately.
+
+**P0 tests** (critical paths, <10 min):
+
+- All scenarios blocking user journeys
+- Security-critical flows
+
+**P1 tests** (important features, <30 min):
+
+- Common workflows
+- Medium-risk areas
+
+**P2/P3 tests** (full regression, <60 min):
+
+- Edge cases
+- Performance benchmarks
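+
+One way to keep these time budgets honest is to compare measured stage durations against them. The sketch below assumes durations are collected elsewhere (for example from CI logs); the stage keys and the `over_budget` helper are illustrative names only.
+
+```python
+# Budgets in minutes, taken from the stage descriptions above ("<5 min", etc.).
+STAGE_BUDGET_MINUTES = {"smoke": 5, "p0": 10, "p1": 30, "p2_p3": 60}
+STAGE_ORDER = ["smoke", "p0", "p1", "p2_p3"]
+
+
+def over_budget(durations: dict[str, float]) -> list[str]:
+    """Return stages, in execution order, whose duration reaches or exceeds the budget."""
+    return [
+        stage
+        for stage in STAGE_ORDER
+        if durations.get(stage, 0) >= STAGE_BUDGET_MINUTES[stage]
+    ]
+
+
+print(over_budget({"smoke": 2, "p0": 8, "p1": 20, "p2_p3": 40}))   # []
+print(over_budget({"smoke": 2, "p0": 12, "p1": 20, "p2_p3": 40}))  # ['p0']
+```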
+
+### Quality Gate Criteria
+
+**Pass/Fail thresholds:**
+
+- P0: 100% pass (no exceptions)
+- P1: ≥95% pass (failures within the 5% margin acceptable with waivers)
+- P2/P3: ≥90% pass (informational)
+- High-risk items: All mitigated or have approved waivers
+
+**Coverage targets:**
+
+- Critical paths: ≥80%
+- Security scenarios: 100%
+- Business logic: ≥70%
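+
+The pass/fail thresholds can be sketched as a small check. This is an illustration under stated assumptions, not the gate workflow's logic: `gate_passes` and `unmitigated_high_risks` are hypothetical names, P2/P3 results are treated as informational and never block, and the coverage targets are not modeled.
+
+```python
+# Minimum pass rate in percent for the blocking priorities.
+BLOCKING_THRESHOLDS = {"P0": 100, "P1": 95}
+
+
+def gate_passes(
+    results: dict[str, tuple[int, int]],
+    unmitigated_high_risks: int = 0,
+) -> bool:
+    """results maps priority -> (passed, total).
+
+    unmitigated_high_risks counts high-risk items with neither a
+    mitigation nor an approved waiver.
+    """
+    for priority, minimum in BLOCKING_THRESHOLDS.items():
+        passed, total = results.get(priority, (0, 0))
+        # Integer comparison avoids floating-point edge cases at exactly 95%.
+        if total and passed * 100 < minimum * total:
+            return False
+    return unmitigated_high_risks == 0
+
+
+print(gate_passes({"P0": (12, 12), "P1": (19, 20)}))  # True
+print(gate_passes({"P0": (12, 12), "P1": (17, 18)}))  # False
+```
+
+Note that with 18 P1 tests a single failure already drops the pass rate below 95%, so small suites leave little room for waived failures.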
+
+## Validation Checklist
+
+After workflow completion:
+
+- [ ] Risk assessment complete (all categories)
+- [ ] Risks scored (probability × impact)
+- [ ] High-priority risks (≥6) flagged
+- [ ] Coverage matrix maps requirements to test levels
+- [ ] Priorities assigned (P0-P3)
+- [ ] Execution order defined
+- [ ] Resource estimates provided
+- [ ] Quality gate criteria defined
+- [ ] Output file created
+
+Refer to `checklist.md` for comprehensive validation.
+
+## Example Execution
+
+**Scenario: E-commerce checkout epic**
+
+```bash
+bmad tea *test-design
+# Epic 3: Checkout flow redesign
+
+# Risk Assessment identifies:
+- R-001 (SEC): Payment bypass, P=2 × I=3 = 6 (HIGH)
+- R-002 (PERF): Cart load time, P=3 × I=2 = 6 (HIGH)
+- R-003 (BUS): Order confirmation email, P=2 × I=2 = 4 (MEDIUM)
+
+# Coverage Plan:
+P0 scenarios: 12 tests (payment security, order creation)
+P1 scenarios: 18 tests (cart management, promo codes)
+P2 scenarios: 25 tests (edge cases, error handling)
+
+Total effort: 54.5 hours (~7 days)
+
+# Test Levels:
+- E2E: 8 tests (critical checkout path)
+- API: 30 tests (business logic, payment processing)
+- Unit: 17 tests (calculations, validations)
+
+# Execution Order:
+1. Smoke: Payment successful, order created (2 min)
+2. P0: All payment & security flows (8 min)
+3. P1: Cart & promo codes (20 min)
+4. P2: Edge cases (40 min)
+
+# Quality Gates:
+- P0 pass rate: 100%
+- P1 pass rate: ≥95%
+- R-001 mitigated: Add payment validation layer
+- R-002 mitigated: Implement cart caching
+```
+
+## Troubleshooting
+
+**Issue: "Unable to score risks - missing context"**
+
+- **Cause**: Insufficient documentation
+- **Solution**: Request PRD, architecture docs, or user clarification
+
+**Issue: "All tests marked as P0"**
+
+- **Cause**: Over-prioritization
+- **Solution**: Apply strict P0 criteria (blocks core journey + high risk + no workaround)
+
+**Issue: "Duplicate coverage at multiple test levels"**
+
+- **Cause**: Not following the test pyramid
+- **Solution**: Use E2E for critical paths only, API for logic, unit for edge cases
+
+**Issue: "Resource estimates too high"**
+
+- **Cause**: Complex test setup or insufficient automation
+- **Solution**: Invest in fixtures/factories upfront, reduce per-test setup time
+
+## Related Workflows
+
+- **atdd**: Generate failing tests → [atdd/README.md](../atdd/README.md)
+- **automate**: Expand regression coverage → [automate/README.md](../automate/README.md)
+- **gate**: Quality gate decisions → [gate/README.md](../gate/README.md)
+- **framework**: Test infrastructure → [framework/README.md](../framework/README.md)
+
+## Version History
+
+- **v4.0 (BMad v6)**: Pure markdown instructions, risk scoring framework, template-based output
+- **v3.x**: XML format instructions
+- **v2.x**: Legacy task-based approach