# Test Design and Risk Assessment Workflow
Plans comprehensive test coverage strategy with risk assessment (probability × impact scoring), priority classification (P0-P3), and resource estimation. This workflow generates a test design document that identifies high-risk areas, maps requirements to appropriate test levels, and provides execution ordering for optimal feedback.

## Usage

```bash
bmad tea *test-design
```

The TEA agent runs this workflow when:

- Planning test coverage before development starts
- Assessing risks for an epic or story
- Prioritizing test scenarios by business impact
- Estimating testing effort and resources

## Inputs

**Required Context Files:**

- **Story markdown**: Acceptance criteria and requirements
- **PRD or epics.md**: High-level product context
- **Architecture docs** (optional): Technical constraints and integration points

**Workflow Variables:**

- `epic_num`: Epic number for scoped design
- `story_path`: Specific story for design (optional)
- `design_level`: full/targeted/minimal (default: full)
- `risk_threshold`: Score for high-priority flag (default: 6)
- `risk_categories`: TECH,SEC,PERF,DATA,BUS,OPS (all enabled)
- `priority_levels`: P0,P1,P2,P3 (all enabled)

## Outputs

**Primary Deliverable:**

**Test Design Document** (`test-design-epic-{N}.md`):

1. **Risk Assessment Matrix**
   - Risk ID, category, description
   - Probability (1-3) × Impact (1-3) = Score
   - Scores ≥6 flagged as high-priority
   - Mitigation plans with owners and timelines

2. **Coverage Matrix** (see the illustrative excerpt after this list)
   - Requirement → Test Level (E2E/API/Component/Unit)
   - Priority assignment (P0-P3)
   - Risk linkage
   - Test count estimates

3. **Execution Order**
   - Smoke tests (P0 subset, <5 min)
   - P0 tests (critical paths, <10 min)
   - P1 tests (important features, <30 min)
   - P2/P3 tests (full regression, <60 min)

4. **Resource Estimates**
   - Hours per priority level
   - Total effort in days
   - Tooling and data prerequisites

5. **Quality Gate Criteria**
   - P0 pass rate: 100%
   - P1 pass rate: ≥95%
   - High-risk mitigations: 100%
   - Coverage target: ≥80%
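
As a rough illustration of the coverage matrix (item 2 above), an excerpt might look like the table below. The requirement names, linked risk IDs, and counts are hypothetical placeholders, not output from an actual run.

| Requirement | Test Level | Priority | Linked Risk | Est. Tests |
|---|---|---|---|---|
| Payment is authorized before order creation | API | P0 | R-001 | 4 |
| Cart totals reflect promo code discounts | Unit | P1 | — | 6 |
| Checkout happy path completes end to end | E2E | P0 | R-001 | 1 |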
## Key Features

### Risk Scoring Framework

**Probability × Impact = Risk Score**

**Probability** (1-3):

- 1 (Unlikely): <10% chance
- 2 (Possible): 10-50% chance
- 3 (Likely): >50% chance

**Impact** (1-3):

- 1 (Minor): Cosmetic, workaround exists
- 2 (Degraded): Feature impaired, difficult workaround
- 3 (Critical): System failure, no workaround

**Scores**:

- 1-2: Low risk (monitor)
- 3-4: Medium risk (plan mitigation)
- **6-9: High risk** (immediate mitigation required)
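
The scoring arithmetic can be sketched in a few lines of Python. This is purely illustrative: the workflow applies these rules from its markdown instructions, and the function names below are assumptions, not BMAD APIs. The `risk_threshold` default mirrors the workflow variable of the same name.

```python
# Illustrative sketch only - not part of the BMAD workflow; names are assumptions.
def risk_score(probability: int, impact: int) -> int:
    """Probability (1-3) x Impact (1-3). Note: products of 5, 7, and 8 cannot occur."""
    assert probability in (1, 2, 3) and impact in (1, 2, 3)
    return probability * impact


def classify(score: int, risk_threshold: int = 6) -> str:
    """Map a score to the monitor / plan-mitigation / immediate-mitigation bands."""
    if score >= risk_threshold:
        return "HIGH (immediate mitigation required)"
    if score >= 3:
        return "MEDIUM (plan mitigation)"
    return "LOW (monitor)"


# Example: a payment-bypass style risk with probability 2 and impact 3
print(classify(risk_score(2, 3)))  # -> HIGH (immediate mitigation required)
```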
### Risk Categories (6 types)

**TECH** (Technical/Architecture):

- Architecture flaws, integration failures
- Scalability issues, technical debt

**SEC** (Security):

- Missing access controls, auth bypass
- Data exposure, injection vulnerabilities

**PERF** (Performance):

- SLA violations, response time degradation
- Resource exhaustion, scalability limits

**DATA** (Data Integrity):

- Data loss/corruption, inconsistent state
- Migration failures

**BUS** (Business Impact):

- UX degradation, business logic errors
- Revenue impact, compliance violations

**OPS** (Operations):

- Deployment failures, configuration errors
- Monitoring gaps, rollback issues

### Priority Classification (P0-P3)

**P0 (Critical)** - Run on every commit:

- Blocks core user journey
- High-risk (score ≥6)
- Revenue-impacting or security-critical

**P1 (High)** - Run on PR to main:

- Important user features
- Medium-risk (score 3-4)
- Common workflows

**P2 (Medium)** - Run nightly/weekly:

- Secondary features
- Low-risk (score 1-2)
- Edge cases

**P3 (Low)** - Run on-demand:

- Nice-to-have, exploratory
- Performance benchmarks
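
Read as code, the classification rules might look like the sketch below. The boolean criteria flags are illustrative assumptions, not fields the workflow actually exposes.

```python
# Illustrative sketch only; these criteria flags are assumptions, not BMAD fields.
def assign_priority(blocks_core_journey: bool,
                    risk_score: int,
                    revenue_or_security_critical: bool,
                    common_workflow: bool) -> str:
    """Apply the P0-P3 criteria from most to least critical."""
    if blocks_core_journey or risk_score >= 6 or revenue_or_security_critical:
        return "P0"  # critical paths: run on every commit
    if common_workflow or risk_score >= 3:
        return "P1"  # important features: run on PR to main
    if risk_score >= 1:
        return "P2"  # secondary features and edge cases: run nightly/weekly
    return "P3"      # exploratory / nice-to-have: run on-demand


print(assign_priority(True, 6, True, False))    # -> P0
print(assign_priority(False, 2, False, False))  # -> P2
```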
### Test Level Selection

**E2E (End-to-End)**:

- Critical user journeys
- Multi-system integration
- Highest confidence, slowest

**API (Integration)**:

- Service contracts
- Business logic validation
- Fast feedback, stable

**Component**:

- UI component behavior
- Visual regression
- Fast, isolated

**Unit**:

- Business logic, edge cases
- Error handling
- Fastest, most granular

**Key principle**: Avoid duplicate coverage; don't test the same behavior at multiple levels. For example, a pricing calculation might be covered exhaustively by unit tests, validated once at the API level, and exercised only by the happy-path journey at E2E.
### Knowledge Base Integration

Automatically consults the TEA knowledge base:

- `risk-governance.md` - Risk classification framework
- `probability-impact.md` - Risk scoring methodology
- `test-levels-framework.md` - Test level selection
- `test-priorities-matrix.md` - P0-P3 prioritization
## Integration with Other Workflows

**Before test-design:**

- **plan-project** (Phase 2): Creates PRD and epics
- **solution-architecture** (Phase 3): Defines technical approach
- **tech-spec** (Phase 3): Implementation details

**After test-design:**

- **atdd**: Generate failing tests for P0 scenarios
- **automate**: Expand coverage for P1/P2 scenarios
- **gate**: Use quality gate criteria for release decisions

**Coordinates with:**

- **framework**: Test infrastructure must exist
- **ci**: Execution order maps to CI stages

**Updates:**

- `bmm-workflow-status.md`: Adds test design to Quality & Testing Progress

## Important Notes

### Evidence-Based Assessment

**Critical principle**: Base risk assessment on **evidence**, not speculation.

**Evidence sources:**

- PRD and user research
- Architecture documentation
- Historical bug data
- User feedback
- Security audit results

**When uncertain**: Document assumptions, request user clarification.

**Avoid**:

- Guessing business impact
- Assuming user behavior
- Inventing requirements

### Resource Estimation Formula

```
P0: 2 hours per test (setup + complex scenarios)
P1: 1 hour per test (standard coverage)
P2: 0.5 hours per test (simple scenarios)
P3: 0.25 hours per test (exploratory)

Total Days = Total Hours / 8
```

Example:

- 15 P0 × 2h = 30h
- 25 P1 × 1h = 25h
- 40 P2 × 0.5h = 20h
- **Total: 75 hours (~10 days)**
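
A minimal sketch of the same arithmetic, using the per-test rates and the 8-hour day from the formula block above; the function itself is illustrative, not part of the workflow.

```python
# Illustrative sketch of the formula above; not part of the BMAD workflow.
HOURS_PER_TEST = {"P0": 2.0, "P1": 1.0, "P2": 0.5, "P3": 0.25}

def estimate(counts, hours_per_day=8.0):
    """Return (total_hours, total_days) for a dict of test counts per priority."""
    total_hours = sum(HOURS_PER_TEST[p] * n for p, n in counts.items())
    return total_hours, total_hours / hours_per_day

hours, days = estimate({"P0": 15, "P1": 25, "P2": 40})
print(f"{hours:.1f} hours (~{days:.1f} days)")  # -> 75.0 hours (~9.4 days), rounded up to ~10 above
```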
### Execution Order Strategy

**Smoke tests** (subset of P0, <5 min):

- Login successful
- Dashboard loads
- Core API responds

**Purpose**: Fast feedback, catch build-breaking issues immediately.

**P0 tests** (critical paths, <10 min):

- All scenarios blocking user journeys
- Security-critical flows

**P1 tests** (important features, <30 min):

- Common workflows
- Medium-risk areas

**P2/P3 tests** (full regression, <60 min):

- Edge cases
- Performance benchmarks
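
As a hedged sketch, this staging amounts to grouping priority-tagged tests into time-budgeted stages. The test record fields below are illustrative assumptions, not BMAD output.

```python
# Illustrative sketch only; the test record fields are assumptions, not BMAD output.
STAGE_BUDGETS = {"smoke": "<5 min", "p0": "<10 min", "p1": "<30 min", "p2/p3": "<60 min"}

def stage_for(test: dict) -> str:
    """Place a test into its execution stage from its priority and smoke flag."""
    if test.get("smoke"):
        return "smoke"   # subset of P0, fastest feedback
    if test["priority"] == "P0":
        return "p0"      # critical paths
    if test["priority"] == "P1":
        return "p1"      # important features
    return "p2/p3"       # full regression

tests = [
    {"name": "login works", "priority": "P0", "smoke": True},
    {"name": "payment authorization", "priority": "P0"},
    {"name": "promo code applied", "priority": "P1"},
    {"name": "unicode address edge case", "priority": "P2"},
]
for t in tests:
    s = stage_for(t)
    print(f"{s:6} ({STAGE_BUDGETS[s]})  {t['name']}")
```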
### Quality Gate Criteria

**Pass/Fail thresholds:**

- P0: 100% pass (no exceptions)
- P1: ≥95% pass (2-3 failures acceptable with waivers)
- P2/P3: ≥90% pass (informational)
- High-risk items: All mitigated or have approved waivers

**Coverage targets:**

- Critical paths: ≥80%
- Security scenarios: 100%
- Business logic: ≥70%
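
These thresholds amount to a simple boolean gate check. The sketch below is illustrative only: the field names are assumptions, and the actual decision is made by the gate workflow using its own criteria.

```python
# Illustrative sketch only; field names are assumptions, not the gate workflow's schema.
def gate_passes(results: dict) -> bool:
    """Apply the pass/fail thresholds above to summarized run results."""
    return (
        results["p0_pass_rate"] == 1.0          # P0: 100%, no exceptions
        and results["p1_pass_rate"] >= 0.95     # P1: >=95% (waivers cover the rest)
        and results["high_risks_open"] == 0     # all high-risk items mitigated or waived
        and results["critical_path_coverage"] >= 0.80
    )

print(gate_passes({
    "p0_pass_rate": 1.0,
    "p1_pass_rate": 0.96,
    "high_risks_open": 0,
    "critical_path_coverage": 0.85,
}))  # -> True
```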
## Validation Checklist

After workflow completion:

- [ ] Risk assessment complete (all categories)
- [ ] Risks scored (probability × impact)
- [ ] High-priority risks (≥6) flagged
- [ ] Coverage matrix maps requirements to test levels
- [ ] Priorities assigned (P0-P3)
- [ ] Execution order defined
- [ ] Resource estimates provided
- [ ] Quality gate criteria defined
- [ ] Output file created

Refer to `checklist.md` for comprehensive validation.

## Example Execution

**Scenario: E-commerce checkout epic**

```bash
bmad tea *test-design
# Epic 3: Checkout flow redesign

# Risk Assessment identifies:
- R-001 (SEC): Payment bypass, P=2 × I=3 = 6 (HIGH)
- R-002 (PERF): Cart load time, P=3 × I=2 = 6 (HIGH)
- R-003 (BUS): Order confirmation email, P=2 × I=2 = 4 (MEDIUM)

# Coverage Plan:
P0 scenarios: 12 tests (payment security, order creation)
P1 scenarios: 18 tests (cart management, promo codes)
P2 scenarios: 25 tests (edge cases, error handling)

Total effort: 65 hours (~8 days)

# Test Levels:
- E2E: 8 tests (critical checkout path)
- API: 30 tests (business logic, payment processing)
- Unit: 17 tests (calculations, validations)

# Execution Order:
1. Smoke: Payment successful, order created (2 min)
2. P0: All payment & security flows (8 min)
3. P1: Cart & promo codes (20 min)
4. P2: Edge cases (40 min)

# Quality Gates:
- P0 pass rate: 100%
- P1 pass rate: ≥95%
- R-001 mitigated: Add payment validation layer
- R-002 mitigated: Implement cart caching
```
## Troubleshooting

**Issue: "Unable to score risks - missing context"**

- **Cause**: Insufficient documentation
- **Solution**: Request PRD, architecture docs, or user clarification

**Issue: "All tests marked as P0"**

- **Cause**: Over-prioritization
- **Solution**: Apply strict P0 criteria (blocks core journey + high risk + no workaround)

**Issue: "Duplicate coverage at multiple test levels"**

- **Cause**: Not following test pyramid
- **Solution**: Use E2E for critical paths only, API for logic, unit for edge cases

**Issue: "Resource estimates too high"**

- **Cause**: Complex test setup or insufficient automation
- **Solution**: Invest in fixtures/factories upfront to reduce per-test setup time
## Related Workflows

- **atdd**: Generate failing tests → [atdd/README.md](../atdd/README.md)
- **automate**: Expand regression coverage → [automate/README.md](../automate/README.md)
- **gate**: Quality gate decisions → [gate/README.md](../gate/README.md)
- **framework**: Test infrastructure → [framework/README.md](../framework/README.md)

## Version History

- **v4.0 (BMad v6)**: Pure markdown instructions, risk scoring framework, template-based output
- **v3.x**: XML format instructions
- **v2.x**: Legacy task-based approach