feat: transform QA agent into Test Architect with advanced quality ca… (#433)

* feat: transform QA agent into Test Architect with advanced quality capabilities

  - Add 6 specialized quality assessment commands
  - Implement risk-based testing with scoring
  - Create quality gate system with deterministic decisions
  - Add comprehensive test design and NFR validation
  - Update documentation with stage-based workflow integration

* feat: transform QA agent into Test Architect with advanced quality capabilities

  - Add 6 specialized quality assessment commands
  - Implement risk-based testing with scoring
  - Create quality gate system with deterministic decisions
  - Add comprehensive test design and NFR validation
  - Update documentation with stage-based workflow integration

* docs: refined the docs for test architect

* fix: addressed review comments from manjaroblack, round 1

* fix: addressed review comments from manjaroblack, round 1

---------

Co-authored-by: Murat Ozcan <murat@mac.lan>
Co-authored-by: Brian <bmadcode@gmail.com>
2025-12-29 16:14:59 +00:00 · 2025-08-15 21:02:37 -05:00
parent 33269c888d
commit 0b61175d98
76 changed files with 9245 additions and 1442 deletions
--- a/bmad-core/data/bmad-kb.md
+++ b/bmad-core/data/bmad-kb.md
@@ -542,7 +542,7 @@ Each status change requires user verification and approval before proceeding.
 #### Greenfield Development

 - Business analysis and market research
- Product requirements and feature definition  
+- Product requirements and feature definition
 - System architecture and design
 - Development execution
 - Testing and deployment
@@ -651,8 +651,11 @@ Templates with Level 2 headings (`##`) can be automatically sharded:

 ```markdown
 ## Goals and Background Context
-## Requirements  
+
+## Requirements
+
 ## User Interface Design Goals
+
 ## Success Metrics
 ```

--- a/bmad-core/data/elicitation-methods.md
+++ b/bmad-core/data/elicitation-methods.md
@@ -3,16 +3,19 @@
 ## Core Reflective Methods

 **Expand or Contract for Audience**
+
 - Ask whether to 'expand' (add detail, elaborate) or 'contract' (simplify, clarify)
 - Identify specific target audience if relevant
 - Tailor content complexity and depth accordingly

 **Explain Reasoning (CoT Step-by-Step)**
+
 - Walk through the step-by-step thinking process
 - Reveal underlying assumptions and decision points
 - Show how conclusions were reached from current role's perspective

 **Critique and Refine**
+
 - Review output for flaws, inconsistencies, or improvement areas
 - Identify specific weaknesses from role's expertise
 - Suggest refined version reflecting domain knowledge
@@ -20,12 +23,14 @@
 ## Structural Analysis Methods

 **Analyze Logical Flow and Dependencies**
+
 - Examine content structure for logical progression
 - Check internal consistency and coherence
 - Identify and validate dependencies between elements
 - Confirm effective ordering and sequencing

 **Assess Alignment with Overall Goals**
+
 - Evaluate content contribution to stated objectives
 - Identify any misalignments or gaps
 - Interpret alignment from specific role's perspective
@@ -34,12 +39,14 @@
 ## Risk and Challenge Methods

 **Identify Potential Risks and Unforeseen Issues**
+
 - Brainstorm potential risks from role's expertise
 - Identify overlooked edge cases or scenarios
 - Anticipate unintended consequences
 - Highlight implementation challenges

 **Challenge from Critical Perspective**
+
 - Adopt critical stance on current content
 - Play devil's advocate from specified viewpoint
 - Argue against proposal highlighting weaknesses
@@ -48,12 +55,14 @@
 ## Creative Exploration Methods

 **Tree of Thoughts Deep Dive**
+
 - Break problem into discrete "thoughts" or intermediate steps
 - Explore multiple reasoning paths simultaneously
 - Use self-evaluation to classify each path as "sure", "likely", or "impossible"
 - Apply search algorithms (BFS/DFS) to find optimal solution paths

 **Hindsight is 20/20: The 'If Only...' Reflection**
+
 - Imagine retrospective scenario based on current content
 - Identify the one "if only we had known/done X..." insight
 - Describe imagined consequences humorously or dramatically
@@ -62,6 +71,7 @@
 ## Multi-Persona Collaboration Methods

 **Agile Team Perspective Shift**
+
 - Rotate through different Scrum team member viewpoints
 - Product Owner: Focus on user value and business impact
 - Scrum Master: Examine process flow and team dynamics
@@ -69,12 +79,14 @@
 - QA: Identify testing scenarios and quality concerns

 **Stakeholder Round Table**
+
 - Convene virtual meeting with multiple personas
 - Each persona contributes unique perspective on content
 - Identify conflicts and synergies between viewpoints
 - Synthesize insights into actionable recommendations

 **Meta-Prompting Analysis**
+
 - Step back to analyze the structure and logic of current approach
 - Question the format and methodology being used
 - Suggest alternative frameworks or mental models
@@ -83,24 +95,28 @@
 ## Advanced 2025 Techniques

 **Self-Consistency Validation**
+
 - Generate multiple reasoning paths for same problem
 - Compare consistency across different approaches
 - Identify most reliable and robust solution
 - Highlight areas where approaches diverge and why

 **ReWOO (Reasoning Without Observation)**
+
 - Separate parametric reasoning from tool-based actions
 - Create reasoning plan without external dependencies
 - Identify what can be solved through pure reasoning
 - Optimize for efficiency and reduced token usage

 **Persona-Pattern Hybrid**
+
 - Combine specific role expertise with elicitation pattern
 - Architect + Risk Analysis: Deep technical risk assessment
 - UX Expert + User Journey: End-to-end experience critique
 - PM + Stakeholder Analysis: Multi-perspective impact review

 **Emergent Collaboration Discovery**
+
 - Allow multiple perspectives to naturally emerge
 - Identify unexpected insights from persona interactions
 - Explore novel combinations of viewpoints
@@ -109,18 +125,21 @@
 ## Game-Based Elicitation Methods

 **Red Team vs Blue Team**
+
 - Red Team: Attack the proposal, find vulnerabilities
 - Blue Team: Defend and strengthen the approach
 - Competitive analysis reveals blind spots
 - Results in more robust, battle-tested solutions

 **Innovation Tournament**
+
 - Pit multiple alternative approaches against each other
 - Score each approach across different criteria
 - Crowd-source evaluation from different personas
 - Identify winning combination of features

 **Escape Room Challenge**
+
 - Present content as constraints to work within
 - Find creative solutions within tight limitations
 - Identify minimum viable approach
@@ -129,6 +148,7 @@
 ## Process Control

 **Proceed / No Further Actions**
+
 - Acknowledge choice to finalize current work
 - Accept output as-is or move to next step
 - Prepare to continue without additional elicitation
--- a/bmad-core/data/test-levels-framework.md
+++ b/bmad-core/data/test-levels-framework.md
@@ -0,0 +1,146 @@
+# Test Levels Framework
+
+Comprehensive guide for determining appropriate test levels (unit, integration, E2E) for different scenarios.
+
+## Test Level Decision Matrix
+
+### Unit Tests
+
+**When to use:**
+
+- Testing pure functions and business logic
+- Algorithm correctness
+- Input validation and data transformation
+- Error handling in isolated components
+- Complex calculations or state machines
+
+**Characteristics:**
+
+- Fast execution (immediate feedback)
+- No external dependencies (DB, API, file system)
+- Highly maintainable and stable
+- Easy to debug failures
+
+**Example scenarios:**
+
+```yaml
+unit_test:
+  component: "PriceCalculator"
+  scenario: "Calculate discount with multiple rules"
+  justification: "Complex business logic with multiple branches"
+  mock_requirements: "None - pure function"
+```
+
+### Integration Tests
+
+**When to use:**
+
+- Component interaction verification
+- Database operations and transactions
+- API endpoint contracts
+- Service-to-service communication
+- Middleware and interceptor behavior
+
+**Characteristics:**
+
+- Moderate execution time
+- Tests component boundaries
+- May use test databases or containers
+- Validates system integration points
+
+**Example scenarios:**
+
+```yaml
+integration_test:
+  components: ["UserService", "AuthRepository"]
+  scenario: "Create user with role assignment"
+  justification: "Critical data flow between service and persistence"
+  test_environment: "In-memory database"
+```
+
+### End-to-End Tests
+
+**When to use:**
+
+- Critical user journeys
+- Cross-system workflows
+- Visual regression testing
+- Compliance and regulatory requirements
+- Final validation before release
+
+**Characteristics:**
+
+- Slower execution
+- Tests complete workflows
+- Requires full environment setup
+- Most realistic but most brittle
+
+**Example scenarios:**
+
+```yaml
+e2e_test:
+  journey: "Complete checkout process"
+  scenario: "User purchases with saved payment method"
+  justification: "Revenue-critical path requiring full validation"
+  environment: "Staging with test payment gateway"
+```
+
+## Test Level Selection Rules
+
+### Favor Unit Tests When:
+
+- Logic can be isolated
+- No side effects involved
+- Fast feedback needed
+- High cyclomatic complexity
+
+### Favor Integration Tests When:
+
+- Testing persistence layer
+- Validating service contracts
+- Testing middleware/interceptors
+- Component boundaries critical
+
+### Favor E2E Tests When:
+
+- User-facing critical paths
+- Multi-system interactions
+- Regulatory compliance scenarios
+- Visual regression important
+
+## Anti-patterns to Avoid
+
+- E2E testing for business logic validation
+- Unit testing framework behavior
+- Integration testing third-party libraries
+- Duplicate coverage across levels
+
+## Duplicate Coverage Guard
+
+**Before adding any test, check:**
+
+1. Is this already tested at a lower level?
+2. Can a unit test cover this instead of integration?
+3. Can an integration test cover this instead of E2E?
+
+**Coverage overlap is only acceptable when:**
+
+- Testing different aspects (unit: logic, integration: interaction, e2e: user experience)
+- Critical paths requiring defense in depth
+- Regression prevention for previously broken functionality
+
+## Test Naming Conventions
+
+- Unit: `test_{component}_{scenario}`
+- Integration: `test_{flow}_{interaction}`
+- E2E: `test_{journey}_{outcome}`
+
+## Test ID Format
+
+`{EPIC}.{STORY}-{LEVEL}-{SEQ}`
+
+Examples:
+
+- `1.3-UNIT-001`
+- `1.3-INT-002`
+- `1.3-E2E-001`
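
Reviewer note: the `{EPIC}.{STORY}-{LEVEL}-{SEQ}` format above is regular enough to validate mechanically. The following is a minimal Python sketch, not part of this commit. The names `ParsedTestId`, `parse_test_id`, and `format_test_id` are hypothetical, and the regex assumes numeric epic and story parts, the three level codes shown in the examples (`UNIT`, `INT`, `E2E`), and a three-digit zero-padded sequence.

```python
import re
from typing import NamedTuple

# Assumption: level codes and the 3-digit sequence are inferred from the
# examples (1.3-UNIT-001, 1.3-INT-002, 1.3-E2E-001), not from a formal spec.
TEST_ID_PATTERN = re.compile(r"(\d+)\.(\d+)-(UNIT|INT|E2E)-(\d{3})")


class ParsedTestId(NamedTuple):
    """Hypothetical container for the four parts of a test ID."""
    epic: int
    story: int
    level: str
    seq: int


def parse_test_id(test_id: str) -> ParsedTestId:
    """Split an ID such as '1.3-UNIT-001' into its parts, or raise ValueError."""
    match = TEST_ID_PATTERN.fullmatch(test_id)
    if match is None:
        raise ValueError(f"not a valid test ID: {test_id!r}")
    epic, story, level, seq = match.groups()
    return ParsedTestId(int(epic), int(story), level, int(seq))


def format_test_id(epic: int, story: int, level: str, seq: int) -> str:
    """Build an ID in the {EPIC}.{STORY}-{LEVEL}-{SEQ} layout."""
    return f"{epic}.{story}-{level}-{seq:03d}"
```

A round trip such as `format_test_id(*parse_test_id("1.3-INT-002"))` should return the original string, which makes the pair convenient for linting test IDs in story files.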
--- a/bmad-core/data/test-priorities-matrix.md
+++ b/bmad-core/data/test-priorities-matrix.md
@@ -0,0 +1,172 @@
+# Test Priorities Matrix
+
+Guide for prioritizing test scenarios based on risk, criticality, and business impact.
+
+## Priority Levels
+
+### P0 - Critical (Must Test)
+
+**Criteria:**
+
+- Revenue-impacting functionality
+- Security-critical paths
+- Data integrity operations
+- Regulatory compliance requirements
+- Previously broken functionality (regression prevention)
+
+**Examples:**
+
+- Payment processing
+- Authentication/authorization
+- User data creation/deletion
+- Financial calculations
+- GDPR/privacy compliance
+
+**Testing Requirements:**
+
+- Comprehensive coverage at all levels
+- Both happy and unhappy paths
+- Edge cases and error scenarios
+- Performance under load
+
+### P1 - High (Should Test)
+
+**Criteria:**
+
+- Core user journeys
+- Frequently used features
+- Features with complex logic
+- Integration points between systems
+- Features affecting user experience
+
+**Examples:**
+
+- User registration flow
+- Search functionality
+- Data import/export
+- Notification systems
+- Dashboard displays
+
+**Testing Requirements:**
+
+- Primary happy paths required
+- Key error scenarios
+- Critical edge cases
+- Basic performance validation
+
+### P2 - Medium (Nice to Test)
+
+**Criteria:**
+
+- Secondary features
+- Admin functionality
+- Reporting features
+- Configuration options
+- UI polish and aesthetics
+
+**Examples:**
+
+- Admin settings panels
+- Report generation
+- Theme customization
+- Help documentation
+- Analytics tracking
+
+**Testing Requirements:**
+
+- Happy path coverage
+- Basic error handling
+- Can defer edge cases
+
+### P3 - Low (Test if Time Permits)
+
+**Criteria:**
+
+- Rarely used features
+- Nice-to-have functionality
+- Cosmetic issues
+- Non-critical optimizations
+
+**Examples:**
+
+- Advanced preferences
+- Legacy feature support
+- Experimental features
+- Debug utilities
+
+**Testing Requirements:**
+
+- Smoke tests only
+- Can rely on manual testing
+- Document known limitations
+
+## Risk-Based Priority Adjustments
+
+### Increase Priority When:
+
+- High user impact (affects >50% of users)
+- High financial impact (>$10K potential loss)
+- Security vulnerability potential
+- Compliance/legal requirements
+- Customer-reported issues
+- Complex implementation (>500 LOC)
+- Multiple system dependencies
+
+### Decrease Priority When:
+
+- Feature flag protected
+- Gradual rollout planned
+- Strong monitoring in place
+- Easy rollback capability
+- Low usage metrics
+- Simple implementation
+- Well-isolated component
+
+## Test Coverage by Priority
+
+| Priority | Unit Coverage | Integration Coverage | E2E Coverage       |
+| -------- | ------------- | -------------------- | ------------------ |
+| P0       | >90%          | >80%                 | All critical paths |
+| P1       | >80%          | >60%                 | Main happy paths   |
+| P2       | >60%          | >40%                 | Smoke tests        |
+| P3       | Best effort   | Best effort          | Manual only        |
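
Reviewer note: the numeric columns of the table above could be checked automatically. This is a rough Python sketch under stated assumptions, not something this commit ships. `COVERAGE_TARGETS` and `coverage_gaps` are hypothetical names, the E2E column is left out because it is qualitative, and "Best effort" is modeled as no threshold.

```python
from typing import Dict, List, Optional

# Thresholds copied from the table; None stands for "Best effort" (assumption).
COVERAGE_TARGETS: Dict[str, Dict[str, Optional[float]]] = {
    "P0": {"unit": 90.0, "integration": 80.0},
    "P1": {"unit": 80.0, "integration": 60.0},
    "P2": {"unit": 60.0, "integration": 40.0},
    "P3": {"unit": None, "integration": None},
}


def coverage_gaps(priority: str, unit_pct: float, integration_pct: float) -> List[str]:
    """Return one message per coverage level that misses its target."""
    measured = {"unit": unit_pct, "integration": integration_pct}
    gaps = []
    for level, target in COVERAGE_TARGETS[priority].items():
        # The table uses a strict ">", so hitting the number exactly is a gap.
        if target is not None and not measured[level] > target:
            gaps.append(
                f"{level} coverage {measured[level]:.1f}% is not above {target:.0f}%"
            )
    return gaps
```

An empty list means the measured percentages clear both numeric targets for that priority. E2E expectations such as "All critical paths" would still need a separate, scenario-based check.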
+
+## Priority Assignment Rules
+
+1. **Start with business impact** - What happens if this fails?
+2. **Consider probability** - How likely is failure?
+3. **Factor in detectability** - Would we know if it failed?
+4. **Account for recoverability** - Can we fix it quickly?
+
+## Priority Decision Tree
+
+```
+Is it revenue-critical?
+├─ YES → P0
+└─ NO → Does it affect core user journey?
+    ├─ YES → Is it high-risk?
+    │   ├─ YES → P0
+    │   └─ NO → P1
+    └─ NO → Is it frequently used?
+        ├─ YES → P1
+        └─ NO → Is it customer-facing?
+            ├─ YES → P2
+            └─ NO → P3
+```
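
Reviewer note: the decision tree above maps directly onto a small function. This is a Python sketch for illustration only. The function name `assign_priority` and its boolean parameters are hypothetical, and each parameter corresponds to one question in the tree.

```python
def assign_priority(
    *,
    revenue_critical: bool = False,
    core_journey: bool = False,
    high_risk: bool = False,
    frequently_used: bool = False,
    customer_facing: bool = False,
) -> str:
    """Walk the priority decision tree top to bottom and return P0..P3."""
    if revenue_critical:
        return "P0"
    if core_journey:
        # Risk is only consulted for core user journeys, as in the tree.
        return "P0" if high_risk else "P1"
    if frequently_used:
        return "P1"
    return "P2" if customer_facing else "P3"
```

Because the tree only asks about risk on the core-journey branch, `assign_priority(high_risk=True)` falls through to `"P3"`. Whether that is the intended behavior is a question for the tree itself, not for this sketch.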
+
+## Test Execution Order
+
+1. Execute P0 tests first (fail fast on critical issues)
+2. Execute P1 tests second (core functionality)
+3. Execute P2 tests if time permits
+4. P3 tests only in full regression cycles
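
Reviewer note: the ordering rules above could be applied to a test list as follows. This Python sketch is illustrative, not part of the commit. `plan_execution`, `time_permits`, and `full_regression` are hypothetical names, and tests are assumed to be `(test_id, priority)` pairs.

```python
from typing import List, Tuple

PRIORITY_ORDER = {"P0": 0, "P1": 1, "P2": 2, "P3": 3}


def plan_execution(
    tests: List[Tuple[str, str]],
    time_permits: bool = True,
    full_regression: bool = False,
) -> List[Tuple[str, str]]:
    """Order tests P0 first; drop P2 when time is short, P3 outside full regression."""
    selected = [
        (test_id, priority)
        for test_id, priority in tests
        if (priority != "P2" or time_permits)
        and (priority != "P3" or full_regression)
    ]
    # sorted() is stable, so tests keep their original order within a priority.
    return sorted(selected, key=lambda item: PRIORITY_ORDER[item[1]])
```

Sorting with a stable sort keeps any existing ordering inside a priority band, so a suite that already runs fast tests first stays that way within P0, P1, and so on.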
+
+## Continuous Adjustment
+
+Review and adjust priorities based on:
+
+- Production incident patterns
+- User feedback and complaints
+- Usage analytics
+- Test failure history
+- Business priority changes