feat: transform QA agent into Test Architect with advanced quality ca… (#433)

* feat: transform QA agent into Test Architect with advanced quality capabilities

  - Add 6 specialized quality assessment commands
  - Implement risk-based testing with scoring
  - Create quality gate system with deterministic decisions
  - Add comprehensive test design and NFR validation
  - Update documentation with stage-based workflow integration

* feat: transform QA agent into Test Architect with advanced quality capabilities

  - Add 6 specialized quality assessment commands
  - Implement risk-based testing with scoring
  - Create quality gate system with deterministic decisions
  - Add comprehensive test design and NFR validation
  - Update documentation with stage-based workflow integration

* docs: refined the docs for test architect

* fix: addressed review comments from manjaroblack, round 1

* fix: addressed review comments from manjaroblack, round 1

---------

Co-authored-by: Murat Ozcan <murat@mac.lan>
Co-authored-by: Brian <bmadcode@gmail.com>
This commit is contained in:
Murat K Ozcan
2025-08-15 21:02:37 -05:00
committed by GitHub
parent 33269c888d
commit 0b61175d98
76 changed files with 9245 additions and 1442 deletions

View File

@@ -542,7 +542,7 @@ Each status change requires user verification and approval before proceeding.
#### Greenfield Development
- Business analysis and market research
- Product requirements and feature definition
- Product requirements and feature definition
- System architecture and design
- Development execution
- Testing and deployment
@@ -651,8 +651,11 @@ Templates with Level 2 headings (`##`) can be automatically sharded:
```markdown
## Goals and Background Context
## Requirements
## Requirements
## User Interface Design Goals
## Success Metrics
```

View File

@@ -3,16 +3,19 @@
## Core Reflective Methods
**Expand or Contract for Audience**
- Ask whether to 'expand' (add detail, elaborate) or 'contract' (simplify, clarify)
- Identify specific target audience if relevant
- Tailor content complexity and depth accordingly
**Explain Reasoning (CoT Step-by-Step)**
- Walk through the step-by-step thinking process
- Reveal underlying assumptions and decision points
- Show how conclusions were reached from current role's perspective
**Critique and Refine**
- Review output for flaws, inconsistencies, or improvement areas
- Identify specific weaknesses from role's expertise
- Suggest refined version reflecting domain knowledge
@@ -20,12 +23,14 @@
## Structural Analysis Methods
**Analyze Logical Flow and Dependencies**
- Examine content structure for logical progression
- Check internal consistency and coherence
- Identify and validate dependencies between elements
- Confirm effective ordering and sequencing
**Assess Alignment with Overall Goals**
- Evaluate content contribution to stated objectives
- Identify any misalignments or gaps
- Interpret alignment from specific role's perspective
@@ -34,12 +39,14 @@
## Risk and Challenge Methods
**Identify Potential Risks and Unforeseen Issues**
- Brainstorm potential risks from role's expertise
- Identify overlooked edge cases or scenarios
- Anticipate unintended consequences
- Highlight implementation challenges
**Challenge from Critical Perspective**
- Adopt critical stance on current content
- Play devil's advocate from specified viewpoint
- Argue against proposal highlighting weaknesses
@@ -48,12 +55,14 @@
## Creative Exploration Methods
**Tree of Thoughts Deep Dive**
- Break problem into discrete "thoughts" or intermediate steps
- Explore multiple reasoning paths simultaneously
- Use self-evaluation to classify each path as "sure", "likely", or "impossible"
- Apply search algorithms (BFS/DFS) to find optimal solution paths
**Hindsight is 20/20: The 'If Only...' Reflection**
- Imagine retrospective scenario based on current content
- Identify the one "if only we had known/done X..." insight
- Describe imagined consequences humorously or dramatically
@@ -62,6 +71,7 @@
## Multi-Persona Collaboration Methods
**Agile Team Perspective Shift**
- Rotate through different Scrum team member viewpoints
- Product Owner: Focus on user value and business impact
- Scrum Master: Examine process flow and team dynamics
@@ -69,12 +79,14 @@
- QA: Identify testing scenarios and quality concerns
**Stakeholder Round Table**
- Convene virtual meeting with multiple personas
- Each persona contributes unique perspective on content
- Identify conflicts and synergies between viewpoints
- Synthesize insights into actionable recommendations
**Meta-Prompting Analysis**
- Step back to analyze the structure and logic of current approach
- Question the format and methodology being used
- Suggest alternative frameworks or mental models
@@ -83,24 +95,28 @@
## Advanced 2025 Techniques
**Self-Consistency Validation**
- Generate multiple reasoning paths for same problem
- Compare consistency across different approaches
- Identify most reliable and robust solution
- Highlight areas where approaches diverge and why
**ReWOO (Reasoning Without Observation)**
- Separate parametric reasoning from tool-based actions
- Create reasoning plan without external dependencies
- Identify what can be solved through pure reasoning
- Optimize for efficiency and reduced token usage
**Persona-Pattern Hybrid**
- Combine specific role expertise with elicitation pattern
- Architect + Risk Analysis: Deep technical risk assessment
- UX Expert + User Journey: End-to-end experience critique
- PM + Stakeholder Analysis: Multi-perspective impact review
**Emergent Collaboration Discovery**
- Allow multiple perspectives to naturally emerge
- Identify unexpected insights from persona interactions
- Explore novel combinations of viewpoints
@@ -109,18 +125,21 @@
## Game-Based Elicitation Methods
**Red Team vs Blue Team**
- Red Team: Attack the proposal, find vulnerabilities
- Blue Team: Defend and strengthen the approach
- Competitive analysis reveals blind spots
- Results in more robust, battle-tested solutions
**Innovation Tournament**
- Pit multiple alternative approaches against each other
- Score each approach across different criteria
- Crowd-source evaluation from different personas
- Identify winning combination of features
**Escape Room Challenge**
- Present content as constraints to work within
- Find creative solutions within tight limitations
- Identify minimum viable approach
@@ -129,6 +148,7 @@
## Process Control
**Proceed / No Further Actions**
- Acknowledge choice to finalize current work
- Accept output as-is or move to next step
- Prepare to continue without additional elicitation

View File

@@ -0,0 +1,146 @@
# Test Levels Framework
Comprehensive guide for determining appropriate test levels (unit, integration, E2E) for different scenarios.
## Test Level Decision Matrix
### Unit Tests
**When to use:**
- Testing pure functions and business logic
- Algorithm correctness
- Input validation and data transformation
- Error handling in isolated components
- Complex calculations or state machines
**Characteristics:**
- Fast execution (immediate feedback)
- No external dependencies (DB, API, file system)
- Highly maintainable and stable
- Easy to debug failures
**Example scenarios:**
```yaml
unit_test:
component: "PriceCalculator"
scenario: "Calculate discount with multiple rules"
justification: "Complex business logic with multiple branches"
mock_requirements: "None - pure function"
```
### Integration Tests
**When to use:**
- Component interaction verification
- Database operations and transactions
- API endpoint contracts
- Service-to-service communication
- Middleware and interceptor behavior
**Characteristics:**
- Moderate execution time
- Tests component boundaries
- May use test databases or containers
- Validates system integration points
**Example scenarios:**
```yaml
integration_test:
components: ["UserService", "AuthRepository"]
scenario: "Create user with role assignment"
justification: "Critical data flow between service and persistence"
test_environment: "In-memory database"
```
### End-to-End Tests
**When to use:**
- Critical user journeys
- Cross-system workflows
- Visual regression testing
- Compliance and regulatory requirements
- Final validation before release
**Characteristics:**
- Slower execution
- Tests complete workflows
- Requires full environment setup
- Most realistic but most brittle
**Example scenarios:**
```yaml
e2e_test:
journey: "Complete checkout process"
scenario: "User purchases with saved payment method"
justification: "Revenue-critical path requiring full validation"
environment: "Staging with test payment gateway"
```
## Test Level Selection Rules
### Favor Unit Tests When:
- Logic can be isolated
- No side effects involved
- Fast feedback needed
- High cyclomatic complexity
### Favor Integration Tests When:
- Testing persistence layer
- Validating service contracts
- Testing middleware/interceptors
- Component boundaries critical
### Favor E2E Tests When:
- User-facing critical paths
- Multi-system interactions
- Regulatory compliance scenarios
- Visual regression important
## Anti-patterns to Avoid
- E2E testing for business logic validation
- Unit testing framework behavior
- Integration testing third-party libraries
- Duplicate coverage across levels
## Duplicate Coverage Guard
**Before adding any test, check:**
1. Is this already tested at a lower level?
2. Can a unit test cover this instead of integration?
3. Can an integration test cover this instead of E2E?
**Coverage overlap is only acceptable when:**
- Testing different aspects (unit: logic, integration: interaction, e2e: user experience)
- Critical paths requiring defense in depth
- Regression prevention for previously broken functionality
## Test Naming Conventions
- Unit: `test_{component}_{scenario}`
- Integration: `test_{flow}_{interaction}`
- E2E: `test_{journey}_{outcome}`
## Test ID Format
`{EPIC}.{STORY}-{LEVEL}-{SEQ}`
Examples:
- `1.3-UNIT-001`
- `1.3-INT-002`
- `1.3-E2E-001`

View File

@@ -0,0 +1,172 @@
# Test Priorities Matrix
Guide for prioritizing test scenarios based on risk, criticality, and business impact.
## Priority Levels
### P0 - Critical (Must Test)
**Criteria:**
- Revenue-impacting functionality
- Security-critical paths
- Data integrity operations
- Regulatory compliance requirements
- Previously broken functionality (regression prevention)
**Examples:**
- Payment processing
- Authentication/authorization
- User data creation/deletion
- Financial calculations
- GDPR/privacy compliance
**Testing Requirements:**
- Comprehensive coverage at all levels
- Both happy and unhappy paths
- Edge cases and error scenarios
- Performance under load
### P1 - High (Should Test)
**Criteria:**
- Core user journeys
- Frequently used features
- Features with complex logic
- Integration points between systems
- Features affecting user experience
**Examples:**
- User registration flow
- Search functionality
- Data import/export
- Notification systems
- Dashboard displays
**Testing Requirements:**
- Primary happy paths required
- Key error scenarios
- Critical edge cases
- Basic performance validation
### P2 - Medium (Nice to Test)
**Criteria:**
- Secondary features
- Admin functionality
- Reporting features
- Configuration options
- UI polish and aesthetics
**Examples:**
- Admin settings panels
- Report generation
- Theme customization
- Help documentation
- Analytics tracking
**Testing Requirements:**
- Happy path coverage
- Basic error handling
- Can defer edge cases
### P3 - Low (Test if Time Permits)
**Criteria:**
- Rarely used features
- Nice-to-have functionality
- Cosmetic issues
- Non-critical optimizations
**Examples:**
- Advanced preferences
- Legacy feature support
- Experimental features
- Debug utilities
**Testing Requirements:**
- Smoke tests only
- Can rely on manual testing
- Document known limitations
## Risk-Based Priority Adjustments
### Increase Priority When:
- High user impact (affects >50% of users)
- High financial impact (>$10K potential loss)
- Security vulnerability potential
- Compliance/legal requirements
- Customer-reported issues
- Complex implementation (>500 LOC)
- Multiple system dependencies
### Decrease Priority When:
- Feature flag protected
- Gradual rollout planned
- Strong monitoring in place
- Easy rollback capability
- Low usage metrics
- Simple implementation
- Well-isolated component
## Test Coverage by Priority
| Priority | Unit Coverage | Integration Coverage | E2E Coverage |
| -------- | ------------- | -------------------- | ------------------ |
| P0 | >90% | >80% | All critical paths |
| P1 | >80% | >60% | Main happy paths |
| P2 | >60% | >40% | Smoke tests |
| P3 | Best effort | Best effort | Manual only |
## Priority Assignment Rules
1. **Start with business impact** - What happens if this fails?
2. **Consider probability** - How likely is failure?
3. **Factor in detectability** - Would we know if it failed?
4. **Account for recoverability** - Can we fix it quickly?
## Priority Decision Tree
```
Is it revenue-critical?
├─ YES → P0
└─ NO → Does it affect core user journey?
├─ YES → Is it high-risk?
│ ├─ YES → P0
│ └─ NO → P1
└─ NO → Is it frequently used?
├─ YES → P1
└─ NO → Is it customer-facing?
├─ YES → P2
└─ NO → P3
```
## Test Execution Order
1. Execute P0 tests first (fail fast on critical issues)
2. Execute P1 tests second (core functionality)
3. Execute P2 tests if time permits
4. P3 tests only in full regression cycles
## Continuous Adjustment
Review and adjust priorities based on:
- Production incident patterns
- User feedback and complaints
- Usage analytics
- Test failure history
- Business priority changes