mirror of
https://github.com/bmadcode/BMAD-METHOD.git
synced 2025-12-29 16:14:59 +00:00
docs: massive documentation overhaul + introduce Paige (Documentation Guide agent)
## 📚 Complete Documentation Restructure

**BMM Documentation Hub Created:**

- New centralized documentation system at `src/modules/bmm/docs/`
- 18 comprehensive guides organized by topic (7000+ lines total)
- Clear learning paths for greenfield, brownfield, and quick spec flows
- Professional technical writing standards throughout

**New Documentation:**

- `README.md` - Complete documentation hub with navigation
- `quick-start.md` - 15-minute getting started guide
- `agents-guide.md` - Comprehensive 12-agent reference (45 min read)
- `party-mode.md` - Multi-agent collaboration guide (20 min read)
- `scale-adaptive-system.md` - Deep dive on Levels 0-4 (42 min read)
- `brownfield-guide.md` - Existing codebase development (53 min read)
- `quick-spec-flow.md` - Rapid Level 0-1 development (26 min read)
- `workflows-analysis.md` - Phase 1 workflows (12 min read)
- `workflows-planning.md` - Phase 2 workflows (19 min read)
- `workflows-solutioning.md` - Phase 3 workflows (13 min read)
- `workflows-implementation.md` - Phase 4 workflows (33 min read)
- `workflows-testing.md` - Testing & QA workflows (29 min read)
- `workflow-architecture-reference.md` - Architecture workflow deep-dive
- `workflow-document-project-reference.md` - Document-project workflow reference
- `enterprise-agentic-development.md` - Team collaboration patterns
- `faq.md` - Comprehensive Q&A covering all topics
- `glossary.md` - Complete terminology reference
- `troubleshooting.md` - Common issues and solutions

**Documentation Improvements:**

- Removed all version/date footers (git handles versioning)
- Agent customization docs now include full rebuild process
- Cross-referenced links between all guides
- Reading time estimates for all major docs
- Consistent professional formatting and structure

**Consolidated & Streamlined:**

- Module README (`src/modules/bmm/README.md`) streamlined to a lean signpost
- Root README polished with better hierarchy and clear CTAs
- Moved docs from root `docs/` to module-specific locations
- Better separation of user docs vs. developer reference

## 🤖 New Agent: Paige (Documentation Guide)

**Role:** Technical documentation specialist and information architect

**Expertise:**

- Professional technical writing standards
- Documentation structure and organization
- Information architecture and navigation
- User-focused content design
- Style guide enforcement

**Status:** Work in progress - Paige will evolve as documentation needs grow

**Integration:**

- Listed in agents-guide.md, glossary.md, FAQ
- Available for all phases (documentation is continuous)
- Can be customized like all BMM agents

## 🔧 Additional Changes

- Updated agent manifest with Paige
- Updated workflow manifest with new documentation workflows
- Fixed workflow-to-agent mappings across all guides
- Improved root README with clearer Quick Start section
- Better module structure explanations
- Enhanced community links with Discord channel names

**Total Impact:**

- 18 new/restructured documentation files
- 7000+ lines of professional technical documentation
- Complete navigation system with cross-references
- Clear learning paths for all user types
- Foundation for knowledge base (coming in beta)

Co-Authored-By: Claude <noreply@anthropic.com>
@@ -1,26 +0,0 @@
# Test Architect Workflows

This directory houses the per-command workflows used by the Test Architect agent (`tea`). Each workflow wraps the standalone instructions that used to live under `testarch/` so they can run through the standard BMAD workflow runner.

## Available workflows

- `framework` – scaffolds Playwright/Cypress harnesses.
- `atdd` – generates failing acceptance tests before coding.
- `automate` – expands regression coverage after implementation.
- `ci` – bootstraps CI/CD pipelines aligned with TEA practices.
- `test-design` – combines risk assessment and coverage planning.
- `trace` – maps requirements to tests (Phase 1) and makes quality gate decisions (Phase 2).
- `nfr-assess` – evaluates non-functional requirements.
- `test-review` – reviews test quality using knowledge base patterns and generates a quality score.

**Note**: The `gate` workflow has been merged into `trace` as Phase 2. The `*trace` command now performs both requirements-to-tests traceability mapping AND the quality gate decision (PASS/CONCERNS/FAIL/WAIVED) in a single atomic operation.

Each subdirectory contains:

- `README.md` – comprehensive workflow documentation with usage, inputs, outputs, and integration notes.
- `instructions.md` – detailed workflow steps in pure markdown v4.0 format.
- `workflow.yaml` – metadata, variables, and configuration for the BMAD workflow runner.
- `checklist.md` – validation checklist for quality assurance and completeness verification.
- `template.md` – output template for workflow deliverables (where applicable).

The TEA agent now invokes these workflows via `run-workflow` rather than executing instruction files directly.

@@ -1,672 +0,0 @@
# ATDD (Acceptance Test-Driven Development) Workflow

Generates failing acceptance tests BEFORE implementation, following TDD's red-green-refactor cycle. Creates comprehensive test coverage at appropriate levels (E2E, API, Component) with supporting infrastructure (fixtures, factories, mocks) and provides an implementation checklist to guide development toward passing tests.

**Core Principle**: Tests fail first (red phase), guide development to green, then enable confident refactoring.

## Usage

```bash
bmad tea *atdd
```

The TEA agent runs this workflow when:

- A user story is approved with clear acceptance criteria
- Development is about to begin (before any implementation code)
- The team is practicing Test-Driven Development (TDD)
- A test-first contract needs to be established with the DEV team

## Inputs

**Required Context Files:**

- **Story markdown** (`{story_file}`): User story with acceptance criteria, functional requirements, and technical constraints
- **Framework configuration**: Test framework config (playwright.config.ts or cypress.config.ts) from the framework workflow

**Workflow Variables:**

- `story_file`: Path to story markdown with acceptance criteria (required)
- `test_dir`: Directory for test files (default: `{project-root}/tests`)
- `test_framework`: Detected from framework workflow (playwright or cypress)
- `test_levels`: Which test levels to generate (default: "e2e,api,component")
- `primary_level`: Primary test level for acceptance criteria (default: "e2e")
- `start_failing`: Tests must fail initially - red phase (default: true)
- `use_given_when_then`: BDD-style test structure (default: true)
- `network_first`: Route interception before navigation to prevent race conditions (default: true)
- `one_assertion_per_test`: Atomic test design (default: true)
- `generate_factories`: Create data factory stubs using faker (default: true)
- `generate_fixtures`: Create fixture architecture with auto-cleanup (default: true)
- `auto_cleanup`: Fixtures clean up their data automatically (default: true)
- `include_data_testids`: List required data-testid attributes for DEV (default: true)
- `include_mock_requirements`: Document mock/stub needs (default: true)
- `auto_load_knowledge`: Load fixture-architecture, data-factories, component-tdd fragments (default: true)
- `share_with_dev`: Provide implementation checklist to DEV agent (default: true)
- `output_checklist`: Path for implementation checklist (default: `{output_folder}/atdd-checklist-{story_id}.md`)

**Optional Context:**

- **Test design document**: For risk/priority context alignment (P0-P3 scenarios)
- **Existing fixtures/helpers**: For consistency with established patterns
- **Architecture documents**: For understanding system boundaries and integration points

## Outputs

**Primary Deliverable:**

- **ATDD Checklist** (`atdd-checklist-{story_id}.md`): Implementation guide containing:
  - Story summary and acceptance criteria breakdown
  - Test files created with paths and line counts
  - Data factories created with patterns
  - Fixtures created with auto-cleanup logic
  - Mock requirements for external services
  - Required data-testid attributes list
  - Implementation checklist mapping tests to code tasks
  - Red-green-refactor workflow guidance
  - Execution commands for running tests

**Test Files Created:**

- **E2E tests** (`tests/e2e/{feature-name}.spec.ts`): Full user journey tests for critical paths
- **API tests** (`tests/api/{feature-name}.api.spec.ts`): Business logic and service contract tests
- **Component tests** (`tests/component/{ComponentName}.test.tsx`): UI component behavior tests

**Supporting Infrastructure:**

- **Data factories** (`tests/support/factories/{entity}.factory.ts`): Factory functions using @faker-js/faker for generating test data with overrides support
- **Test fixtures** (`tests/support/fixtures/{feature}.fixture.ts`): Playwright fixtures with setup/teardown and auto-cleanup
- **Mock/stub documentation**: Requirements for external service mocking (payment gateways, email services, etc.)
- **data-testid requirements**: List of required test IDs for stable selectors in UI implementation

**Validation Safeguards:**

- All tests must fail initially (red phase verified by local test run)
- Failure messages are clear and actionable
- Tests use Given-When-Then format for readability
- Network-first pattern applied (route interception before navigation)
- One assertion per test (atomic test design)
- No hard waits or sleeps (explicit waits only)

## Key Features

### Red-Green-Refactor Cycle

**RED Phase** (TEA Agent responsibility):

- Write failing tests first, defining expected behavior
- Tests fail for the right reason (missing implementation, not test bugs)
- All supporting infrastructure (factories, fixtures, mocks) created

**GREEN Phase** (DEV Agent responsibility):

- Implement minimal code to pass one test at a time
- Use the implementation checklist as a guide
- Run tests frequently to verify progress

**REFACTOR Phase** (DEV Agent responsibility):

- Improve code quality with confidence (tests provide a safety net)
- Extract duplications, optimize performance
- Ensure tests still pass after changes

### Test Level Selection Framework

**E2E (End-to-End)**:

- Critical user journeys (login, checkout, core workflows)
- Multi-system integration
- User-facing acceptance criteria
- Characteristics: High confidence, slow execution, brittle

**API (Integration)**:

- Business logic validation
- Service contracts and data transformations
- Backend integration without UI
- Characteristics: Fast feedback, good balance, stable

**Component**:

- UI component behavior (buttons, forms, modals)
- Interaction testing (click, hover, keyboard navigation)
- Visual regression and state management
- Characteristics: Fast, isolated, granular

**Unit**:

- Pure business logic and algorithms
- Edge cases and error handling
- Minimal dependencies
- Characteristics: Fastest, most granular

**Selection Strategy**: Avoid duplicate coverage. Use E2E for the critical happy path, API for business logic variations, component for UI edge cases, and unit for pure logic.
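
As an illustration of the API level (business-logic variation without UI), a test of this shape could sit alongside the E2E suite. This is a sketch only: the endpoint path and the factory import are assumptions, not output this workflow is guaranteed to produce.

```typescript
// tests/api/auth.api.spec.ts (illustrative)
import { test, expect } from '@playwright/test';
import { createUser } from '../support/factories/user.factory';

test('POST /api/auth/login should return 401 for invalid credentials', async ({ request }) => {
  // GIVEN: a user that does not exist in the system
  const user = createUser();

  // WHEN: logging in with a wrong password (no browser involved)
  const response = await request.post('/api/auth/login', {
    data: { email: user.email, password: 'wrong-password' },
  });

  // THEN: the API rejects the request (one assertion, atomic)
  expect(response.status()).toBe(401);
});
```

Because it uses Playwright's `request` fixture rather than a page, this variation runs much faster than its E2E counterpart and is not affected by UI churn.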
### Recording Mode (NEW - Phase 2.5)

**atdd** can record complex UI interactions instead of generating them with AI.

**Activation**: Automatic for complex UI when `config.tea_use_mcp_enhancements` is true and MCP is available

- Fallback: AI generation (silent, automatic)

**When to Use Recording Mode:**

- ✅ Complex UI interactions (drag-drop, multi-step forms, wizards)
- ✅ Visual workflows (modals, dialogs, animations)
- ✅ Unclear requirements (exploratory, discovering expected behavior)
- ✅ Multi-page flows (checkout, registration, onboarding)
- ❌ NOT for simple CRUD (AI generation is faster)
- ❌ NOT for API-only tests (no UI to record)

**When to Use AI Generation (Default):**

- ✅ Clear acceptance criteria available
- ✅ Standard patterns (login, CRUD, navigation)
- ✅ Need many tests quickly
- ✅ API/backend tests (no UI interaction)

**How Test Generation Works (Default - AI-Based):**

TEA generates tests using AI by:

1. **Analyzing acceptance criteria** from story markdown
2. **Inferring selectors** from requirement descriptions (e.g., "login button" → `[data-testid="login-button"]`)
3. **Synthesizing test code** based on knowledge base patterns
4. **Estimating interactions** using common UI patterns (click, type, verify)
5. **Applying best practices** from knowledge fragments (Given-When-Then, network-first, fixtures)

**This works well for:**

- ✅ Clear requirements with known UI patterns
- ✅ Standard workflows (login, CRUD, navigation)
- ✅ When selectors follow conventions (data-testid attributes)

**What MCP Adds (Interactive Verification & Enhancement):**

When Playwright MCP is available, TEA **additionally**:

1. **Verifies generated tests** by:
   - **Launching a real browser** with `generator_setup_page`
   - **Executing generated test steps** with `browser_*` tools (`navigate`, `click`, `type`)
   - **Seeing the actual UI** with `browser_snapshot` (visual verification)
   - **Discovering real selectors** with `browser_generate_locator` (auto-generated from the live DOM)

2. **Enhances AI-generated tests** by:
   - **Validating that selectors exist** in the actual DOM (not just guesses)
   - **Verifying behavior** with `browser_verify_text`, `browser_verify_visible`, `browser_verify_url`
   - **Capturing the actual interaction log** with `generator_read_log`
   - **Refining test code** with real observed behavior

3. **Catches issues early** by:
   - **Finding missing selectors** before DEV implements (requirements clarification)
   - **Discovering edge cases** not in requirements (loading states, error messages)
   - **Validating assumptions** about UI structure and behavior

**Key Benefits of MCP Enhancement:**

- ✅ **AI generates tests** (fast, based on requirements) **+** **MCP verifies tests** (accurate, based on reality)
- ✅ **Accurate selectors**: Validated against the actual DOM, not just inferred
- ✅ **Visual validation**: TEA sees what the user sees (modals, animations, state changes)
- ✅ **Complex flows**: Records multi-step interactions precisely
- ✅ **Edge case discovery**: Observes actual app behavior beyond requirements
- ✅ **Selector resilience**: MCP generates robust locators from the live page (role-based, text-based, fallback chains)

**Example Enhancement Flow:**

```
1. AI generates test based on acceptance criteria
   → await page.click('[data-testid="submit-button"]')

2. MCP verifies selector exists (browser_generate_locator)
   → Found: button[type="submit"].btn-primary
   → No data-testid attribute exists!

3. TEA refines test with actual selector
   → await page.locator('button[type="submit"]').click()
   → Documents requirement: "Add data-testid='submit-button' to button"
```

**Recording Workflow (MCP-Based):**

```
1. Set generation_mode: "recording"
2. Use generator_setup_page to init recording session
3. For each acceptance criterion:
   a. Execute scenario with browser_* tools:
      - browser_navigate, browser_click, browser_type
      - browser_select, browser_check
   b. Add verifications with browser_verify_* tools:
      - browser_verify_text, browser_verify_visible
      - browser_verify_url
   c. Capture log with generator_read_log
   d. Generate test with generator_write_test
4. Enhance generated tests with knowledge base patterns:
   - Add Given-When-Then comments
   - Replace selectors with data-testid
   - Add network-first interception
   - Add fixtures/factories
5. Verify tests fail (RED phase)
```

**Example: Recording a Checkout Flow**

```markdown
Recording session for: "User completes checkout with credit card"

Actions recorded:

1. browser_navigate('/cart')
2. browser_click('[data-testid="checkout-button"]')
3. browser_type('[data-testid="card-number"]', '4242424242424242')
4. browser_type('[data-testid="expiry"]', '12/25')
5. browser_type('[data-testid="cvv"]', '123')
6. browser_click('[data-testid="place-order"]')
7. browser_verify_text('Order confirmed')
8. browser_verify_url('/confirmation')

Generated test (enhanced):

- Given-When-Then structure added
- data-testid selectors used
- Network-first payment API mock added
- Card factory created for test data
- Test verified to FAIL (checkout not implemented)
```

**Graceful Degradation:**

- Recording mode is OPTIONAL (default: AI generation)
- Requires Playwright MCP (falls back to AI if unavailable)
- Generated tests are enhanced with knowledge base patterns
- Same quality output regardless of generation method

### Given-When-Then Structure

All tests follow BDD format for clarity:

```typescript
test('should display error for invalid credentials', async ({ page }) => {
  // GIVEN: User is on login page
  await page.goto('/login');

  // WHEN: User submits invalid credentials
  await page.fill('[data-testid="email-input"]', 'invalid@example.com');
  await page.fill('[data-testid="password-input"]', 'wrongpassword');
  await page.click('[data-testid="login-button"]');

  // THEN: Error message is displayed
  await expect(page.locator('[data-testid="error-message"]')).toHaveText('Invalid email or password');
});
```

### Network-First Testing Pattern

**Critical pattern to prevent race conditions**:

```typescript
// ✅ CORRECT: Intercept BEFORE navigation
await page.route('**/api/data', handler);
await page.goto('/page');

// ❌ WRONG: Navigate then intercept (race condition)
await page.goto('/page');
await page.route('**/api/data', handler); // Too late!
```

Always set up route interception before navigating to pages that make network requests.
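
In practice the `handler` above is usually a mock that fulfills the request. A sketch of the full network-first shape, where the `/api/products` route, payload, and `product-name` test ID are invented for illustration:

```typescript
import { test, expect } from '@playwright/test';

test('should render products from mocked API', async ({ page }) => {
  // GIVEN: the products API is mocked BEFORE navigation (network-first)
  await page.route('**/api/products', (route) =>
    route.fulfill({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify([{ id: 1, name: 'Widget' }]),
    }),
  );

  // WHEN: the page loads, its request hits the already-active interception
  await page.goto('/products');

  // THEN: the mocked data is displayed deterministically
  await expect(page.locator('[data-testid="product-name"]')).toHaveText('Widget');
});
```

Because the interception is registered first, the test never races the page's initial fetch and stays deterministic.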
### Data Factory Architecture

Use faker for all test data generation:

```typescript
// tests/support/factories/user.factory.ts
import { faker } from '@faker-js/faker';

export const createUser = (overrides = {}) => ({
  id: faker.number.int(),
  email: faker.internet.email(),
  name: faker.person.fullName(),
  createdAt: faker.date.recent().toISOString(),
  ...overrides,
});

export const createUsers = (count: number) => Array.from({ length: count }, () => createUser());
```

**Factory principles:**

- Use faker for random data (no hardcoded values, to prevent collisions)
- Support overrides for specific test scenarios
- Generate complete valid objects matching API contracts
- Include helper functions for bulk creation

### Fixture Architecture with Auto-Cleanup

Playwright fixtures with automatic data cleanup:

```typescript
// tests/support/fixtures/auth.fixture.ts
import { test as base } from '@playwright/test';
// Factory helpers (deleteUser is assumed to live alongside createUser)
import { createUser, deleteUser } from '../factories/user.factory';

export const test = base.extend({
  authenticatedUser: async ({ page }, use) => {
    // Setup: Create and authenticate user
    const user = await createUser();
    await page.goto('/login');
    await page.fill('[data-testid="email"]', user.email);
    await page.fill('[data-testid="password"]', 'password123');
    await page.click('[data-testid="login-button"]');
    await page.waitForURL('/dashboard');

    // Provide to test
    await use(user);

    // Cleanup: Delete user (automatic)
    await deleteUser(user.id);
  },
});
```

**Fixture principles:**

- Auto-cleanup (always delete created data in teardown)
- Composable (fixtures can use other fixtures via mergeTests)
- Isolated (each test gets fresh data)
- Type-safe with TypeScript
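
Composition via `mergeTests` can be sketched as follows. The second fixture file (`data.fixture`) is hypothetical; the pattern assumes each file exports a `test` built with `base.extend`:

```typescript
// tests/support/fixtures/index.fixture.ts (illustrative path)
import { mergeTests } from '@playwright/test';
import { test as authTest } from './auth.fixture';
import { test as dataTest } from './data.fixture'; // hypothetical second fixture file

// A single `test` exposing the fixtures from both files
export const test = mergeTests(authTest, dataTest);
export { expect } from '@playwright/test';
```

Spec files then import `test` from this one module and can use `authenticatedUser` together with any data fixtures, each still cleaning up its own records.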
### One Assertion Per Test (Atomic Design)

Each test should verify exactly one behavior:

```typescript
// ✅ CORRECT: One assertion
test('should display user name', async ({ page }) => {
  await expect(page.locator('[data-testid="user-name"]')).toHaveText('John');
});

// ❌ WRONG: Multiple assertions (not atomic)
test('should display user info', async ({ page }) => {
  await expect(page.locator('[data-testid="user-name"]')).toHaveText('John');
  await expect(page.locator('[data-testid="user-email"]')).toHaveText('john@example.com');
});
```

**Why?** If the second assertion fails, you don't know whether the first is still valid. Split into separate tests for clear failure diagnosis.

### Implementation Checklist for DEV

Maps each failing test to concrete implementation tasks:

```markdown
## Implementation Checklist

### Test: User Login with Valid Credentials

- [ ] Create `/login` route
- [ ] Implement login form component
- [ ] Add email/password validation
- [ ] Integrate authentication API
- [ ] Add `data-testid` attributes: `email-input`, `password-input`, `login-button`
- [ ] Implement error handling
- [ ] Run test: `npm run test:e2e -- login.spec.ts`
- [ ] ✅ Test passes (green phase)
```

Provides a clear path from red to green for each test.

## Integration with Other Workflows

**Before this workflow:**

- **framework** workflow: Must run first to establish the test framework architecture (Playwright or Cypress config, directory structure, base fixtures)
- **test-design** workflow: Optional but recommended for P0-P3 priority alignment and risk assessment context

**After this workflow:**

- **DEV agent** implements features guided by the failing tests and implementation checklist
- **test-review** workflow: Review generated test quality before sharing with the DEV team
- **automate** workflow: After story completion, expand the regression suite with additional edge case coverage

**Coordinates with:**

- **Story approval process**: ATDD runs after the story is approved but before DEV begins implementation
- **Quality gates**: Failing tests serve as acceptance criteria for story completion (all tests must pass)

## Important Notes

### ATDD is Test-First, Not Test-After

**Critical timing**: Tests must be written BEFORE any implementation code. This ensures:

- Tests define the contract (what needs to be built)
- Implementation is guided by tests (no over-engineering)
- Tests verify behavior, not implementation details
- Confidence in refactoring (tests catch regressions)

### All Tests Must Fail Initially

**Red phase verification is mandatory**:

- Run tests locally after creation to confirm the RED phase
- Failure should be due to missing implementation, not test bugs
- Failure messages should be clear and actionable
- Document expected failure messages in the ATDD checklist

If a test passes before implementation, it's not testing the right thing.

### Use data-testid for Stable Selectors

**Why data-testid?**

- CSS classes change frequently (styling refactors)
- IDs may not be unique or stable
- Text content changes with localization
- data-testid is an explicit contract between tests and UI

```typescript
// ✅ CORRECT: Stable selector
await page.click('[data-testid="login-button"]');

// ❌ FRAGILE: Class-based selector
await page.click('.btn.btn-primary.login-btn');
```

The ATDD checklist includes a complete list of required data-testid attributes for the DEV team.

### No Hard Waits or Sleeps

**Use explicit waits only**:

```typescript
// ✅ CORRECT: Explicit wait for condition
await page.waitForSelector('[data-testid="user-name"]');
await expect(page.locator('[data-testid="user-name"]')).toBeVisible();

// ❌ WRONG: Hard wait (flaky, slow)
await page.waitForTimeout(2000);
```

Playwright's auto-waiting is preferred (`expect()` automatically retries until its timeout).
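
For conditions that are not locators (for example, a value read from the backend), Playwright's retrying assertions can still replace hard waits. A sketch, with the order-status endpoint invented for illustration:

```typescript
import { test, expect } from '@playwright/test';

test('should eventually report the order as processed', async ({ request }) => {
  // Poll a backend value until it matches, instead of sleeping a fixed time
  await expect
    .poll(async () => {
      const res = await request.get('/api/orders/123/status'); // hypothetical endpoint
      return (await res.json()).status;
    }, { timeout: 10_000 })
    .toBe('processed');
});
```

`expect.poll` retries the callback until it returns the expected value or the timeout elapses, giving the same flake resistance as locator auto-waiting.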
### Component Tests for Complex UI Only

**When to use component tests:**

- Complex UI interactions (drag-drop, keyboard navigation)
- Form validation logic
- State management within component
- Visual edge cases

**When NOT to use:**

- Simple rendering (snapshot tests are sufficient)
- Integration with backend (use E2E or API tests)
- Full user journeys (use E2E tests)

Component tests are valuable but should complement, not replace, E2E and API tests.
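
With `@playwright/experimental-ct-react` (the component-testing package this workflow targets), a form-validation check might look like the sketch below. The `LoginForm` component, its import path, and its disabled-until-valid behavior are assumptions for illustration:

```typescript
// tests/component/LoginForm.test.tsx (component and props are illustrative)
import { test, expect } from '@playwright/experimental-ct-react';
import { LoginForm } from '../../src/components/LoginForm';

test('should disable submit until both fields are filled', async ({ mount }) => {
  // GIVEN: the component mounted in isolation (no app, no backend)
  const component = await mount(<LoginForm />);

  // WHEN: only the email field is filled
  await component.locator('[data-testid="email-input"]').fill('user@example.com');

  // THEN: submit stays disabled (pure validation logic, one assertion)
  await expect(component.locator('[data-testid="login-button"]')).toBeDisabled();
});
```

The `mount` fixture renders just the component, which is why this level is fast and granular compared to E2E.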
### Auto-Cleanup is Non-Negotiable

**Every test must clean up its data**:

- Use fixtures with automatic teardown
- Never leave test data in database/storage
- Each test should be isolated (no shared state)

**Cleanup patterns:**

- Fixtures: Cleanup in teardown function
- Factories: Provide deletion helpers
- Tests: Use `test.afterEach()` for manual cleanup if needed

Without auto-cleanup, tests become flaky and depend on execution order.
## Knowledge Base References

This workflow automatically consults:

- **fixture-architecture.md** - Test fixture patterns with setup/teardown and auto-cleanup using Playwright's test.extend()
- **data-factories.md** - Factory patterns using @faker-js/faker for random test data generation with overrides support
- **component-tdd.md** - Component test strategies using Playwright Component Testing (@playwright/experimental-ct-react)
- **network-first.md** - Route interception patterns (intercept before navigation to prevent race conditions)
- **test-quality.md** - Test design principles (Given-When-Then, one assertion per test, determinism, isolation)
- **test-levels-framework.md** - Test level selection framework (E2E vs API vs Component vs Unit)

See `tea-index.csv` for the complete knowledge fragment mapping and additional references.

## Example Output

After running this workflow, the ATDD checklist will contain:

````markdown
# ATDD Checklist - Epic 3, Story 5: User Authentication

## Story Summary

As a user, I want to log in with email and password so that I can access my personalized dashboard.

## Acceptance Criteria

1. User can log in with valid credentials
2. User sees error message with invalid credentials
3. User is redirected to dashboard after successful login

## Failing Tests Created (RED Phase)

### E2E Tests (3 tests)

- `tests/e2e/user-authentication.spec.ts` (87 lines)
  - ✅ should log in with valid credentials (RED - missing /login route)
  - ✅ should display error for invalid credentials (RED - error message not implemented)
  - ✅ should redirect to dashboard after login (RED - redirect logic missing)

### API Tests (2 tests)

- `tests/api/auth.api.spec.ts` (54 lines)
  - ✅ POST /api/auth/login - should return token for valid credentials (RED - endpoint not implemented)
  - ✅ POST /api/auth/login - should return 401 for invalid credentials (RED - validation missing)

## Data Factories Created

- `tests/support/factories/user.factory.ts` - createUser(), createUsers(count)

## Fixtures Created

- `tests/support/fixtures/auth.fixture.ts` - authenticatedUser fixture with auto-cleanup

## Required data-testid Attributes

### Login Page

- `email-input` - Email input field
- `password-input` - Password input field
- `login-button` - Submit button
- `error-message` - Error message container

### Dashboard Page

- `user-name` - User name display
- `logout-button` - Logout button

## Implementation Checklist

### Test: User Login with Valid Credentials

- [ ] Create `/login` route
- [ ] Implement login form component
- [ ] Add email/password validation
- [ ] Integrate authentication API
- [ ] Add data-testid attributes: `email-input`, `password-input`, `login-button`
- [ ] Run test: `npm run test:e2e -- user-authentication.spec.ts`
- [ ] ✅ Test passes (green phase)

### Test: Display Error for Invalid Credentials

- [ ] Add error state management
- [ ] Display error message UI
- [ ] Add `data-testid="error-message"`
- [ ] Run test: `npm run test:e2e -- user-authentication.spec.ts`
- [ ] ✅ Test passes (green phase)

### Test: Redirect to Dashboard After Login

- [ ] Implement redirect logic after successful auth
- [ ] Verify authentication token stored
- [ ] Add dashboard route protection
- [ ] Run test: `npm run test:e2e -- user-authentication.spec.ts`
- [ ] ✅ Test passes (green phase)

## Running Tests

```bash
# Run all failing tests
npm run test:e2e

# Run specific test file
npm run test:e2e -- user-authentication.spec.ts

# Run tests in headed mode (see browser)
npm run test:e2e -- --headed

# Debug specific test
npm run test:e2e -- user-authentication.spec.ts --debug
```
````

## Red-Green-Refactor Workflow
|
||||
|
||||
**RED Phase** (Complete):
|
||||
|
||||
- ✅ All tests written and failing
|
||||
- ✅ Fixtures and factories created
|
||||
- ✅ data-testid requirements documented
|
||||
|
||||
**GREEN Phase** (DEV Team - Next Steps):
|
||||
|
||||
1. Pick one failing test from checklist
|
||||
2. Implement minimal code to make it pass
|
||||
3. Run test to verify green
|
||||
4. Check off task in checklist
|
||||
5. Move to next test
|
||||
6. Repeat until all tests pass
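
As an illustration of step 2, the "minimal code" for the invalid-credentials test can be as small as an error-state reducer. This is a hedged sketch: the state shape and action names are hypothetical, not part of the generated checklist.

```typescript
// Hypothetical minimal error-state handling for the invalid-credentials test.
type AuthState = { status: 'idle' | 'pending' | 'error'; errorMessage: string | null };

type AuthAction =
  | { type: 'LOGIN_START' }
  | { type: 'LOGIN_FAILURE'; message: string }
  | { type: 'CLEAR_ERROR' };

function authReducer(state: AuthState, action: AuthAction): AuthState {
  switch (action.type) {
    case 'LOGIN_START':
      return { status: 'pending', errorMessage: null };
    case 'LOGIN_FAILURE':
      // This message is what the data-testid="error-message" container renders.
      return { status: 'error', errorMessage: action.message };
    case 'CLEAR_ERROR':
      return { ...state, errorMessage: null };
    default:
      return state;
  }
}
```

Just enough state for the failing test to assert on the rendered error; everything else waits for its own failing test.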

**REFACTOR Phase** (DEV Team - After All Tests Pass):

1. All tests passing (green)
2. Improve code quality (extract functions, optimize)
3. Remove duplications
4. Ensure tests still pass after each refactor

## Next Steps

1. Review this checklist with team
2. Run failing tests to confirm RED phase: `npm run test:e2e`
3. Begin implementation using checklist as guide
4. Share progress in daily standup
5. When all tests pass, run `bmad sm story-done` to move story to DONE

This comprehensive checklist guides the DEV team from red to green with clear tasks and validation steps.
# Automate Workflow

Expands test automation coverage by generating comprehensive test suites at the appropriate levels (E2E, API, Component, Unit) with supporting infrastructure. This workflow operates in **dual mode** - it works seamlessly WITH or WITHOUT BMad artifacts.

**Core Principle**: Generate prioritized, deterministic tests that avoid duplicate coverage and follow testing best practices.

## Usage

```bash
bmad tea *automate
```

The TEA agent runs this workflow when:

- **BMad-Integrated**: After story implementation, to expand coverage beyond ATDD tests
- **Standalone**: Point at any codebase/feature and generate tests independently ("work out of thin air")
- **Auto-discover**: No targets specified - scans the codebase for features needing tests

## Inputs

**Execution Modes:**

1. **BMad-Integrated Mode** (story available) - OPTIONAL
2. **Standalone Mode** (no BMad artifacts) - Direct code analysis
3. **Auto-discover Mode** (no targets) - Scan for coverage gaps

**Required Context Files:**

- **Framework configuration**: Test framework config (playwright.config.ts or cypress.config.ts) - REQUIRED

**Optional Context (BMad-Integrated Mode):**

- **Story markdown** (`{story_file}`): User story with acceptance criteria (enhances coverage targeting but NOT required)
- **Tech spec**: Technical specification (provides architectural context)
- **Test design**: Risk/priority context (P0-P3 alignment)
- **PRD**: Product requirements (business context)

**Optional Context (Standalone Mode):**

- **Source code**: Feature implementation to analyze
- **Existing tests**: Current test suite for gap analysis

**Workflow Variables:**

- `standalone_mode`: Can work without BMad artifacts (default: true)
- `story_file`: Path to story markdown (optional)
- `target_feature`: Feature name or directory to analyze (e.g., "user-authentication" or "src/auth/")
- `target_files`: Specific files to analyze (comma-separated paths)
- `test_dir`: Directory for test files (default: `{project-root}/tests`)
- `source_dir`: Source code directory (default: `{project-root}/src`)
- `auto_discover_features`: Automatically find features needing tests (default: true)
- `analyze_coverage`: Check existing test coverage gaps (default: true)
- `coverage_target`: Coverage strategy - "critical-paths", "comprehensive", "selective" (default: "critical-paths")
- `test_levels`: Which levels to generate - "e2e,api,component,unit" (default: all)
- `avoid_duplicate_coverage`: Don't test the same behavior at multiple levels (default: true)
- `include_p0`: Include P0 critical path tests (default: true)
- `include_p1`: Include P1 high priority tests (default: true)
- `include_p2`: Include P2 medium priority tests (default: true)
- `include_p3`: Include P3 low priority tests (default: false)
- `use_given_when_then`: BDD-style test structure (default: true)
- `one_assertion_per_test`: Atomic test design (default: true)
- `network_first`: Route interception before navigation (default: true)
- `deterministic_waits`: No hard waits or sleeps (default: true)
- `generate_fixtures`: Create/enhance fixture architecture (default: true)
- `generate_factories`: Create/enhance data factories (default: true)
- `update_helpers`: Add utility functions (default: true)
- `use_test_design`: Load test-design.md if it exists (default: true)
- `use_tech_spec`: Load tech-spec.md if it exists (default: true)
- `use_prd`: Load PRD.md if it exists (default: true)
- `update_readme`: Update test README with new specs (default: true)
- `update_package_scripts`: Add test execution scripts (default: true)
- `output_summary`: Path for automation summary (default: `{output_folder}/automation-summary.md`)
- `max_test_duration`: Maximum seconds per test (default: 90)
- `max_file_lines`: Maximum lines per test file (default: 300)
- `require_self_cleaning`: All tests must clean up data (default: true)
- `auto_load_knowledge`: Load relevant knowledge fragments (default: true)
- `run_tests_after_generation`: Verify tests pass/fail as expected (default: true)
- `auto_validate`: Run generated tests after creation (default: true) **NEW**
- `auto_heal_failures`: Enable automatic healing (default: false, opt-in) **NEW**
- `max_healing_iterations`: Maximum healing attempts per test (default: 3) **NEW**
- `fail_on_unhealable`: Fail workflow if tests can't be healed (default: false) **NEW**
- `mark_unhealable_as_fixme`: Mark unfixable tests with test.fixme() (default: true) **NEW**
- `use_mcp_healing`: Use Playwright MCP if available (default: true) **NEW**
- `healing_knowledge_fragments`: Healing patterns to load (default: "test-healing-patterns,selector-resilience,timing-debugging") **NEW**
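
For orientation, here is a small subset of these variables expressed as a typed options object. The interface is purely illustrative (the workflow reads its variables from workflow configuration, not from TypeScript), but it shows how per-run overrides merge over defaults.

```typescript
// Hypothetical typed view of a few of the workflow variables above (illustration only).
interface AutomateOptions {
  standaloneMode: boolean;
  coverageTarget: 'critical-paths' | 'comprehensive' | 'selective';
  testLevels: Array<'e2e' | 'api' | 'component' | 'unit'>;
  autoHealFailures: boolean;
  maxHealingIterations: number;
}

const defaults: AutomateOptions = {
  standaloneMode: true,
  coverageTarget: 'critical-paths',
  testLevels: ['e2e', 'api', 'component', 'unit'],
  autoHealFailures: false, // healing is opt-in
  maxHealingIterations: 3,
};

// A run that opts into healing and narrows the generated levels.
const run: AutomateOptions = { ...defaults, autoHealFailures: true, testLevels: ['e2e', 'api'] };
```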

## Outputs

**Primary Deliverable:**

- **Automation Summary** (`automation-summary.md`): Comprehensive report containing:
  - Execution mode (BMad-Integrated, Standalone, Auto-discover)
  - Feature analysis (source files analyzed, coverage gaps)
  - Tests created (E2E, API, Component, Unit) with counts and paths
  - Infrastructure created (fixtures, factories, helpers)
  - Test execution instructions
  - Coverage analysis (P0-P3 breakdown, coverage percentage)
  - Definition of Done checklist
  - Next steps and recommendations

**Test Files Created:**

- **E2E tests** (`tests/e2e/{feature-name}.spec.ts`): Critical user journeys (P0-P1)
- **API tests** (`tests/api/{feature-name}.api.spec.ts`): Business logic and contracts (P1-P2)
- **Component tests** (`tests/component/{ComponentName}.test.tsx`): UI behavior (P1-P2)
- **Unit tests** (`tests/unit/{module-name}.test.ts`): Pure logic (P2-P3)

**Supporting Infrastructure:**

- **Fixtures** (`tests/support/fixtures/{feature}.fixture.ts`): Setup/teardown with auto-cleanup
- **Data factories** (`tests/support/factories/{entity}.factory.ts`): Random test data using faker
- **Helpers** (`tests/support/helpers/{utility}.ts`): Utility functions (waitFor, retry, etc.)

**Documentation Updates:**

- **Test README** (`tests/README.md`): Test suite overview, execution instructions, priority tagging, patterns
- **package.json scripts**: Test execution commands (test:e2e, test:e2e:p0, test:api, etc.)

**Validation Safeguards:**

- All tests follow Given-When-Then format
- All tests have priority tags ([P0], [P1], [P2], [P3])
- All tests use data-testid selectors (stable, not CSS classes)
- All tests are self-cleaning (fixtures with auto-cleanup)
- No hard waits or flaky patterns (deterministic)
- Test files under 300 lines (lean and focused)
- Tests run under 1.5 minutes each (fast feedback)
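
Several of these safeguards are mechanical enough to lint for. A minimal sketch of such a check follows; the pattern list is a partial assumption for illustration, not the workflow's actual validator.

```typescript
// Illustrative static check for two of the safeguards above.
const FORBIDDEN: Array<{ name: string; pattern: RegExp }> = [
  { name: 'hard wait', pattern: /waitForTimeout\(/ },
  { name: 'conditional flow', pattern: /if\s*\(\s*await\s+\w+\.isVisible\(\)/ },
];

// Returns the names of forbidden patterns found in a test source string.
function findForbiddenPatterns(source: string): string[] {
  return FORBIDDEN.filter(({ pattern }) => pattern.test(source)).map(({ name }) => name);
}
```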

## Key Features

### Dual-Mode Operation

**BMad-Integrated Mode** (story available):

- Uses story acceptance criteria for coverage targeting
- Aligns with test-design risk/priority assessment
- Expands ATDD tests with edge cases and negative paths
- Optional - the story enhances coverage but is not required

**Standalone Mode** (no story):

- Analyzes source code independently
- Identifies coverage gaps automatically
- Generates tests based on code analysis
- Works with any project (BMad or non-BMad)

**Auto-discover Mode** (no targets):

- Scans codebase for features needing tests
- Prioritizes features with no coverage
- Generates comprehensive test plan

### Avoid Duplicate Coverage

**Critical principle**: Don't test the same behavior at multiple levels.

**Good coverage strategy:**

- **E2E**: User can login → Dashboard loads (critical happy path only)
- **API**: POST /auth/login returns correct status codes (variations: 200, 401, 400)
- **Component**: LoginForm validates input (UI edge cases: empty fields, invalid format)
- **Unit**: validateEmail() logic (pure function edge cases)

**Bad coverage (duplicate):**

- E2E: User can login → Dashboard loads
- E2E: User can login with different emails → Dashboard loads (unnecessary duplication)
- API: POST /auth/login returns 200 (happy path already covered by E2E)

Use E2E sparingly for critical paths. Use API/Component/Unit for variations and edge cases.
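
One way to keep this rule checkable is to record which level owns each behavior and flag collisions. A sketch, assuming a simple plan structure (the shape is hypothetical):

```typescript
// Illustrative duplicate-coverage check: each behavior should be owned by one level.
type Level = 'e2e' | 'api' | 'component' | 'unit';

function findDuplicateCoverage(plan: Array<{ behavior: string; level: Level }>): string[] {
  const owner = new Map<string, Level>();
  const duplicates: string[] = [];
  for (const entry of plan) {
    const existing = owner.get(entry.behavior);
    if (existing !== undefined) {
      // Same behavior claimed twice, whether at the same or a different level.
      duplicates.push(`${entry.behavior} (${existing} + ${entry.level})`);
    }
    owner.set(entry.behavior, entry.level);
  }
  return duplicates;
}
```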

### Healing Capabilities (NEW - Phase 2.5)

**automate** automatically validates generated tests and, when healing is enabled, heals failures after generation.

**Configuration**: Controlled by `config.tea_use_mcp_enhancements` (default: true)

- If true + MCP available → MCP-assisted healing
- If true + MCP unavailable → Pattern-based healing
- If false → No healing; failures are documented for manual review

**Constants**: Max 3 healing attempts; unfixable tests are marked as `test.fixme()`

**How Healing Works (Default - Pattern-Based):**

TEA heals tests using pattern-based analysis by:

1. **Parsing error messages** from test output logs
2. **Matching patterns** against known failure signatures
3. **Applying fixes** from healing knowledge fragments:
   - `test-healing-patterns.md` - Common failure patterns (selectors, timing, data, network)
   - `selector-resilience.md` - Selector refactoring (CSS → data-testid, nth() → filter())
   - `timing-debugging.md` - Race condition fixes (hard waits → event-based waits)
4. **Re-running tests** to verify the fix (max 3 iterations)
5. **Marking unfixable tests** as `test.fixme()` with detailed comments
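
Steps 1-3 amount to a table of failure signatures mapped to candidate fixes. A minimal sketch; the signatures shown are examples in the spirit of the knowledge fragments, not the full set.

```typescript
// Illustrative failure-signature table for pattern-based healing.
const HEALING_PATTERNS: Array<{ match: RegExp; fix: string }> = [
  { match: /resolved to 0 elements/, fix: 'replace CSS/class selector with data-testid' },
  { match: /Timeout.*exceeded/, fix: 'replace hard wait with event-based wait' },
  { match: /strict mode violation/, fix: 'narrow the locator with filter() instead of nth()' },
];

// Returns a candidate fix, or null to escalate (manual review / test.fixme()).
function suggestFix(errorMessage: string): string | null {
  const hit = HEALING_PATTERNS.find(({ match }) => match.test(errorMessage));
  return hit ? hit.fix : null;
}
```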

**This works well for:**

- ✅ Common failure patterns (stale selectors, timing issues, dynamic data)
- ✅ Text-based errors with clear signatures
- ✅ Issues documented in the knowledge base
- ✅ Automated CI environments without browser access

**What MCP Adds (Interactive Debugging Enhancement):**

When Playwright MCP is available, TEA **additionally**:

1. **Debugs failures interactively** before applying pattern-based fixes:
   - **Pause test execution** with `playwright_test_debug_test` (step through, inspect state)
   - **See visual failure context** with `browser_snapshot` (screenshot of failure state)
   - **Inspect live DOM** with browser tools (find why a selector doesn't match)
   - **Analyze console logs** with `browser_console_messages` (JS errors, warnings, debug output)
   - **Inspect network activity** with `browser_network_requests` (failed API calls, CORS errors, timeouts)

2. **Enhances pattern-based fixes** with real-world data:
   - **Pattern match identifies the issue** (e.g., "stale selector")
   - **MCP discovers the actual selector** with `browser_generate_locator` from the live page
   - **TEA applies a refined fix** using the real DOM structure (not just a pattern guess)
   - **Verification happens in the browser** (see if the fix works visually)

3. **Catches root causes** pattern matching might miss:
   - **Network failures**: MCP shows a 500 error on an API call (not just a timeout)
   - **JS errors**: MCP shows `TypeError: undefined` in the console (not just "element not found")
   - **Timing issues**: MCP shows a loading spinner still visible (not just "selector timeout")
   - **State problems**: MCP shows a modal blocking the button (not just "not clickable")

**Key Benefits of MCP Enhancement:**

- ✅ **Pattern-based fixes** (fast, automated) **+** **MCP verification** (accurate, context-aware)
- ✅ **Visual debugging**: See exactly what the user sees when a test fails
- ✅ **DOM inspection**: Discover why selectors don't match (element missing, wrong attributes, dynamic IDs)
- ✅ **Network visibility**: Identify API failures, slow requests, CORS issues
- ✅ **Console analysis**: Catch JS errors that break page functionality
- ✅ **Robust selectors**: Generate locators from the actual DOM (role, text, testid hierarchy)
- ✅ **Faster iteration**: Debug and fix in the same browser session (no restart needed)
- ✅ **Higher success rate**: MCP helps diagnose failures pattern matching can't solve

**Example Enhancement Flow:**

```
1. Pattern-based healing identifies issue
   → Error: "Locator '.submit-btn' resolved to 0 elements"
   → Pattern match: Stale selector (CSS class)
   → Suggested fix: Replace with data-testid

2. MCP enhances diagnosis (if available)
   → browser_snapshot shows button exists but has class ".submit-button" (not ".submit-btn")
   → browser_generate_locator finds: button[type="submit"].submit-button
   → browser_console_messages shows no errors

3. TEA applies refined fix
   → await page.locator('button[type="submit"]').click()
   → (More accurate than pattern-based guess)
```

**Healing Modes:**

1. **MCP-Enhanced Healing** (when Playwright MCP available):
   - Pattern-based analysis **+** Interactive debugging
   - Visual context with `browser_snapshot`
   - Console log analysis with `browser_console_messages`
   - Network inspection with `browser_network_requests`
   - Live DOM inspection with `browser_generate_locator`
   - Step-by-step debugging with `playwright_test_debug_test`

2. **Pattern-Based Healing** (always available):
   - Error message parsing and pattern matching
   - Automated fixes from healing knowledge fragments
   - Text-based analysis (no visual/DOM inspection)
   - Works in CI without browser access

**Healing Workflow:**

```
1. Generate tests → Run tests
2. IF pass → Success ✅
3. IF fail AND auto_heal_failures=false → Report failures ⚠️
4. IF fail AND auto_heal_failures=true → Enter healing loop:
   a. Identify failure pattern (selector, timing, data, network)
   b. Apply automated fix from knowledge base
   c. Re-run test (max 3 iterations)
   d. IF healed → Success ✅
   e. IF unhealable → Mark test.fixme() with detailed comment
```
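
The loop above, including the three-iteration cap, reduces to a few lines. In this sketch `runTest` and `applyFix` are stand-ins for the real steps, not workflow APIs:

```typescript
// Illustrative healing loop with the max-3-iterations cap.
type Outcome = 'passed' | 'healed' | 'fixme';

function healingLoop(
  runTest: () => boolean, // stand-in: executes the test, true = green
  applyFix: () => void,   // stand-in: applies one pattern-based fix
  maxIterations = 3,
): Outcome {
  if (runTest()) return 'passed';
  for (let i = 0; i < maxIterations; i++) {
    applyFix();
    if (runTest()) return 'healed';
  }
  return 'fixme'; // unfixable: mark test.fixme() with a detailed comment
}
```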

**Example Healing Outcomes:**

```typescript
// ❌ Original (failing): CSS class selector
await page.locator('.btn-primary').click();

// ✅ Healed: data-testid selector
await page.getByTestId('submit-button').click();

// ❌ Original (failing): Hard wait
await page.waitForTimeout(3000);

// ✅ Healed: Network-first pattern
await page.waitForResponse('**/api/data');

// ❌ Original (failing): Hardcoded ID
await expect(page.getByText('User 123')).toBeVisible();

// ✅ Healed: Regex pattern
await expect(page.getByText(/User \d+/)).toBeVisible();
```

**Unfixable Tests (Marked as test.fixme()):**

```typescript
test.fixme('[P1] should handle complex interaction', async ({ page }) => {
  // FIXME: Test healing failed after 3 attempts
  // Failure: "Locator 'button[data-action="submit"]' resolved to 0 elements"
  // Attempted fixes:
  // 1. Replaced with page.getByTestId('submit-button') - still failing
  // 2. Replaced with page.getByRole('button', { name: 'Submit' }) - still failing
  // 3. Added waitForLoadState('networkidle') - still failing
  // Manual investigation needed: Selector may require application code changes
  // TODO: Review with team, may need data-testid added to button component
  // Original test code...
});
```

**When to Enable Healing:**

- ✅ Enable for greenfield projects (catch generated test issues early)
- ✅ Enable for brownfield projects (auto-fix legacy selector patterns)
- ❌ Disable if the environment is not ready (application not deployed/seeded)
- ❌ Disable if you prefer to review all generated tests manually

**Healing Report Example:**

```markdown
## Test Healing Report

**Auto-Heal Enabled**: true
**Healing Mode**: Pattern-based
**Iterations Allowed**: 3

### Validation Results

- **Total tests**: 10
- **Passing**: 7
- **Failing**: 3

### Healing Outcomes

**Successfully Healed (2 tests):**

- `tests/e2e/login.spec.ts:15` - Stale selector (CSS class → data-testid)
- `tests/e2e/checkout.spec.ts:42` - Race condition (added network-first interception)

**Unable to Heal (1 test):**

- `tests/e2e/complex-flow.spec.ts:67` - Marked as test.fixme()
  - Requires application code changes (add data-testid to component)

### Healing Patterns Applied

- **Selector fixes**: 1
- **Timing fixes**: 1
```

**Graceful Degradation:**

- Healing is OPTIONAL (default: disabled)
- Works without Playwright MCP (pattern-based fallback)
- Unfixable tests marked clearly (not silently broken)
- Manual investigation path documented

### Recording Mode (NEW - Phase 2.5)

**automate** can record complex UI interactions instead of generating them with AI.

**Activation**: Automatic for complex UI scenarios when `config.tea_use_mcp_enhancements` is true and MCP is available

- Complex scenarios: drag-drop, wizards, multi-page flows
- Fallback: AI generation (silent, automatic)

**When to Use Recording Mode:**

- ✅ Complex UI interactions (drag-drop, multi-step forms, wizards)
- ✅ Visual workflows (modals, dialogs, animations, transitions)
- ✅ Unclear requirements (exploratory, discovering behavior)
- ✅ Multi-page flows (checkout, registration, onboarding)
- ❌ NOT for simple CRUD (AI generation is faster)
- ❌ NOT for API-only tests (no UI to record)

**When to Use AI Generation (Default):**

- ✅ Clear requirements available
- ✅ Standard patterns (login, CRUD, navigation)
- ✅ Need many tests quickly
- ✅ API/backend tests (no UI interaction)

**Recording Workflow (Same as atdd):**

```
1. Set generation_mode: "recording"
2. Use generator_setup_page to init recording
3. For each test scenario:
   - Execute with browser_* tools (navigate, click, type, select)
   - Add verifications with browser_verify_* tools
   - Capture log and generate test file
4. Enhance with knowledge base patterns:
   - Given-When-Then structure
   - data-testid selectors
   - Network-first interception
   - Fixtures/factories
5. Validate (run tests if auto_validate enabled)
6. Heal if needed (if auto_heal_failures enabled)
```

**Combination: Recording + Healing:**

automate can use BOTH recording and healing together:

- Generate tests via recording (complex flows captured interactively)
- Run tests to validate (auto_validate)
- Heal failures automatically (auto_heal_failures)

This is particularly powerful for brownfield projects where:

- Requirements are unclear → use recording to capture existing behavior
- The application is complex → recording captures nuances AI might miss
- Tests may fail → healing fixes common issues automatically

**Graceful Degradation:**

- Recording mode is OPTIONAL (default: AI generation)
- Requires Playwright MCP (falls back to AI if unavailable)
- Works with or without healing enabled
- Same quality output regardless of generation method

### Test Level Selection Framework

**E2E (End-to-End)**:

- Critical user journeys (login, checkout, core workflows)
- Multi-system integration
- User-facing acceptance criteria
- Characteristics: High confidence, slow execution, brittle

**API (Integration)**:

- Business logic validation
- Service contracts and data transformations
- Backend integration without UI
- Characteristics: Fast feedback, good balance, stable

**Component**:

- UI component behavior (buttons, forms, modals)
- Interaction testing (click, hover, keyboard navigation)
- State management within component
- Characteristics: Fast, isolated, granular

**Unit**:

- Pure business logic and algorithms
- Edge cases and error handling
- Minimal dependencies
- Characteristics: Fastest, most granular
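
Read as a decision function, the framework above might look like this; the trait names are illustrative, not workflow inputs:

```typescript
// Illustrative test-level chooser based on what a scenario exercises.
type TestLevel = 'e2e' | 'api' | 'component' | 'unit';

function chooseLevel(traits: {
  isPureLogic: boolean;    // pure function or algorithm?
  crossesNetwork: boolean; // exercises a service contract?
  crossesUI: boolean;      // needs a real browser journey?
}): TestLevel {
  if (traits.isPureLogic) return 'unit';
  if (traits.crossesUI && traits.crossesNetwork) return 'e2e';
  if (traits.crossesNetwork) return 'api';
  return 'component';
}
```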

### Priority Classification (P0-P3)

**P0 (Critical - Every commit)**:

- Critical user paths that must always work
- Security-critical functionality (auth, permissions)
- Data integrity scenarios
- Run in pre-commit hooks or PR checks

**P1 (High - PR to main)**:

- Important features with high user impact
- Integration points between systems
- Error handling for common failures
- Run before merging to main branch

**P2 (Medium - Nightly)**:

- Edge cases with moderate impact
- Less-critical feature variations
- Performance/load testing
- Run in nightly CI builds

**P3 (Low - On-demand)**:

- Nice-to-have validations
- Rarely-used features
- Exploratory testing scenarios
- Run manually or weekly

**Priority tagging enables selective execution:**

```bash
npm run test:e2e:p0 # Run only P0 tests (critical paths)
npm run test:e2e:p1 # Run P0 + P1 tests (pre-merge)
```
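
Those scripts work because priorities are plain tags in test titles, so selection is effectively a grep. A sketch of the filter they imply:

```typescript
// Illustrative priority filter over test titles like "[P0] should login ...".
function selectByPriority(titles: string[], maxPriority: number): string[] {
  return titles.filter((title) => {
    const tag = title.match(/\[P(\d)\]/);
    return tag !== null && Number(tag[1]) <= maxPriority;
  });
}
```

In practice the same effect usually comes from the runner's grep option (e.g. Playwright's `--grep "\[P0\]"`); the function just makes the selection rule explicit.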

### Given-When-Then Test Structure

All tests follow BDD format for clarity:

```typescript
test('[P0] should login with valid credentials and load dashboard', async ({ page }) => {
  // GIVEN: User is on login page
  await page.goto('/login');

  // WHEN: User submits valid credentials
  await page.fill('[data-testid="email-input"]', 'user@example.com');
  await page.fill('[data-testid="password-input"]', 'Password123!');
  await page.click('[data-testid="login-button"]');

  // THEN: User is redirected to dashboard
  await expect(page).toHaveURL('/dashboard');
  await expect(page.locator('[data-testid="user-name"]')).toBeVisible();
});
```

### One Assertion Per Test (Atomic Design)

Each test verifies exactly one behavior:

```typescript
// ✅ CORRECT: One assertion
test('[P0] should display user name', async ({ page }) => {
  await expect(page.locator('[data-testid="user-name"]')).toHaveText('John');
});

// ❌ WRONG: Multiple assertions (not atomic)
test('[P0] should display user info', async ({ page }) => {
  await expect(page.locator('[data-testid="user-name"]')).toHaveText('John');
  await expect(page.locator('[data-testid="user-email"]')).toHaveText('john@example.com');
});
```

**Why?** If the first assertion fails, the test stops and you never learn whether the later assertions would have passed. Split them into separate tests for clear failure diagnosis.

### Network-First Testing Pattern

**Critical pattern to prevent race conditions**:

```typescript
test('should load user dashboard after login', async ({ page }) => {
  // CRITICAL: Intercept routes BEFORE navigation
  await page.route('**/api/user', (route) =>
    route.fulfill({
      status: 200,
      body: JSON.stringify({ id: 1, name: 'Test User' }),
    }),
  );

  // NOW navigate
  await page.goto('/dashboard');

  await expect(page.locator('[data-testid="user-name"]')).toHaveText('Test User');
});
```

Always set up route interception before navigating to pages that make network requests.

### Fixture Architecture with Auto-Cleanup

Playwright fixtures with automatic data cleanup:

```typescript
// tests/support/fixtures/auth.fixture.ts
import { test as base } from '@playwright/test';
import { createUser, deleteUser } from '../factories/user.factory';

export const test = base.extend({
  authenticatedUser: async ({ page }, use) => {
    // Setup: Create and authenticate user
    const user = await createUser();
    await page.goto('/login');
    await page.fill('[data-testid="email"]', user.email);
    await page.fill('[data-testid="password"]', user.password);
    await page.click('[data-testid="login-button"]');
    await page.waitForURL('/dashboard');

    // Provide to test
    await use(user);

    // Cleanup: Delete user automatically
    await deleteUser(user.id);
  },
});
```

**Fixture principles:**

- Auto-cleanup (always delete created data in teardown)
- Composable (fixtures can use other fixtures)
- Isolated (each test gets fresh data)
- Type-safe with TypeScript

### Data Factory Architecture

Use faker for all test data generation:

```typescript
// tests/support/factories/user.factory.ts
import { faker } from '@faker-js/faker';

export const createUser = (overrides = {}) => ({
  id: faker.number.int(),
  email: faker.internet.email(),
  password: faker.internet.password(),
  name: faker.person.fullName(),
  role: 'user',
  createdAt: faker.date.recent().toISOString(),
  ...overrides,
});

export const createUsers = (count: number) => Array.from({ length: count }, () => createUser());

// API helper for cleanup
export const deleteUser = async (userId: number) => {
  await fetch(`/api/users/${userId}`, { method: 'DELETE' });
};
```

**Factory principles:**

- Use faker for random data (no hardcoded values to prevent collisions)
- Support overrides for specific test scenarios
- Generate complete valid objects matching API contracts
- Include helper functions for bulk creation and cleanup
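
The override principle is independent of faker. A minimal self-contained illustration of the same spread pattern (the entity and its fields are hypothetical):

```typescript
// Minimal illustration of the override pattern without faker.
let seq = 0;
const createWidget = (overrides: Partial<{ id: number; label: string; active: boolean }> = {}) => ({
  id: ++seq, // unique per call, prevents collisions between tests
  label: `widget-${Math.random().toString(36).slice(2, 8)}`,
  active: true,
  ...overrides, // test-specific values win over generated defaults
});
```

A test that needs a specific value passes only that value, e.g. `createWidget({ active: false })`, and everything else stays randomized.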

### No Page Objects

**Do NOT create page object classes.** Keep tests simple and direct:

```typescript
// ✅ CORRECT: Direct test
test('should login', async ({ page }) => {
  await page.goto('/login');
  await page.fill('[data-testid="email"]', 'user@example.com');
  await page.click('[data-testid="login-button"]');
  await expect(page).toHaveURL('/dashboard');
});

// ❌ WRONG: Page object abstraction
class LoginPage {
  async login(email, password) { ... }
}
```

Use fixtures for setup/teardown, not page objects for actions.

### Deterministic Tests Only

**No flaky patterns allowed:**

```typescript
// ❌ WRONG: Hard wait
await page.waitForTimeout(2000);

// ✅ CORRECT: Explicit wait
await page.waitForSelector('[data-testid="user-name"]');
await expect(page.locator('[data-testid="user-name"]')).toBeVisible();

// ❌ WRONG: Conditional flow
if (await element.isVisible()) {
  await element.click();
}

// ✅ CORRECT: Deterministic assertion
await expect(element).toBeVisible();
await element.click();

// ❌ WRONG: Try-catch for test logic
try {
  await element.click();
} catch (e) {
  // Test shouldn't catch errors
}

// ✅ CORRECT: Let test fail if element not found
await element.click();
```

## Integration with Other Workflows

**Before this workflow:**

- **framework** workflow: Establish test framework architecture (Playwright/Cypress config, directory structure) - REQUIRED
- **test-design** workflow: Optional for P0-P3 priority alignment and risk assessment context (BMad-Integrated mode only)
- **atdd** workflow: Optional - automate expands beyond ATDD tests with edge cases (BMad-Integrated mode only)

**After this workflow:**

- **trace** workflow: Update traceability matrix with new test coverage (Phase 1) and make quality gate decision (Phase 2)
- **CI pipeline**: Run tests in a burn-in loop to detect flaky patterns

**Coordinates with:**

- **DEV agent**: Tests validate implementation correctness
- **Story workflow**: Tests cover acceptance criteria (BMad-Integrated mode only)

## Important Notes

### Works Out of Thin Air

**automate does NOT require BMad artifacts:**

- Can analyze any codebase independently
- User can point TEA at a feature: "automate tests for src/auth/"
- Works on non-BMad projects
- BMad artifacts (story, tech-spec, PRD) are OPTIONAL enhancements, not requirements

**Similar to:**

- **framework**: Can scaffold tests on any project
- **ci**: Can generate CI config without BMad context

**Different from:**

- **atdd**: REQUIRES a story with acceptance criteria (halts if missing)
- **test-design**: REQUIRES PRD/epic context (halts if missing)
- **trace (Phase 2)**: REQUIRES test results for gate decision (halts if missing)

### File Size Limits

**Keep test files lean (under 300 lines):**

- If a file exceeds the limit, split it into multiple files by feature area
- Group related tests in describe blocks
- Extract common setup to fixtures
|
||||
|
||||
### Quality Standards Enforced

**Every test must:**

- ✅ Use Given-When-Then format
- ✅ Have clear, descriptive name with priority tag
- ✅ One assertion per test (atomic)
- ✅ No hard waits or sleeps
- ✅ Use data-testid selectors (not CSS classes)
- ✅ Self-cleaning (fixtures with auto-cleanup)
- ✅ Deterministic (no flaky patterns)
- ✅ Fast (under 90 seconds)

**Forbidden patterns:**

- ❌ Hard waits: `await page.waitForTimeout(2000)`
- ❌ Conditional flow: `if (await element.isVisible()) { ... }`
- ❌ Try-catch for test logic
- ❌ Hardcoded test data (use factories with faker)
- ❌ Page objects
- ❌ Shared state between tests
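
The hard-wait and conditional-flow bans share one underlying rule: wait for an observable state, and fail loudly when it never arrives. A minimal framework-free sketch of that idea follows; `pollUntil` is a hypothetical helper written for illustration, not part of Playwright or Cypress.

```typescript
// Hypothetical helper: poll a condition until it holds or a deadline passes.
// Unlike a hard wait, it resolves as soon as the condition is true, and it
// throws (no silent conditional flow) when the condition never becomes true.
async function pollUntil(
  condition: () => boolean | Promise<boolean>,
  timeoutMs = 5000,
  intervalMs = 50,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (await condition()) return;
    await new Promise((r) => setTimeout(r, intervalMs));
  }
  throw new Error(`Condition not met within ${timeoutMs}ms`);
}

// Simulated async UI update: the "element" becomes ready after 20ms.
async function demo(): Promise<string> {
  let status = "pending";
  setTimeout(() => {
    status = "ready";
  }, 20);
  // ❌ Forbidden equivalent: sleep 2000ms and hope the state changed.
  // ✅ Instead: wait deterministically for exactly the state the test needs.
  await pollUntil(() => status === "ready", 1000);
  return status;
}
```

In real suites, framework-native waits (web-first assertions, auto-waiting locators) play this role; the sketch only shows why a bounded, failing wait is safe where a sleep is not.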

## Knowledge Base References

This workflow automatically consults:

- **test-levels-framework.md** - Test level selection (E2E vs API vs Component vs Unit) with characteristics and use cases
- **test-priorities.md** - Priority classification (P0-P3) with execution timing and risk alignment
- **fixture-architecture.md** - Test fixture patterns with setup/teardown and auto-cleanup using Playwright's test.extend()
- **data-factories.md** - Factory patterns using @faker-js/faker for random test data generation with overrides
- **selective-testing.md** - Targeted test execution strategies for CI optimization
- **ci-burn-in.md** - Flaky test detection patterns (10 iterations to catch intermittent failures)
- **test-quality.md** - Test design principles (Given-When-Then, determinism, isolation, atomic assertions)
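
The factory pattern referenced in data-factories.md boils down to random defaults plus explicit overrides for whatever the test asserts on. A self-contained sketch is below; the `fake` object is a local stand-in for `@faker-js/faker` so the example runs without the dependency, and the `User` shape is hypothetical.

```typescript
// Stand-in for @faker-js/faker, so this sketch is self-contained.
const fake = {
  email: () => `user-${Math.random().toString(36).slice(2, 8)}@example.com`,
  name: () => `User ${Math.floor(Math.random() * 10_000)}`,
};

interface User {
  name: string;
  email: string;
  role: "member" | "admin";
}

// Factory: random defaults for fields the test ignores, explicit overrides
// for the fields the test actually asserts on (never hardcode the rest).
function createUser(overrides: Partial<User> = {}): User {
  return {
    name: fake.name(),
    email: fake.email(),
    role: "member",
    ...overrides,
  };
}

const admin = createUser({ role: "admin" });
```

Randomized defaults keep tests from silently coupling to shared fixture data; overrides keep the relevant values visible at the call site.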

**Healing Knowledge (If `auto_heal_failures` enabled):**

- **test-healing-patterns.md** - Common failure patterns and automated fixes (selectors, timing, data, network, hard waits)
- **selector-resilience.md** - Robust selector strategies and debugging (data-testid hierarchy, filter vs nth, anti-patterns)
- **timing-debugging.md** - Race condition identification and deterministic wait fixes (network-first, event-based waits)

See `tea-index.csv` for complete knowledge fragment mapping (22 fragments total).

## Example Output

### BMad-Integrated Mode

````markdown
# Automation Summary - User Authentication

**Date:** 2025-10-14
**Story:** Epic 3, Story 5
**Coverage Target:** critical-paths

## Tests Created

### E2E Tests (2 tests, P0-P1)

- `tests/e2e/user-authentication.spec.ts` (87 lines)
  - [P0] Login with valid credentials → Dashboard loads
  - [P1] Display error for invalid credentials

### API Tests (3 tests, P1-P2)

- `tests/api/auth.api.spec.ts` (102 lines)
  - [P1] POST /auth/login - valid credentials → 200 + token
  - [P1] POST /auth/login - invalid credentials → 401 + error
  - [P2] POST /auth/login - missing fields → 400 + validation

### Component Tests (2 tests, P1)

- `tests/component/LoginForm.test.tsx` (45 lines)
  - [P1] Empty fields → submit button disabled
  - [P1] Valid input → submit button enabled

## Infrastructure Created

- Fixtures: `tests/support/fixtures/auth.fixture.ts`
- Factories: `tests/support/factories/user.factory.ts`

## Test Execution

```bash
npm run test:e2e     # Run all tests
npm run test:e2e:p0  # Critical paths only
npm run test:e2e:p1  # P0 + P1 tests
```

## Coverage Analysis

**Total:** 7 tests (P0: 1, P1: 5, P2: 1)
**Levels:** E2E: 2, API: 3, Component: 2

✅ All acceptance criteria covered
✅ Happy path (E2E + API)
✅ Error cases (API)
✅ UI validation (Component)
````

### Standalone Mode

```markdown
# Automation Summary - src/auth/

**Date:** 2025-10-14
**Target:** src/auth/ (standalone analysis)
**Coverage Target:** critical-paths

## Feature Analysis

**Source Files Analyzed:**

- `src/auth/login.ts`
- `src/auth/session.ts`
- `src/auth/validation.ts`

**Existing Coverage:** 0 tests found

**Coverage Gaps:**

- ❌ No E2E tests for login flow
- ❌ No API tests for /auth/login endpoint
- ❌ No unit tests for validateEmail()

## Tests Created

{Same structure as BMad-Integrated mode}

## Recommendations

1. **High Priority (P0-P1):**
   - Add E2E test for password reset flow
   - Add API tests for token refresh endpoint

2. **Medium Priority (P2):**
   - Add unit tests for session timeout logic
```

# CI/CD Pipeline Setup Workflow

Scaffolds a production-ready CI/CD quality pipeline with test execution, burn-in loops for flaky test detection, parallel sharding, and artifact collection. This workflow creates platform-specific CI configuration optimized for fast feedback (< 45 min total) and reliable test execution with 20× speedup over sequential runs.

## Usage

```bash
bmad tea *ci
```

The TEA agent runs this workflow when:

- Test framework is configured and tests pass locally
- Team is ready to enable continuous integration
- Existing CI pipeline needs optimization or modernization
- Burn-in loop is needed for flaky test detection

## Inputs

**Required Context Files:**

- **Framework config** (playwright.config.ts, cypress.config.ts): Determines test commands and configuration
- **package.json**: Dependencies and scripts for caching strategy
- **.nvmrc**: Node version for CI (optional, defaults to Node 20 LTS)

**Optional Context Files:**

- **Existing CI config**: To update rather than create new
- **.git/config**: For CI platform auto-detection

**Workflow Variables:**

- `ci_platform`: Auto-detected (github-actions/gitlab-ci/circle-ci) or explicit
- `test_framework`: Detected from framework config (playwright/cypress)
- `parallel_jobs`: Number of parallel shards (default: 4)
- `burn_in_enabled`: Enable burn-in loop (default: true)
- `burn_in_iterations`: Burn-in iterations (default: 10)
- `selective_testing_enabled`: Run only changed tests (default: true)
- `artifact_retention_days`: Artifact storage duration (default: 30)
- `cache_enabled`: Enable dependency caching (default: true)

## Outputs

**Primary Deliverables:**

1. **CI Configuration File**
   - `.github/workflows/test.yml` (GitHub Actions)
   - `.gitlab-ci.yml` (GitLab CI)
   - Platform-specific optimizations and best practices

2. **Pipeline Stages**
   - **Lint**: Code quality checks (<2 min)
   - **Test**: Parallel execution with 4 shards (<10 min per shard)
   - **Burn-In**: Flaky test detection with 10 iterations (<30 min)
   - **Report**: Aggregate results and publish artifacts

3. **Helper Scripts**
   - `scripts/test-changed.sh`: Selective testing (run only affected tests)
   - `scripts/ci-local.sh`: Local CI mirror for debugging
   - `scripts/burn-in.sh`: Standalone burn-in execution

4. **Documentation**
   - `docs/ci.md`: Pipeline guide, debugging, secrets setup
   - `docs/ci-secrets-checklist.md`: Required secrets and configuration
   - Inline comments in CI configuration files

5. **Optimization Features**
   - Dependency caching (npm + browser binaries): 2-5 min savings
   - Parallel sharding: 75% time reduction
   - Retry logic: Handles transient failures (2 retries)
   - Failure-only artifacts: Cost-effective debugging

**Performance Targets:**

- Lint: <2 minutes
- Test (per shard): <10 minutes
- Burn-in: <30 minutes
- **Total: <45 minutes** (20× faster than sequential)

**Validation Safeguards:**

- ✅ Git repository initialized
- ✅ Local tests pass before CI setup
- ✅ Framework configuration exists
- ✅ CI platform accessible

## Key Features

### Burn-In Loop for Flaky Test Detection

**Critical production pattern:**

```yaml
burn-in:
  runs-on: ubuntu-latest
  steps:
    - run: |
        for i in {1..10}; do
          echo "🔥 Burn-in iteration $i/10"
          npm run test:e2e || exit 1
        done
```

**Purpose**: Runs tests 10 times to catch non-deterministic failures before they reach main branch.

**When to run:**

- On PRs to main/develop
- Weekly on cron schedule
- After test infrastructure changes

**Failure threshold**: Even ONE failure → tests are flaky, must fix before merging.
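
The same fail-on-first-failure semantics can be expressed as a small script, which is handy for local burn-in runs. This is an illustrative sketch: `runSuite` is a stand-in for shelling out to the real test command.

```typescript
// Run the suite up to `iterations` times; a single failure marks the
// tests flaky and stops the loop immediately (mirrors `|| exit 1` above).
async function burnIn(
  runSuite: () => Promise<boolean>, // stand-in for `npm run test:e2e`
  iterations = 10,
): Promise<{ passed: number; flaky: boolean }> {
  for (let i = 1; i <= iterations; i++) {
    const ok = await runSuite();
    if (!ok) return { passed: i - 1, flaky: true };
  }
  return { passed: iterations, flaky: false };
}
```

A suite that passes nine times and fails once is still flaky; the loop exists precisely to surface that one failure.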

### Parallel Sharding

**Splits tests across 4 jobs:**

```yaml
strategy:
  matrix:
    shard: [1, 2, 3, 4]
steps:
  - run: npm run test:e2e -- --shard=${{ matrix.shard }}/4
```

**Benefits:**

- 75% time reduction (40 min → 10 min per shard)
- Faster feedback on PRs
- Configurable shard count

### Smart Caching

**Node modules + browser binaries:**

```yaml
- uses: actions/cache@v4
  with:
    path: ~/.npm
    key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
```

**Benefits:**

- 2-5 min savings per run
- Consistent across builds
- Automatic invalidation on dependency changes
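
The "automatic invalidation" property falls directly out of the key shape in the YAML above: the key embeds a hash of the lockfile, so any dependency change produces a new key and a cache miss. A sketch of that behavior, assuming the same OS + runtime + lockfile-hash shape (`makeCacheKey` is a hypothetical helper):

```typescript
import { createHash } from "node:crypto";

// Mirrors the CI cache key: OS + runtime + hash of the lockfile contents.
// A dependency change rewrites the lockfile, changes the hash, and thereby
// invalidates the cache with no manual bookkeeping.
function makeCacheKey(os: string, lockfileContents: string): string {
  const digest = createHash("sha256")
    .update(lockfileContents)
    .digest("hex")
    .slice(0, 12);
  return `${os}-node-${digest}`;
}

const keyA = makeCacheKey("linux", '{"lodash":"4.17.21"}');
const keyB = makeCacheKey("linux", '{"lodash":"4.17.21"}'); // same deps → same key
const keyC = makeCacheKey("linux", '{"lodash":"4.17.22"}'); // bumped dep → new key
```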

### Selective Testing

**Run only tests affected by code changes:**

```bash
# scripts/test-changed.sh
CHANGED_FILES=$(git diff --name-only HEAD~1)
# Derive a grep filter from the changed feature dirs (e.g. src/auth/ -> auth)
AFFECTED_TESTS=$(echo "$CHANGED_FILES" | sed -n 's|^src/\([^/]*\)/.*|\1|p' | sort -u | paste -sd '|' -)
npm run test:e2e -- --grep="$AFFECTED_TESTS"
```

**Benefits:**

- 50-80% time reduction for focused PRs
- Faster feedback cycle
- Full suite still runs on main branch

### Failure-Only Artifacts

**Upload debugging materials only on test failures:**

- Traces (Playwright): 5-10 MB per test
- Screenshots: 100-500 KB each
- Videos: 2-5 MB per test
- HTML reports: 1-2 MB

**Benefits:**

- Reduces storage costs by 90%
- Maintains full debugging capability
- 30-day retention default

### Local CI Mirror

**Debug CI failures locally:**

```bash
./scripts/ci-local.sh
# Runs: lint → test → burn-in (3 iterations)
```

**Mirrors CI environment:**

- Same Node version
- Same commands
- Reduced burn-in (3 vs 10 for faster feedback)

### Knowledge Base Integration

Automatically consults TEA knowledge base:

- `ci-burn-in.md` - Burn-in loop patterns and iterations
- `selective-testing.md` - Changed test detection strategies
- `visual-debugging.md` - Artifact collection best practices
- `test-quality.md` - CI-specific quality criteria

## Integration with Other Workflows

**Before ci:**

- **framework**: Sets up test infrastructure and configuration
- **test-design** (optional): Plans test coverage strategy

**After ci:**

- **atdd**: Generate failing tests that run in CI
- **automate**: Expand test coverage that CI executes
- **trace (Phase 2)**: Use CI results for quality gate decisions

**Coordinates with:**

- **dev-story**: Tests run in CI after story implementation
- **retrospective**: CI metrics inform process improvements

**Updates:**

- `bmm-workflow-status.md`: Adds CI setup to Quality & Testing Progress section

## Important Notes

### CI Platform Auto-Detection

**GitHub Actions** (default):

- Auto-selected if `github.com` in git remote
- Free 2000 min/month for private repos
- Unlimited for public repos
- `.github/workflows/test.yml`

**GitLab CI**:

- Auto-selected if `gitlab.com` in git remote
- Free 400 min/month
- `.gitlab-ci.yml`

**Circle CI** / **Jenkins**:

- User must specify explicitly
- Templates provided for both

### Burn-In Strategy

**Iterations:**

- **3**: Quick feedback (local development)
- **10**: Standard (PR checks) ← recommended
- **100**: High-confidence (release branches)

**When to run:**

- ✅ On PRs to main/develop
- ✅ Weekly scheduled (cron)
- ✅ After test infra changes
- ❌ Not on every commit (too slow)

**Cost-benefit:**

- 30 minutes of CI time → Prevents hours of debugging flaky tests

### Artifact Collection Strategy

**Failure-only collection:**

- Saves 90% storage costs
- Maintains debugging capability
- Automatic cleanup after retention period

**What to collect:**

- Traces: Full execution context (Playwright)
- Screenshots: Visual evidence
- Videos: Interaction playback
- HTML reports: Detailed results
- Console logs: Error messages

**What NOT to collect:**

- Passing test artifacts (waste of space)
- Large binaries
- Sensitive data (use secrets instead)

### Selective Testing Trade-offs

**Benefits:**

- 50-80% time reduction for focused changes
- Faster feedback loop
- Lower CI costs

**Risks:**

- May miss integration issues
- Relies on accurate change detection
- False positives if detection is too aggressive

**Mitigation:**

- Always run full suite on merge to main
- Use burn-in loop on main branch
- Monitor for missed issues

### Parallelism Configuration

**4 shards** (default):

- Optimal for 40-80 test files
- ~10 min per shard
- Balances speed vs resource usage

**Adjust if:**

- Tests complete in <5 min → reduce shards
- Tests take >15 min → increase shards
- CI limits concurrent jobs → reduce shards

**Formula:**

```
Total test time / Target shard time = Optimal shards
Example: 40 min / 10 min = 4 shards
```
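
In code, the formula needs a ceiling so leftover minutes still get a shard (45 min at a 10-min target needs 5 shards, not 4.5). A one-function sketch:

```typescript
// Total test time / target shard time, rounded up, with at least one shard.
function optimalShards(totalMinutes: number, targetPerShard: number): number {
  return Math.max(1, Math.ceil(totalMinutes / targetPerShard));
}
```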

### Retry Logic

**2 retries** (default):

- Handles transient network issues
- Mitigates race conditions
- Does NOT mask flaky tests (burn-in catches those)

**When retries trigger:**

- Network timeouts
- Service unavailability
- Resource constraints

**When retries DON'T help:**

- Assertion failures (logic errors)
- Flaky tests (non-deterministic)
- Configuration errors
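
The distinction between the two lists above is easiest to see in code: retry only errors marked transient, and let everything else fail on the first attempt. This is an illustrative sketch; `TransientError` and `withRetries` are hypothetical names, and real runners classify transience differently.

```typescript
// Hypothetical marker for transient failures (timeouts, flaky network).
class TransientError extends Error {}

// Retry transient failures up to `retries` extra times; assertion and
// configuration errors are rethrown immediately so retries never mask them.
async function withRetries<T>(fn: () => Promise<T>, retries = 2): Promise<T> {
  let lastErr: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (!(err instanceof TransientError)) throw err; // logic errors fail fast
      lastErr = err;
    }
  }
  throw lastErr;
}
```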

### Notification Setup (Optional)

**Supported channels:**

- Slack: Webhook integration
- Email: SMTP configuration
- Discord: Webhook integration

**Configuration:**

```yaml
notify_on_failure: true
notification_channels: 'slack'
# Requires SLACK_WEBHOOK secret in CI settings
```

**Best practice:** Enable for main/develop branches only, not PRs.

## Validation Checklist

After workflow completion, verify:

- [ ] CI configuration file created and syntactically valid
- [ ] Burn-in loop configured (10 iterations)
- [ ] Parallel sharding enabled (4 jobs)
- [ ] Caching configured (dependencies + browsers)
- [ ] Artifact collection on failure only
- [ ] Helper scripts created and executable
- [ ] Documentation complete (ci.md, secrets checklist)
- [ ] No errors or warnings during scaffold
- [ ] First CI run triggered and passes

Refer to `checklist.md` for comprehensive validation criteria.

## Example Execution

**Scenario 1: New GitHub Actions setup**

```bash
bmad tea *ci

# TEA detects:
# - GitHub repository (github.com in git remote)
# - Playwright framework
# - Node 20 from .nvmrc
# - 60 test files

# TEA scaffolds:
# - .github/workflows/test.yml
# - 4-shard parallel execution
# - Burn-in loop (10 iterations)
# - Dependency + browser caching
# - Failure artifacts (traces, screenshots)
# - Helper scripts
# - Documentation

# Result:
# Total CI time: 42 minutes (was 8 hours sequential)
# - Lint: 1.5 min
# - Test (4 shards): 9 min each
# - Burn-in: 28 min
```

**Scenario 2: Update existing GitLab CI**

```bash
bmad tea *ci

# TEA detects:
# - Existing .gitlab-ci.yml
# - Cypress framework
# - No caching configured

# TEA asks: "Update existing CI or create new?"
# User: "Update"

# TEA enhances:
# - Adds burn-in job
# - Configures caching (cache: paths)
# - Adds parallel: 4
# - Updates artifact collection
# - Documents secrets needed

# Result:
# CI time reduced from 45 min → 12 min
```

**Scenario 3: Standalone burn-in setup**

```bash
# User wants only burn-in, no full CI
bmad tea *ci
# Set burn_in_enabled: true, skip other stages

# TEA creates:
# - Minimal workflow with burn-in only
# - scripts/burn-in.sh for local testing
# - Documentation for running burn-in

# Use case:
# - Validate test stability before full CI setup
# - Debug intermittent failures
# - Confidence check before release
```

## Troubleshooting

**Issue: "Git repository not found"**

- **Cause**: No .git/ directory
- **Solution**: Run `git init` and `git remote add origin <url>`

**Issue: "Tests fail locally but should set up CI anyway"**

- **Cause**: Workflow halts if local tests fail
- **Solution**: Fix tests first, or temporarily skip preflight (not recommended)

**Issue: "CI takes longer than 10 min per shard"**

- **Cause**: Too many tests per shard
- **Solution**: Increase shard count (e.g., 4 → 8)

**Issue: "Burn-in passes locally but fails in CI"**

- **Cause**: Environment differences (timing, resources)
- **Solution**: Use `scripts/ci-local.sh` to mirror CI environment

**Issue: "Caching not working"**

- **Cause**: Cache key mismatch or cache limit exceeded
- **Solution**: Check cache key formula, verify platform limits

## Related Workflows

- **framework**: Set up test infrastructure → [framework/README.md](../framework/README.md)
- **atdd**: Generate acceptance tests → [atdd/README.md](../atdd/README.md)
- **automate**: Expand test coverage → [automate/README.md](../automate/README.md)
- **trace**: Traceability and quality gate decisions → [trace/README.md](../trace/README.md)

## Version History

- **v4.0 (BMad v6)**: Pure markdown instructions, enhanced workflow.yaml, burn-in loop integration
- **v3.x**: XML format instructions, basic CI setup
- **v2.x**: Legacy task-based approach

# Test Framework Setup Workflow

Initializes a production-ready test framework architecture (Playwright or Cypress) with fixtures, helpers, configuration, and industry best practices. This workflow scaffolds the complete testing infrastructure for modern web applications, providing a robust foundation for test automation.

## Usage

```bash
bmad tea *framework
```

The TEA agent runs this workflow when:

- Starting a new project that needs test infrastructure
- Migrating from an older testing approach
- Setting up testing from scratch
- Standardizing test architecture across teams

## Inputs

**Required Context Files:**

- **package.json**: Project dependencies and scripts to detect project type and bundler

**Optional Context Files:**

- **Architecture docs** (architecture.md, tech-spec.md): Informs framework configuration decisions
- **Existing tests**: Detects current framework to avoid conflicts

**Workflow Variables:**

- `test_framework`: Auto-detected (playwright/cypress) or manually specified
- `project_type`: Auto-detected from package.json (react/vue/angular/next/node)
- `bundler`: Auto-detected from package.json (vite/webpack/rollup/esbuild)
- `test_dir`: Root test directory (default: `{project-root}/tests`)
- `use_typescript`: Prefer TypeScript configuration (default: true)
- `framework_preference`: Auto-detection or force specific framework (default: "auto")

## Outputs

**Primary Deliverables:**

1. **Configuration File**
   - `playwright.config.ts` or `cypress.config.ts` with production-ready settings
   - Timeouts: action 15s, navigation 30s, test 60s
   - Reporters: HTML + JUnit XML
   - Failure-only artifacts (traces, screenshots, videos)

2. **Directory Structure**

   ```
   tests/
   ├── e2e/                 # Test files (organize as needed)
   ├── support/             # Framework infrastructure (key pattern)
   │   ├── fixtures/        # Test fixtures with auto-cleanup
   │   │   ├── index.ts     # Fixture merging
   │   │   └── factories/   # Data factories (faker-based)
   │   ├── helpers/         # Utility functions
   │   └── page-objects/    # Page object models (optional)
   └── README.md            # Setup and usage guide
   ```

   **Note**: Test organization (e2e/, api/, integration/, etc.) is flexible. The **support/** folder contains reusable fixtures, helpers, and factories - the core framework pattern.

3. **Environment Configuration**
   - `.env.example` with `TEST_ENV`, `BASE_URL`, `API_URL`, auth credentials
   - `.nvmrc` with Node version (LTS)

4. **Test Infrastructure**
   - Fixture architecture using `mergeTests` pattern
   - Data factories with auto-cleanup (faker-based)
   - Sample tests demonstrating best practices
   - Helper utilities for common operations

5. **Documentation**
   - `tests/README.md` with comprehensive setup instructions
   - Inline comments explaining configuration choices
   - References to TEA knowledge base

**Secondary Deliverables:**

- Updated `package.json` with minimal test script (`test:e2e`)
- Sample test demonstrating fixture usage
- Network-first testing patterns
- Selector strategy guidance (data-testid)

**Validation Safeguards:**

- ✅ No existing framework detected (prevents conflicts)
- ✅ package.json exists and is valid
- ✅ Framework auto-detection successful or explicit choice provided
- ✅ Sample test runs successfully
- ✅ All generated files are syntactically correct

## Key Features

### Smart Framework Selection

- **Auto-detection logic** based on project characteristics:
  - **Playwright** recommended for: Large repos (100+ files), performance-critical apps, multi-browser support, complex debugging needs
  - **Cypress** recommended for: Small teams prioritizing DX, component testing focus, real-time test development
- Falls back to Playwright as default if uncertain

### Production-Ready Patterns

- **Fixture Architecture**: Pure function → fixture → `mergeTests` composition pattern
- **Auto-Cleanup**: Fixtures automatically clean up test data in teardown
- **Network-First**: Route interception before navigation to prevent race conditions
- **Failure-Only Artifacts**: Screenshots/videos/traces only captured on failure to reduce storage
- **Parallel Execution**: Configured for optimal CI performance
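
The fixture-with-auto-cleanup idea behind the first two bullets can be sketched without Playwright. The `Fixture` type below is a minimal stand-in for `test.extend()` semantics, written only to show the setup → use → teardown shape; it is not the real Playwright API.

```typescript
// A fixture hands a value to the test body, then always tears down —
// even when the test throws — so test data never leaks between tests.
type Fixture<T> = (use: (value: T) => Promise<void>) => Promise<void>;

// Pretend backing store, so cleanup is observable in this sketch.
const createdUsers: string[] = [];

const userFixture: Fixture<string> = async (use) => {
  const id = `user-${createdUsers.length + 1}`;
  createdUsers.push(id); // setup: create test data
  try {
    await use(id); // run the test body with the fixture value
  } finally {
    createdUsers.pop(); // teardown: auto-cleanup, even on failure
  }
};

async function runWithFixture(test: (user: string) => Promise<void>) {
  await userFixture(test);
}
```

In the real framework, `test.extend()` wires fixtures like this into the runner, and `mergeTests` composes several fixture sets into one `test` object.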

### Industry Best Practices

- **Selector Strategy**: Prescriptive guidance on `data-testid` attributes
- **Data Factories**: Faker-based factories for realistic test data
- **Contract Testing**: Recommends Pact for microservices architectures
- **Error Handling**: Comprehensive timeout and retry configuration
- **Reporting**: Multiple reporter formats (HTML, JUnit, console)

### Knowledge Base Integration

Automatically consults TEA knowledge base:

- `fixture-architecture.md` - Pure function → fixture → mergeTests pattern
- `data-factories.md` - Faker-based factories with auto-cleanup
- `network-first.md` - Network interception before navigation
- `playwright-config.md` - Playwright-specific best practices
- `test-config.md` - General configuration guidelines

## Integration with Other Workflows

**Before framework:**

- **prd** (Phase 2): Determines project scope and testing needs
- **workflow-status**: Verifies project readiness

**After framework:**

- **ci**: Scaffold CI/CD pipeline using framework configuration
- **test-design**: Plan test coverage strategy for the project
- **atdd**: Generate failing acceptance tests using the framework

**Coordinates with:**

- **architecture** (Phase 3): Aligns test structure with system architecture
- **tech-spec**: Uses technical specifications to inform test configuration

**Updates:**

- `bmm-workflow-status.md`: Adds framework initialization to Quality & Testing Progress section

## Important Notes

### Preflight Checks

**Critical requirements** verified before scaffolding:

- package.json exists in project root
- No modern E2E framework already configured
- Architecture/stack context available

If any check fails, workflow **HALTS** and notifies user.

### Framework-Specific Guidance

**Playwright Advantages:**

- Worker parallelism (significantly faster for large suites)
- Trace viewer (powerful debugging with screenshots, network, console logs)
- Multi-language support (TypeScript, JavaScript, Python, C#, Java)
- Built-in API testing capabilities
- Better handling of multiple browser contexts

**Cypress Advantages:**

- Superior developer experience (real-time reloading)
- Excellent for component testing
- Simpler setup for small teams
- Better suited for watch mode during development

**Avoid Cypress when:**

- API chains are heavy and complex
- Multi-tab/window scenarios are common
- Worker parallelism is critical for CI performance

### Selector Strategy

**Always recommend:**

- `data-testid` attributes for UI elements (framework-agnostic)
- `data-cy` attributes if Cypress is chosen (Cypress-specific)
- Avoid brittle CSS selectors or XPath
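
A tiny helper makes the recommended strategy uniform across a suite and keeps the attribute choice in one place. This is a sketch; `byTestId` is a hypothetical project helper, not part of either framework.

```typescript
// Build an attribute selector for the project's test-id convention.
// `attr` lets a Cypress project switch to data-cy without touching call sites.
function byTestId(
  id: string,
  attr: "data-testid" | "data-cy" = "data-testid",
): string {
  return `[${attr}="${id}"]`;
}

const submit = byTestId("submit"); // '[data-testid="submit"]'
```

Centralizing this also makes it trivial to lint against raw CSS-class or XPath selectors sneaking into tests.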

### Standalone Operation

This workflow operates independently:

- **No story required**: Can be run at project initialization
- **No epic context needed**: Works for greenfield and brownfield projects
- **Autonomous**: Auto-detects configuration and proceeds without user input

### Output Summary Format

After completion, provides structured summary:

```markdown
## Framework Scaffold Complete

**Framework Selected**: Playwright (or Cypress)

**Artifacts Created**:

- ✅ Configuration file: playwright.config.ts
- ✅ Directory structure: tests/e2e/, tests/support/
- ✅ Environment config: .env.example
- ✅ Node version: .nvmrc
- ✅ Fixture architecture: tests/support/fixtures/
- ✅ Data factories: tests/support/fixtures/factories/
- ✅ Sample tests: tests/e2e/example.spec.ts
- ✅ Documentation: tests/README.md

**Next Steps**:

1. Copy .env.example to .env and fill in environment variables
2. Run npm install to install test dependencies
3. Run npm run test:e2e to execute sample tests
4. Review tests/README.md for detailed setup instructions

**Knowledge Base References Applied**:

- Fixture architecture pattern (pure functions + mergeTests)
- Data factories with auto-cleanup (faker-based)
- Network-first testing safeguards
- Failure-only artifact capture
```

## Validation Checklist

After workflow completion, verify:

- [ ] Configuration file created and syntactically valid
- [ ] Directory structure exists with all folders
- [ ] Environment configuration generated (.env.example, .nvmrc)
- [ ] Sample tests run successfully (npm run test:e2e)
- [ ] Documentation complete and accurate (tests/README.md)
- [ ] No errors or warnings during scaffold
- [ ] package.json scripts updated correctly
- [ ] Fixtures and factories follow patterns from knowledge base

Refer to `checklist.md` for comprehensive validation criteria.

## Example Execution

**Scenario 1: New React + Vite project**

```bash
# User runs framework workflow
bmad tea *framework

# TEA detects:
# - React project (from package.json)
# - Vite bundler
# - No existing test framework
# - 150+ files (recommends Playwright)

# TEA scaffolds:
# - playwright.config.ts with Vite detection
# - Component testing configuration
# - React Testing Library helpers
# - Sample component + E2E tests
```

**Scenario 2: Existing Node.js API project**

```bash
# User runs framework workflow
bmad tea *framework

# TEA detects:
# - Node.js backend (no frontend framework)
# - Express framework
# - Small project (50 files)
# - API endpoints in routes/

# TEA scaffolds:
# - playwright.config.ts focused on API testing
# - tests/api/ directory structure
# - API helper utilities
# - Sample API tests with auth
```

**Scenario 3: Cypress preferred (explicit)**

```bash
# User sets framework preference
# (in workflow config: framework_preference: "cypress")

bmad tea *framework

# TEA scaffolds:
# - cypress.config.ts
# - tests/e2e/ with Cypress patterns
# - Cypress-specific commands
# - data-cy selector strategy
```

## Troubleshooting

**Issue: "Existing test framework detected"**

- **Cause**: playwright.config.* or cypress.config.* already exists
- **Solution**: Use `upgrade-framework` workflow (TBD) or manually remove existing config

**Issue: "Cannot detect project type"**

- **Cause**: package.json missing or malformed
- **Solution**: Ensure package.json exists and has valid dependencies

**Issue: "Sample test fails to run"**

- **Cause**: Missing dependencies or incorrect BASE_URL
- **Solution**: Run `npm install` and configure `.env` with correct URLs

**Issue: "TypeScript compilation errors"**

- **Cause**: Missing @types packages or tsconfig misconfiguration
- **Solution**: Ensure TypeScript and type definitions are installed

## Related Workflows

- **ci**: Scaffold CI/CD pipeline → [ci/README.md](../ci/README.md)
- **test-design**: Plan test coverage → [test-design/README.md](../test-design/README.md)
- **atdd**: Generate acceptance tests → [atdd/README.md](../atdd/README.md)
- **automate**: Expand regression suite → [automate/README.md](../automate/README.md)

## Version History

- **v4.0 (BMad v6)**: Pure markdown instructions, enhanced workflow.yaml, comprehensive README
- **v3.x**: XML format instructions
- **v2.x**: Legacy task-based approach

@@ -1,469 +0,0 @@

# Non-Functional Requirements Assessment Workflow

**Workflow ID:** `testarch-nfr`
**Agent:** Test Architect (TEA)
**Command:** `bmad tea *nfr-assess`

---

## Overview

The **nfr-assess** workflow performs a comprehensive assessment of non-functional requirements (NFRs) to validate that the implementation meets performance, security, reliability, and maintainability standards before release. It uses evidence-based validation with deterministic PASS/CONCERNS/FAIL rules and provides actionable recommendations for remediation.

**Key Features:**

- Assess multiple NFR categories (performance, security, reliability, maintainability, custom)
- Validate NFRs against defined thresholds from tech specs, PRD, or defaults
- Classify status deterministically (PASS/CONCERNS/FAIL) based on evidence
- Never guess thresholds - mark as CONCERNS if unknown
- Generate CI/CD-ready YAML snippets for quality gates
- Provide quick wins and recommended actions for remediation
- Create evidence checklists for gaps

---

## When to Use This Workflow

Use `*nfr-assess` when you need to:

- ✅ Validate non-functional requirements before release
- ✅ Assess performance against defined thresholds
- ✅ Verify security requirements are met
- ✅ Validate reliability and error handling
- ✅ Check maintainability standards (coverage, quality, documentation)
- ✅ Generate NFR assessment reports for stakeholders
- ✅ Create gate-ready metrics for CI/CD pipelines

**Typical Timing:**

- Before release (validate all NFRs)
- Before PR merge (validate critical NFRs)
- During sprint retrospectives (assess maintainability)
- After performance testing (validate performance NFRs)
- After security audit (validate security NFRs)

---

## Prerequisites

**Required:**

- Implementation deployed locally or accessible for evaluation
- Evidence sources available (test results, metrics, logs, CI results)

**Recommended:**

- NFR requirements defined in tech-spec.md, PRD.md, or story
- Test results from performance, security, and reliability tests
- Application metrics (response times, error rates, throughput)
- CI/CD pipeline results for burn-in validation

**Halt Conditions:**

- NFR targets are undefined and cannot be obtained → Halt and request definition
- Implementation is not accessible for evaluation → Halt and request deployment

---

## Usage

### Basic Usage (BMad Mode)

```bash
bmad tea *nfr-assess
```

The workflow will:

1. Read tech-spec.md for NFR requirements
2. Gather evidence from test results, metrics, logs
3. Assess each NFR category against thresholds
4. Generate NFR assessment report
5. Save to `bmad/output/nfr-assessment.md`

### Standalone Mode (No Tech Spec)

```bash
bmad tea *nfr-assess --feature-name "User Authentication"
```

### Custom Configuration

```bash
bmad tea *nfr-assess \
  --assess-performance true \
  --assess-security true \
  --assess-reliability true \
  --assess-maintainability true \
  --performance-response-time-ms 500 \
  --security-score-min 85
```

---

## Workflow Steps

1. **Load Context** - Read tech spec, PRD, knowledge base fragments
2. **Identify NFRs** - Determine categories and thresholds
3. **Gather Evidence** - Read test results, metrics, logs, CI results
4. **Assess NFRs** - Apply deterministic PASS/CONCERNS/FAIL rules
5. **Identify Actions** - Quick wins, recommended actions, monitoring hooks
6. **Generate Deliverables** - NFR assessment report, gate YAML, evidence checklist

---

## Outputs

### NFR Assessment Report (`nfr-assessment.md`)

Comprehensive markdown file with:

- Executive summary (overall status, critical issues)
- Assessment by category (performance, security, reliability, maintainability)
- Evidence for each NFR (test results, metrics, thresholds)
- Status classification (PASS/CONCERNS/FAIL)
- Quick wins section
- Recommended actions section
- Evidence gaps checklist

### Gate YAML Snippet (Optional)

```yaml
nfr_assessment:
  date: '2025-10-14'
  categories:
    performance: 'PASS'
    security: 'CONCERNS'
    reliability: 'PASS'
    maintainability: 'PASS'
  overall_status: 'CONCERNS'
  critical_issues: 0
  high_priority_issues: 1
  concerns: 1
  blockers: false
```
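
A CI step can consume this snippet programmatically. The following is a minimal sketch, not the workflow's actual implementation; it assumes a worst-status-wins aggregation rule over the per-category fields shown above:

```python
# Hypothetical sketch: derive the overall gate status from per-category
# results. Field names mirror the YAML snippet above; the worst-status-wins
# aggregation rule is an assumption, not the workflow's documented behavior.
SEVERITY = {"PASS": 0, "CONCERNS": 1, "FAIL": 2}

def overall_status(categories: dict[str, str]) -> str:
    # The gate takes the worst status across all assessed categories
    return max(categories.values(), key=lambda s: SEVERITY[s])

def is_blocked(categories: dict[str, str]) -> bool:
    # Only FAIL blocks release; CONCERNS is a warning to address
    return overall_status(categories) == "FAIL"

gate = {"performance": "PASS", "security": "CONCERNS",
        "reliability": "PASS", "maintainability": "PASS"}
print(overall_status(gate))  # CONCERNS
print(is_blocked(gate))      # False
```

With the example values above, one CONCERNS category makes the overall status CONCERNS without blocking the pipeline.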

### Evidence Checklist (Optional)

- List of NFRs with missing or incomplete evidence
- Owners for evidence collection
- Suggested evidence sources
- Deadlines for evidence collection

---

## NFR Categories

### Performance

**Criteria:** Response time, throughput, resource usage, scalability
**Thresholds (Default):**

- Response time p95: 500ms
- Throughput: 100 RPS
- CPU usage: < 70%
- Memory usage: < 80%

**Evidence Sources:** Load test results, APM data, Lighthouse reports, Playwright traces

---

### Security

**Criteria:** Authentication, authorization, data protection, vulnerability management
**Thresholds (Default):**

- Security score: >= 85/100
- Critical vulnerabilities: 0
- High vulnerabilities: < 3
- MFA enabled

**Evidence Sources:** SAST results, DAST results, dependency scanning, pentest reports

---

### Reliability

**Criteria:** Availability, error handling, fault tolerance, disaster recovery
**Thresholds (Default):**

- Uptime: >= 99.9%
- Error rate: < 0.1%
- MTTR: < 15 minutes
- CI burn-in: 100 consecutive runs

**Evidence Sources:** Uptime monitoring, error logs, CI burn-in results, chaos tests

---

### Maintainability

**Criteria:** Code quality, test coverage, documentation, technical debt
**Thresholds (Default):**

- Test coverage: >= 80%
- Code quality: >= 85/100
- Technical debt: < 5%
- Documentation: >= 90%

**Evidence Sources:** Coverage reports, static analysis, documentation audit, test review

---

## Assessment Rules

### PASS ✅

- Evidence exists AND meets or exceeds threshold
- No concerns flagged in evidence
- Quality is acceptable

### CONCERNS ⚠️

- Threshold is UNKNOWN (not defined)
- Evidence is MISSING or INCOMPLETE
- Evidence is close to threshold (within 10%)
- Evidence shows intermittent issues

### FAIL ❌

- Evidence exists BUT does not meet threshold
- Critical evidence is MISSING
- Evidence shows consistent failures
- Quality is unacceptable

---

## Configuration

### workflow.yaml Variables

```yaml
variables:
  # NFR categories to assess
  assess_performance: true
  assess_security: true
  assess_reliability: true
  assess_maintainability: true

  # Custom NFR categories
  custom_nfr_categories: '' # e.g., "accessibility,compliance"

  # Evidence sources
  test_results_dir: '{project-root}/test-results'
  metrics_dir: '{project-root}/metrics'
  logs_dir: '{project-root}/logs'
  include_ci_results: true

  # Thresholds
  performance_response_time_ms: 500
  performance_throughput_rps: 100
  security_score_min: 85
  reliability_uptime_pct: 99.9
  maintainability_coverage_pct: 80

  # Assessment configuration
  use_deterministic_rules: true
  never_guess_thresholds: true
  require_evidence: true
  suggest_monitoring: true

  # Output configuration
  output_file: '{output_folder}/nfr-assessment.md'
  generate_gate_yaml: true
  generate_evidence_checklist: true
```

---

## Knowledge Base Integration

This workflow automatically loads relevant knowledge fragments:

- `nfr-criteria.md` - Non-functional requirements criteria
- `ci-burn-in.md` - CI/CD burn-in patterns for reliability
- `test-quality.md` - Test quality expectations (maintainability)
- `playwright-config.md` - Performance configuration patterns

---

## Examples

### Example 1: Full NFR Assessment Before Release

```bash
bmad tea *nfr-assess
```

**Output:**

```markdown
# NFR Assessment - Story 1.3

**Overall Status:** PASS ✅ (No blockers)

## Performance Assessment

- Response Time p95: PASS ✅ (320ms < 500ms threshold)
- Throughput: PASS ✅ (250 RPS > 100 RPS threshold)

## Security Assessment

- Authentication: PASS ✅ (MFA enforced)
- Data Protection: PASS ✅ (AES-256 + TLS 1.3)

## Reliability Assessment

- Uptime: PASS ✅ (99.95% > 99.9% threshold)
- Error Rate: PASS ✅ (0.05% < 0.1% threshold)

## Maintainability Assessment

- Test Coverage: PASS ✅ (87% > 80% threshold)
- Code Quality: PASS ✅ (92/100 > 85/100 threshold)

Gate Status: PASS ✅ - Ready for release
```

### Example 2: NFR Assessment with Concerns

```bash
bmad tea *nfr-assess --feature-name "User Authentication"
```

**Output:**

```markdown
# NFR Assessment - User Authentication

**Overall Status:** CONCERNS ⚠️ (1 HIGH issue)

## Security Assessment

### Authentication Strength

- **Status:** CONCERNS ⚠️
- **Threshold:** MFA enabled for all users
- **Actual:** MFA optional (not enforced)
- **Evidence:** Security audit (security-audit-2025-10-14.md)
- **Recommendation:** HIGH - Enforce MFA for all new accounts

## Quick Wins

1. **Enforce MFA (Security)** - HIGH - 4 hours
   - Add configuration flag to enforce MFA
   - No code changes needed

Gate Status: CONCERNS ⚠️ - Address HIGH priority issues before release
```

### Example 3: Performance-Only Assessment

```bash
bmad tea *nfr-assess \
  --assess-performance true \
  --assess-security false \
  --assess-reliability false \
  --assess-maintainability false
```

---

## Troubleshooting

### "NFR thresholds not defined"

- Check tech-spec.md for NFR requirements
- Check PRD.md for product-level SLAs
- Check story file for feature-specific requirements
- If thresholds truly unknown, mark as CONCERNS and recommend defining them

### "No evidence found"

- Check evidence directories (test-results, metrics, logs)
- Check CI/CD pipeline for test results
- If evidence truly missing, mark NFR as "NO EVIDENCE" and recommend generating it

### "CONCERNS status but no threshold exceeded"

- CONCERNS is correct when threshold is UNKNOWN or evidence is MISSING/INCOMPLETE
- CONCERNS is also correct when evidence is close to threshold (within 10%)
- Document why CONCERNS was assigned in assessment report

### "FAIL status blocks release"

- This is intentional - FAIL means critical NFR not met
- Recommend remediation actions with specific steps
- Re-run assessment after remediation

---

## Integration with Other Workflows

- **testarch-test-design** → `*nfr-assess` - Define NFR requirements, then assess
- **testarch-framework** → `*nfr-assess` - Set up frameworks, then validate NFRs
- **testarch-ci** → `*nfr-assess` - Configure CI, then assess reliability with burn-in
- `*nfr-assess` → **testarch-trace (Phase 2)** - Assess NFRs, then apply quality gates
- `*nfr-assess` → **testarch-test-review** - Assess maintainability, then review tests

---

## Best Practices

1. **Never Guess Thresholds**
   - If threshold is unknown, mark as CONCERNS
   - Recommend defining threshold in tech-spec.md
   - Don't infer thresholds from similar features

2. **Evidence-Based Assessment**
   - Every assessment must be backed by evidence
   - Mark NFRs without evidence as "NO EVIDENCE"
   - Don't assume or infer - require explicit evidence

3. **Deterministic Rules**
   - Apply PASS/CONCERNS/FAIL consistently
   - Document reasoning for each classification
   - Use same rules across all NFR categories

4. **Actionable Recommendations**
   - Provide specific steps, not generic advice
   - Include priority, effort estimate, owner suggestion
   - Focus on quick wins first

5. **Gate Integration**
   - Enable `generate_gate_yaml` for CI/CD integration
   - Use YAML snippets in pipeline quality gates
   - Export metrics for dashboard visualization

---

## Quality Gates

| Status      | Criteria                     | Action                      |
| ----------- | ---------------------------- | --------------------------- |
| PASS ✅     | All NFRs have PASS status    | Ready for release           |
| CONCERNS ⚠️ | Any NFR has CONCERNS status  | Address before next release |
| FAIL ❌     | Critical NFR has FAIL status | Do not release - BLOCKER    |

---

## Related Commands

- `bmad tea *test-design` - Define NFR requirements and test plan
- `bmad tea *framework` - Set up performance/security testing frameworks
- `bmad tea *ci` - Configure CI/CD for NFR validation
- `bmad tea *trace` (Phase 2) - Apply quality gates using NFR assessment metrics
- `bmad tea *test-review` - Review test quality (maintainability NFR)

---

## Resources

- [Instructions](./instructions.md) - Detailed workflow steps
- [Checklist](./checklist.md) - Validation checklist
- [Template](./nfr-report-template.md) - NFR assessment report template
- [Knowledge Base](../../testarch/knowledge/) - NFR criteria and best practices

---

<!-- Powered by BMAD-CORE™ -->

@@ -1,493 +0,0 @@

# Test Design and Risk Assessment Workflow

Plans a comprehensive test coverage strategy with risk assessment (probability × impact scoring), priority classification (P0-P3), and resource estimation. This workflow generates a test design document that identifies high-risk areas, maps requirements to appropriate test levels, and provides execution ordering for optimal feedback.

## Usage

```bash
bmad tea *test-design
```

The TEA agent runs this workflow when:

- Planning test coverage before development starts
- Assessing risks for an epic or story
- Prioritizing test scenarios by business impact
- Estimating testing effort and resources

## Inputs

**Required Context Files:**

- **Story markdown**: Acceptance criteria and requirements
- **PRD or epics.md**: High-level product context
- **Architecture docs** (optional): Technical constraints and integration points

**Workflow Variables:**

- `epic_num`: Epic number for scoped design
- `story_path`: Specific story for design (optional)
- `design_level`: full/targeted/minimal (default: full)
- `risk_threshold`: Score for high-priority flag (default: 6)
- `risk_categories`: TECH,SEC,PERF,DATA,BUS,OPS (all enabled)
- `priority_levels`: P0,P1,P2,P3 (all enabled)

## Outputs

**Primary Deliverable:**

**Test Design Document** (`test-design-epic-{N}.md`):

1. **Risk Assessment Matrix**
   - Risk ID, category, description
   - Probability (1-3) × Impact (1-3) = Score
   - Scores ≥6 flagged as high-priority
   - Mitigation plans with owners and timelines

2. **Coverage Matrix**
   - Requirement → Test Level (E2E/API/Component/Unit)
   - Priority assignment (P0-P3)
   - Risk linkage
   - Test count estimates

3. **Execution Order**
   - Smoke tests (P0 subset, <5 min)
   - P0 tests (critical paths, <10 min)
   - P1 tests (important features, <30 min)
   - P2/P3 tests (full regression, <60 min)

4. **Resource Estimates**
   - Hours per priority level
   - Total effort in days
   - Tooling and data prerequisites

5. **Quality Gate Criteria**
   - P0 pass rate: 100%
   - P1 pass rate: ≥95%
   - High-risk mitigations: 100%
   - Coverage target: ≥80%

## Key Features

### Risk Scoring Framework

**Probability × Impact = Risk Score**

**Probability** (1-3):

- 1 (Unlikely): <10% chance
- 2 (Possible): 10-50% chance
- 3 (Likely): >50% chance

**Impact** (1-3):

- 1 (Minor): Cosmetic, workaround exists
- 2 (Degraded): Feature impaired, difficult workaround
- 3 (Critical): System failure, no workaround

**Scores**:

- 1-2: Low risk (monitor)
- 3-4: Medium risk (plan mitigation)
- **6-9: High risk** (immediate mitigation required)
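
The scoring scheme above is small enough to sketch directly (note that a score of 5 cannot occur, since scores are products of values in 1-3):

```python
# Sketch of the probability x impact scoring above, with the banding
# taken from the Scores list. Illustrative only.
def risk_score(probability: int, impact: int) -> tuple[int, str]:
    assert probability in (1, 2, 3) and impact in (1, 2, 3)
    score = probability * impact
    if score >= 6:
        band = "High"    # immediate mitigation required
    elif score >= 3:
        band = "Medium"  # plan mitigation
    else:
        band = "Low"     # monitor
    return score, band

print(risk_score(2, 3))  # (6, 'High')
print(risk_score(2, 2))  # (4, 'Medium')
print(risk_score(1, 2))  # (2, 'Low')
```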

### Risk Categories (6 types)

**TECH** (Technical/Architecture):

- Architecture flaws, integration failures
- Scalability issues, technical debt

**SEC** (Security):

- Missing access controls, auth bypass
- Data exposure, injection vulnerabilities

**PERF** (Performance):

- SLA violations, response time degradation
- Resource exhaustion, scalability limits

**DATA** (Data Integrity):

- Data loss/corruption, inconsistent state
- Migration failures

**BUS** (Business Impact):

- UX degradation, business logic errors
- Revenue impact, compliance violations

**OPS** (Operations):

- Deployment failures, configuration errors
- Monitoring gaps, rollback issues

### Priority Classification (P0-P3)

**P0 (Critical)** - Run on every commit:

- Blocks core user journey
- High-risk (score ≥6)
- Revenue-impacting or security-critical

**P1 (High)** - Run on PR to main:

- Important user features
- Medium-risk (score 3-4)
- Common workflows

**P2 (Medium)** - Run nightly/weekly:

- Secondary features
- Low-risk (score 1-2)
- Edge cases

**P3 (Low)** - Run on-demand:

- Nice-to-have, exploratory
- Performance benchmarks
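
The tiers above can be sketched as a mapping from risk score to priority. This is a hedged simplification: real assignment also weighs journey criticality and revenue/security impact, which the optional flags below only approximate:

```python
# Hedged sketch: map a risk score to a priority tier per the classification
# above. The boolean flags approximate the non-score P0 criteria; they are
# an assumption, not the workflow's exact logic.
def priority_for(score: int, blocks_core_journey: bool = False,
                 security_critical: bool = False) -> str:
    if score >= 6 or blocks_core_journey or security_critical:
        return "P0"  # run on every commit
    if score >= 3:
        return "P1"  # run on PR to main
    if score >= 1:
        return "P2"  # run nightly/weekly
    return "P3"      # run on-demand (nice-to-have, exploratory)

print(priority_for(6))                            # P0
print(priority_for(4))                            # P1
print(priority_for(2))                            # P2
print(priority_for(2, blocks_core_journey=True))  # P0
```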

### Test Level Selection

**E2E (End-to-End)**:

- Critical user journeys
- Multi-system integration
- Highest confidence, slowest

**API (Integration)**:

- Service contracts
- Business logic validation
- Fast feedback, stable

**Component**:

- UI component behavior
- Visual regression
- Fast, isolated

**Unit**:

- Business logic, edge cases
- Error handling
- Fastest, most granular

**Key principle**: Avoid duplicate coverage - don't test the same behavior at multiple levels.

### Exploratory Mode (NEW - Phase 2.5)

**test-design** supports UI exploration for brownfield applications with missing documentation.

**Activation**: Automatic when requirements missing/incomplete for brownfield apps

- If config.tea_use_mcp_enhancements is true + MCP available → MCP-assisted exploration
- Otherwise → Manual exploration with user documentation

**When to Use Exploratory Mode:**

- ✅ Brownfield projects with missing documentation
- ✅ Legacy systems lacking requirements
- ✅ Undocumented features needing test coverage
- ✅ Unknown user journeys requiring discovery
- ❌ NOT for greenfield projects with clear requirements

**Exploration Modes:**

1. **MCP-Assisted Exploration** (if Playwright MCP available):
   - Interactive browser exploration using MCP tools
   - `planner_setup_page` - Initialize browser
   - `browser_navigate` - Explore pages
   - `browser_click` - Interact with UI elements
   - `browser_hover` - Reveal hidden menus
   - `browser_snapshot` - Capture state at each step
   - `browser_screenshot` - Document visually
   - `browser_console_messages` - Find JavaScript errors
   - `browser_network_requests` - Identify API endpoints

2. **Manual Exploration** (fallback without MCP):
   - User explores application manually
   - Documents findings in markdown:
     - Pages/features discovered
     - User journeys identified
     - API endpoints observed (DevTools Network)
     - JavaScript errors noted (DevTools Console)
     - Critical workflows mapped
   - Provides exploration findings to workflow

**Exploration Workflow:**

```
1. Enable exploratory_mode and set exploration_url
2. IF MCP available:
   - Use planner_setup_page to init browser
   - Explore UI with browser_* tools
   - Capture snapshots and screenshots
   - Monitor console and network
   - Document discoveries
3. IF MCP unavailable:
   - Notify user to explore manually
   - Wait for exploration findings
4. Convert discoveries to testable requirements
5. Continue with standard risk assessment (Step 2)
```

**Example Output from Exploratory Mode:**

```markdown
## Exploration Findings - Legacy Admin Panel

**Exploration URL**: https://admin.example.com
**Mode**: MCP-Assisted

### Discovered Features:

1. User Management (/admin/users)
   - List users (table with 10 columns)
   - Edit user (modal form)
   - Delete user (confirmation dialog)
   - Export to CSV (download button)

2. Reporting Dashboard (/admin/reports)
   - Date range picker
   - Filter by department
   - Generate PDF report
   - Email report to stakeholders

3. API Endpoints Discovered:
   - GET /api/admin/users
   - PUT /api/admin/users/:id
   - DELETE /api/admin/users/:id
   - POST /api/reports/generate

### User Journeys Mapped:

1. Admin deletes inactive user
   - Navigate to /admin/users
   - Click delete icon
   - Confirm in modal
   - User removed from table

2. Admin generates monthly report
   - Navigate to /admin/reports
   - Select date range (last month)
   - Click generate
   - Download PDF

### Risks Identified (from exploration):

- R-001 (SEC): No RBAC check observed (any admin can delete any user)
- R-002 (DATA): No confirmation on bulk delete
- R-003 (PERF): User table loads slowly (5s for 1000 rows)

**Next**: Proceed to risk assessment with discovered requirements
```

**Graceful Degradation:**

- Exploratory mode is OPTIONAL (default: disabled)
- Works without Playwright MCP (manual fallback)
- If exploration fails, you can disable the mode and provide requirements documentation
- Seamlessly transitions to the standard risk assessment workflow

### Knowledge Base Integration

Automatically consults TEA knowledge base:

- `risk-governance.md` - Risk classification framework
- `probability-impact.md` - Risk scoring methodology
- `test-levels-framework.md` - Test level selection
- `test-priorities-matrix.md` - P0-P3 prioritization

## Integration with Other Workflows

**Before test-design:**

- **prd** (Phase 2): Creates PRD and epics
- **architecture** (Phase 3): Defines technical approach
- **tech-spec** (Phase 3): Implementation details

**After test-design:**

- **atdd**: Generate failing tests for P0 scenarios
- **automate**: Expand coverage for P1/P2 scenarios
- **trace (Phase 2)**: Use quality gate criteria for release decisions

**Coordinates with:**

- **framework**: Test infrastructure must exist
- **ci**: Execution order maps to CI stages

**Updates:**

- `bmm-workflow-status.md`: Adds test design to Quality & Testing Progress

## Important Notes

### Evidence-Based Assessment

**Critical principle**: Base risk assessment on **evidence**, not speculation.

**Evidence sources:**

- PRD and user research
- Architecture documentation
- Historical bug data
- User feedback
- Security audit results

**When uncertain**: Document assumptions, request user clarification.

**Avoid**:

- Guessing business impact
- Assuming user behavior
- Inventing requirements

### Resource Estimation Formula

```
P0: 2 hours per test (setup + complex scenarios)
P1: 1 hour per test (standard coverage)
P2: 0.5 hours per test (simple scenarios)
P3: 0.25 hours per test (exploratory)

Total Days = Total Hours / 8
```

Example:

- 15 P0 × 2h = 30h
- 25 P1 × 1h = 25h
- 40 P2 × 0.5h = 20h
- **Total: 75 hours (~10 days)**
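
The formula above is mechanical, so a small helper can reproduce the worked example (a sketch using the per-tier rates as given):

```python
# Sketch of the estimation formula above, using the per-tier hourly rates
# as given. Returns (total hours, working days at 8 hours/day).
HOURS_PER_TEST = {"P0": 2.0, "P1": 1.0, "P2": 0.5, "P3": 0.25}

def estimate(counts: dict[str, int]) -> tuple[float, float]:
    hours = sum(HOURS_PER_TEST[tier] * n for tier, n in counts.items())
    return hours, hours / 8

hours, days = estimate({"P0": 15, "P1": 25, "P2": 40})
print(hours, round(days, 1))  # 75.0 9.4  (~10 days, as in the example)
```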

### Execution Order Strategy

**Smoke tests** (subset of P0, <5 min):

- Login successful
- Dashboard loads
- Core API responds

**Purpose**: Fast feedback, catch build-breaking issues immediately.

**P0 tests** (critical paths, <10 min):

- All scenarios blocking user journeys
- Security-critical flows

**P1 tests** (important features, <30 min):

- Common workflows
- Medium-risk areas

**P2/P3 tests** (full regression, <60 min):

- Edge cases
- Performance benchmarks

### Quality Gate Criteria

**Pass/Fail thresholds:**

- P0: 100% pass (no exceptions)
- P1: ≥95% pass (2-3 failures acceptable with waivers)
- P2/P3: ≥90% pass (informational)
- High-risk items: All mitigated or have approved waivers

**Coverage targets:**

- Critical paths: ≥80%
- Security scenarios: 100%
- Business logic: ≥70%
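
These thresholds translate to a simple blocking decision. A hedged sketch (pass rates as fractions; P2/P3 results are informational, so they are deliberately excluded from the blocking logic):

```python
# Hedged sketch of the pass/fail thresholds above; not the workflow's
# actual gate code. P2/P3 results are informational and do not block.
def gate_decision(p0_pass_rate: float, p1_pass_rate: float,
                  high_risks_mitigated: bool) -> str:
    if (p0_pass_rate == 1.0             # P0: 100% pass, no exceptions
            and p1_pass_rate >= 0.95    # P1: >=95% pass (waivers for the rest)
            and high_risks_mitigated):  # all high-risk items mitigated/waived
        return "PASS"
    return "FAIL"

print(gate_decision(1.0, 0.96, True))  # PASS
print(gate_decision(0.99, 1.0, True))  # FAIL (a single P0 failure blocks)
```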

## Validation Checklist

After workflow completion:

- [ ] Risk assessment complete (all categories)
- [ ] Risks scored (probability × impact)
- [ ] High-priority risks (≥6) flagged
- [ ] Coverage matrix maps requirements to test levels
- [ ] Priorities assigned (P0-P3)
- [ ] Execution order defined
- [ ] Resource estimates provided
- [ ] Quality gate criteria defined
- [ ] Output file created

Refer to `checklist.md` for comprehensive validation.

## Example Execution

**Scenario: E-commerce checkout epic**

```bash
bmad tea *test-design
# Epic 3: Checkout flow redesign

# Risk Assessment identifies:
# - R-001 (SEC): Payment bypass, P=2 × I=3 = 6 (HIGH)
# - R-002 (PERF): Cart load time, P=3 × I=2 = 6 (HIGH)
# - R-003 (BUS): Order confirmation email, P=2 × I=2 = 4 (MEDIUM)

# Coverage Plan:
# P0 scenarios: 12 tests (payment security, order creation)
# P1 scenarios: 18 tests (cart management, promo codes)
# P2 scenarios: 25 tests (edge cases, error handling)
# Total effort: 65 hours (~8 days)

# Test Levels:
# - E2E: 8 tests (critical checkout path)
# - API: 30 tests (business logic, payment processing)
# - Unit: 17 tests (calculations, validations)

# Execution Order:
# 1. Smoke: Payment successful, order created (2 min)
# 2. P0: All payment & security flows (8 min)
# 3. P1: Cart & promo codes (20 min)
# 4. P2: Edge cases (40 min)

# Quality Gates:
# - P0 pass rate: 100%
# - P1 pass rate: ≥95%
# - R-001 mitigated: Add payment validation layer
# - R-002 mitigated: Implement cart caching
```

## Troubleshooting

**Issue: "Unable to score risks - missing context"**

- **Cause**: Insufficient documentation
- **Solution**: Request PRD, architecture docs, or user clarification

**Issue: "All tests marked as P0"**

- **Cause**: Over-prioritization
- **Solution**: Apply strict P0 criteria (blocks core journey + high risk + no workaround)

**Issue: "Duplicate coverage at multiple test levels"**

- **Cause**: Not following test pyramid
- **Solution**: Use E2E for critical paths only, API for logic, unit for edge cases

**Issue: "Resource estimates too high"**

- **Cause**: Complex test setup or insufficient automation
- **Solution**: Invest in fixtures/factories upfront, reduce per-test setup time

## Related Workflows
|
||||
|
||||
- **atdd**: Generate failing tests → [atdd/README.md](../atdd/README.md)
|
||||
- **automate**: Expand regression coverage → [automate/README.md](../automate/README.md)
|
||||
- **trace**: Traceability and quality gate decisions → [trace/README.md](../trace/README.md)
|
||||
- **framework**: Test infrastructure → [framework/README.md](../framework/README.md)
|
||||
|
||||
## Version History
|
||||
|
||||
- **v4.0 (BMad v6)**: Pure markdown instructions, risk scoring framework, template-based output
|
||||
- **v3.x**: XML format instructions
|
||||
- **v2.x**: Legacy task-based approach
|
||||

---

# Test Quality Review Workflow

The Test Quality Review workflow performs comprehensive quality validation of test code using TEA's knowledge base of best practices. It detects flaky patterns, validates structure, and provides actionable feedback to improve test maintainability and reliability.

## Overview

This workflow reviews test quality against proven patterns from TEA's knowledge base, including fixture architecture, network-first safeguards, data factories, determinism, isolation, and flakiness prevention. It generates a quality score (0-100) with detailed feedback on violations and recommendations.

**Key Features:**

- **Knowledge-Based Review**: Applies patterns from 19+ knowledge fragments in tea-index.csv
- **Quality Scoring**: 0-100 score with letter grade (A+ to F) based on violations
- **Multi-Scope Review**: Single file, directory, or entire test suite
- **Pattern Detection**: Identifies hard waits, race conditions, shared state, conditionals
- **Best Practice Validation**: BDD format, test IDs, priorities, assertions, test length
- **Actionable Feedback**: Critical issues (must fix) vs recommendations (should fix)
- **Code Examples**: Every issue includes recommended fix with code snippets
- **Integration**: Works with story files, test-design, acceptance criteria context

---

## Usage

```bash
bmad tea *test-review
```

The TEA agent runs this workflow when:

- After `*atdd` workflow → validate generated acceptance tests
- After `*automate` workflow → ensure regression suite quality
- After developer writes tests → provide quality feedback
- Before `*gate` workflow → confirm test quality before release
- User explicitly requests review: `bmad tea *test-review`
- Periodic quality audits of existing test suite

**Typical workflow sequence:**

1. `*atdd` → Generate failing acceptance tests
2. **`*test-review`** → Validate test quality ⬅️ YOU ARE HERE (option 1)
3. `*dev story` → Implement feature with tests passing
4. **`*test-review`** → Review implementation tests ⬅️ YOU ARE HERE (option 2)
5. `*automate` → Expand regression suite
6. **`*test-review`** → Validate new regression tests ⬅️ YOU ARE HERE (option 3)
7. `*gate` → Final quality gate decision

---

## Inputs

### Required Context Files

- **Test File(s)**: One or more test files to review (auto-discovered or explicitly provided)
- **Test Framework Config**: playwright.config.ts, jest.config.js, etc. (for context)

### Recommended Context Files

- **Story File**: Acceptance criteria for context (e.g., `story-1.3.md`)
- **Test Design**: Priority context (P0/P1/P2/P3) from test-design.md
- **Knowledge Base**: tea-index.csv with best practice fragments (required for thorough review)

### Workflow Variables

Key variables that control review behavior (configured in `workflow.yaml`):

- **review_scope**: `single` | `directory` | `suite` (default: `single`)
  - `single`: Review one test file
  - `directory`: Review all tests in a directory
  - `suite`: Review entire test suite
- **quality_score_enabled**: Enable 0-100 quality scoring (default: `true`)
- **append_to_file**: Add inline comments to test files (default: `false`)
- **check_against_knowledge**: Use tea-index.csv fragments (default: `true`)
- **strict_mode**: Fail on any violation vs advisory only (default: `false`)

**Quality Criteria Flags** (all default to `true`):

- `check_given_when_then`: BDD format validation
- `check_test_ids`: Test ID conventions
- `check_priority_markers`: P0/P1/P2/P3 classification
- `check_hard_waits`: Detect sleep(), wait(X)
- `check_determinism`: No conditionals/try-catch abuse
- `check_isolation`: Tests clean up, no shared state
- `check_fixture_patterns`: Pure function → Fixture → mergeTests
- `check_data_factories`: Factory usage vs hardcoded data
- `check_network_first`: Route intercept before navigate
- `check_assertions`: Explicit assertions present
- `check_test_length`: Warn if >300 lines
- `check_test_duration`: Warn if >1.5 min
- `check_flakiness_patterns`: Common flaky patterns

---

## Outputs

### Primary Deliverable

**Test Quality Review Report** (`test-review-(unknown).md`):

- **Executive Summary**: Overall assessment, key strengths/weaknesses, recommendation
- **Quality Score**: 0-100 score with letter grade (A+ to F)
- **Quality Criteria Assessment**: Table with all criteria evaluated (PASS/WARN/FAIL)
- **Critical Issues**: P0/P1 violations that must be fixed
- **Recommendations**: P2/P3 violations that should be fixed
- **Best Practices Examples**: Good patterns found in tests
- **Knowledge Base References**: Links to detailed guidance

Each issue includes:

- Code location (file:line)
- Explanation of problem
- Recommended fix with code example
- Knowledge base fragment reference

### Secondary Outputs

- **Inline Comments**: TODO comments in test files at violation locations (if enabled)
- **Quality Badge**: Badge with score (e.g., "Test Quality: 87/100 (A)")
- **Story Update**: Test quality section appended to story file (if enabled)

### Validation Safeguards

- ✅ All knowledge base fragments loaded successfully
- ✅ Test files parsed and structure analyzed
- ✅ All enabled quality criteria evaluated
- ✅ Violations categorized by severity (P0/P1/P2/P3)
- ✅ Quality score calculated with breakdown
- ✅ Actionable feedback with code examples provided

---

## Quality Criteria Explained

### 1. BDD Format (Given-When-Then)

**PASS**: Tests use clear Given-When-Then structure

```typescript
// Given: User is logged in
const user = await createTestUser();
await loginPage.login(user.email, user.password);

// When: User navigates to dashboard
await page.goto('/dashboard');

// Then: User sees welcome message
await expect(page.locator('[data-testid="welcome"]')).toContainText(user.name);
```

**FAIL**: Tests lack structure, hard to understand intent

```typescript
await page.goto('/dashboard');
await page.click('.button');
await expect(page.locator('.text')).toBeVisible();
```

**Knowledge**: test-quality.md, tdd-cycles.md

---

### 2. Test IDs

**PASS**: All tests have IDs following convention

```typescript
test.describe('1.3-E2E-001: User Login Flow', () => {
  test('should log in successfully with valid credentials', async ({ page }) => {
    // Test implementation
  });
});
```

**FAIL**: No test IDs, can't trace to requirements

```typescript
test.describe('Login', () => {
  test('login works', async ({ page }) => {
    // Test implementation
  });
});
```

**Knowledge**: traceability.md, test-quality.md

---

### 3. Priority Markers

**PASS**: Tests classified as P0/P1/P2/P3

```typescript
test.describe('P0: Critical User Journey - Checkout', () => {
  // Critical tests
});

test.describe('P2: Edge Case - International Addresses', () => {
  // Nice-to-have tests
});
```

**Knowledge**: test-priorities.md, risk-governance.md

---

### 4. No Hard Waits

**PASS**: No sleep(), wait(), hardcoded delays

```typescript
// ✅ Good: Explicit wait for condition
await expect(page.locator('[data-testid="user-menu"]')).toBeVisible({ timeout: 10000 });
```

**FAIL**: Hard waits introduce flakiness

```typescript
// ❌ Bad: Hard wait
await page.waitForTimeout(2000);
await expect(page.locator('[data-testid="user-menu"]')).toBeVisible();
```

**Knowledge**: test-quality.md, network-first.md

---

### 5. Determinism

**PASS**: Tests work deterministically, no conditionals

```typescript
// ✅ Good: Deterministic test
await expect(page.locator('[data-testid="status"]')).toHaveText('Active');
```

**FAIL**: Conditionals make tests unpredictable

```typescript
// ❌ Bad: Conditional logic
const status = await page.locator('[data-testid="status"]').textContent();
if (status === 'Active') {
  await page.click('[data-testid="deactivate"]');
} else {
  await page.click('[data-testid="activate"]');
}
```

**Knowledge**: test-quality.md, data-factories.md

---

### 6. Isolation

**PASS**: Tests clean up, no shared state

```typescript
test.afterEach(async ({ page, testUser }) => {
  // Cleanup: Delete test user
  await api.deleteUser(testUser.id);
});
```

**FAIL**: Shared state, tests depend on order

```typescript
// ❌ Bad: Shared global variable
let userId: string;

test('create user', async () => {
  userId = await createUser(); // Sets global
});

test('update user', async () => {
  await updateUser(userId); // Depends on previous test
});
```

**Knowledge**: test-quality.md, data-factories.md

---

### 7. Fixture Patterns

**PASS**: Pure function → Fixture → mergeTests

```typescript
// ✅ Good: Pure function fixture
const createAuthenticatedPage = async (page: Page, user: User) => {
  await loginPage.login(user.email, user.password);
  return page;
};

const test = base.extend({
  authenticatedPage: async ({ page }, use) => {
    const user = createTestUser();
    const authedPage = await createAuthenticatedPage(page, user);
    await use(authedPage);
  },
});
```

**FAIL**: No fixtures, repeated setup

```typescript
// ❌ Bad: Repeated setup in every test
test('test 1', async ({ page }) => {
  await page.goto('/login');
  await page.fill('[name="email"]', 'test@example.com');
  await page.fill('[name="password"]', 'password123');
  await page.click('[type="submit"]');
  // Test logic
});
```

**Knowledge**: fixture-architecture.md

---

### 8. Data Factories

**PASS**: Factory functions with overrides

```typescript
// ✅ Good: Factory function
import { createTestUser } from './factories/user-factory';

test('user can update profile', async ({ page }) => {
  const user = createTestUser({ role: 'admin' });
  await api.createUser(user); // API-first setup
  // Test UI interaction
});
```

**FAIL**: Hardcoded test data

```typescript
// ❌ Bad: Magic strings
await page.fill('[name="email"]', 'test@example.com');
await page.fill('[name="phone"]', '555-1234');
```

**Knowledge**: data-factories.md
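
A factory of the shape the PASS example imports might look like the following sketch. The `User` fields and default values are illustrative assumptions, not the actual `user-factory` module:

```typescript
// Hypothetical user factory: sensible defaults, unique values per call,
// and caller overrides — the shape data-factories.md recommends.
interface User {
  email: string;
  phone: string;
  role: 'user' | 'admin';
}

let seq = 0;

function createTestUser(overrides: Partial<User> = {}): User {
  seq += 1;
  return {
    email: `user-${seq}@example.test`, // unique per call keeps tests isolated
    phone: '555-0100',
    role: 'user',
    ...overrides, // caller-supplied fields win
  };
}

console.log(createTestUser({ role: 'admin' }).role); // admin
```

Because every call mints a fresh email, two tests never collide on the same record.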

---

### 9. Network-First Pattern

**PASS**: Route intercept before navigate

```typescript
// ✅ Good: Intercept before navigation
await page.route('**/api/users', (route) => route.fulfill({ json: mockUsers }));
await page.goto('/users'); // Navigate after route setup
```

**FAIL**: Race condition risk

```typescript
// ❌ Bad: Navigate before intercept
await page.goto('/users');
await page.route('**/api/users', (route) => route.fulfill({ json: mockUsers })); // Too late!
```

**Knowledge**: network-first.md

---

### 10. Explicit Assertions

**PASS**: Clear, specific assertions

```typescript
await expect(page.locator('[data-testid="username"]')).toHaveText('John Doe');
await expect(page.locator('[data-testid="status"]')).toHaveClass(/active/);
```

**FAIL**: Missing or vague assertions

```typescript
await page.locator('[data-testid="username"]').isVisible(); // No assertion!
```

**Knowledge**: test-quality.md

---

### 11. Test Length

- **PASS**: ≤300 lines per file (ideal: ≤200)
- **WARN**: 301-500 lines (consider splitting)
- **FAIL**: >500 lines (too large)

**Knowledge**: test-quality.md

---

### 12. Test Duration

- **PASS**: ≤1.5 minutes per test (target: <30 seconds)
- **WARN**: 1.5-3 minutes (consider optimization)
- **FAIL**: >3 minutes (too slow)

**Knowledge**: test-quality.md, selective-testing.md

---

### 13. Flakiness Patterns

Common flaky patterns detected:

- Tight timeouts (e.g., `{ timeout: 1000 }`)
- Race conditions (navigation before route interception)
- Timing-dependent assertions
- Retry logic hiding flakiness
- Environment-dependent assumptions

**Knowledge**: test-quality.md, network-first.md, ci-burn-in.md

---

## Quality Scoring

### Score Calculation

```
Starting Score: 100

Deductions:
- Critical Violations (P0): -10 points each
- High Violations (P1): -5 points each
- Medium Violations (P2): -2 points each
- Low Violations (P3): -1 point each

Bonus Points (max +30):
+ Excellent BDD structure: +5
+ Comprehensive fixtures: +5
+ Comprehensive data factories: +5
+ Network-first pattern consistently used: +5
+ Perfect isolation (all tests clean up): +5
+ All test IDs present and correct: +5

Final Score: max(0, min(100, Starting Score - Violations + Bonus))
```
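
The formula can be expressed as a small function. This is a sketch with illustrative names, not the workflow's implementation:

```typescript
// Deduct per violation by severity, add the capped bonus, clamp to 0-100.
interface Violations {
  p0: number;
  p1: number;
  p2: number;
  p3: number;
}

function qualityScore(v: Violations, bonusPoints: number): number {
  const deductions = v.p0 * 10 + v.p1 * 5 + v.p2 * 2 + v.p3 * 1;
  const bonus = Math.min(30, bonusPoints); // bonus is capped at +30
  return Math.max(0, Math.min(100, 100 - deductions + bonus));
}

// Example: 2 P1 and 3 P2 violations with a +10 bonus → 100 - 16 + 10 = 94
console.log(qualityScore({ p0: 0, p1: 2, p2: 3, p3: 0 }, 10)); // 94
```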

### Quality Grades

- **90-100** (A+): Excellent - Production-ready, best practices followed
- **80-89** (A): Good - Minor improvements recommended
- **70-79** (B): Acceptable - Some issues to address
- **60-69** (C): Needs Improvement - Several issues detected
- **<60** (F): Critical Issues - Significant problems, not production-ready
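
The score-to-grade mapping, as a sketch (band edges taken from the list above; the function name is illustrative):

```typescript
// Map a 0-100 quality score to its letter grade per the bands above.
function letterGrade(score: number): string {
  if (score >= 90) return 'A+';
  if (score >= 80) return 'A';
  if (score >= 70) return 'B';
  if (score >= 60) return 'C';
  return 'F';
}

console.log(letterGrade(87)); // A — matches the badge example "87/100 (A)"
```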

---

## Example Scenarios

### Scenario 1: Excellent Quality (Score: 95)

```markdown
# Test Quality Review: checkout-flow.spec.ts

**Quality Score**: 95/100 (A+ - Excellent)
**Recommendation**: Approve - Production Ready

## Executive Summary

Excellent test quality with comprehensive coverage and best practices throughout.
Tests demonstrate expert-level patterns including fixture architecture, data
factories, network-first approach, and perfect isolation.

**Strengths:**
✅ Clear Given-When-Then structure in all tests
✅ Comprehensive fixtures for authenticated states
✅ Data factories with faker.js for realistic test data
✅ Network-first pattern prevents race conditions
✅ Perfect test isolation with cleanup
✅ All test IDs present (1.2-E2E-001 through 1.2-E2E-005)

**Minor Recommendations:**
⚠️ One test slightly verbose (245 lines) - consider extracting helper function

**Recommendation**: Approve without changes. Use as reference for other tests.
```

---

### Scenario 2: Good Quality (Score: 82)

```markdown
# Test Quality Review: user-profile.spec.ts

**Quality Score**: 82/100 (A - Good)
**Recommendation**: Approve with Comments

## Executive Summary

Solid test quality with good structure and coverage. A few improvements would
enhance maintainability and reduce flakiness risk.

**Strengths:**
✅ Good BDD structure
✅ Test IDs present
✅ Explicit assertions

**Issues to Address:**
⚠️ 2 hard waits detected (lines 34, 67) - use explicit waits instead
⚠️ Hardcoded test data (line 23) - use factory functions
⚠️ Missing cleanup in one test (line 89) - add afterEach hook

**Recommendation**: Address hard waits before merging. Other improvements
can be addressed in follow-up PR.
```

---

### Scenario 3: Needs Improvement (Score: 68)

```markdown
# Test Quality Review: legacy-report.spec.ts

**Quality Score**: 68/100 (C - Needs Improvement)
**Recommendation**: Request Changes

## Executive Summary

Test has several quality issues that should be addressed before merging.
Primarily concerns around flakiness risk and maintainability.

**Critical Issues:**
❌ 5 hard waits detected (flakiness risk)
❌ Race condition: navigation before route interception (line 45)
❌ Shared global state between tests (line 12)
❌ Missing test IDs (can't trace to requirements)

**Recommendations:**
⚠️ Test file is 487 lines - consider splitting
⚠️ Hardcoded data throughout - use factories
⚠️ Missing cleanup in afterEach

**Recommendation**: Address all critical issues (❌) before re-review.
Significant refactoring needed.
```

---

### Scenario 4: Critical Issues (Score: 42)

```markdown
# Test Quality Review: data-export.spec.ts

**Quality Score**: 42/100 (F - Critical Issues)
**Recommendation**: Block - Not Production Ready

## Executive Summary

CRITICAL: Test has severe quality issues that make it unsuitable for
production. Significant refactoring required.

**Critical Issues:**
❌ 12 hard waits (page.waitForTimeout) throughout
❌ No test IDs or structure
❌ Try/catch blocks swallowing errors (lines 23, 45, 67, 89)
❌ No cleanup - tests leave data in database
❌ Conditional logic (if/else) throughout tests
❌ No assertions in 3 tests (tests do nothing!)
❌ 687 lines - far too large
❌ Multiple race conditions
❌ Hardcoded credentials in plain text (SECURITY ISSUE)

**Recommendation**: BLOCK MERGE. Complete rewrite recommended following
TEA knowledge base patterns. Suggest pairing session with QA engineer.
```

---

## Integration with Other Workflows

### Before Test Review

1. **atdd** - Generates acceptance tests → TEA reviews for quality
2. **dev story** - Developer implements tests → TEA provides feedback
3. **automate** - Expands regression suite → TEA validates new tests

### After Test Review

1. **Developer** - Addresses critical issues, improves based on recommendations
2. **gate** - Test quality feeds into release decision (high-quality tests increase confidence)

### Coordinates With

- **Story File**: Review links to acceptance criteria for context
- **Test Design**: Review validates tests align with P0/P1/P2/P3 prioritization
- **Knowledge Base**: All feedback references tea-index.csv fragments

---

## Review Scopes

### Single File Review

```bash
# Review specific test file
bmad tea *test-review
# Provide test_file_path when prompted: tests/auth/login.spec.ts
```

**Use When:**

- Reviewing tests just written
- PR review of specific test file
- Debugging flaky test
- Learning test quality patterns

---

### Directory Review

```bash
# Review all tests in directory
bmad tea *test-review
# Provide review_scope: directory
# Provide test_dir: tests/auth/
```

**Use When:**

- Feature branch has multiple test files
- Reviewing entire feature test suite
- Auditing test quality for module

---

### Suite Review

```bash
# Review entire test suite
bmad tea *test-review
# Provide review_scope: suite
```

**Use When:**

- Periodic quality audit (monthly/quarterly)
- Before major release
- Identifying patterns across codebase
- Establishing quality baseline

---

## Configuration Examples

### Strict Review (Fail on Violations)

```yaml
review_scope: 'single'
quality_score_enabled: true
strict_mode: true # Fail if score <70
check_against_knowledge: true
# All check_* flags: true
```

Use for: PR gates, production releases

---

### Balanced Review (Advisory)

```yaml
review_scope: 'single'
quality_score_enabled: true
strict_mode: false # Advisory only
check_against_knowledge: true
# All check_* flags: true
```

Use for: Most development workflows (default)

---

### Focused Review (Specific Criteria)

```yaml
review_scope: 'single'
check_hard_waits: true
check_flakiness_patterns: true
check_network_first: true
# Other checks: false
```

Use for: Debugging flaky tests, targeted improvements

---

## Important Notes

1. **Non-Prescriptive**: Review provides guidance, not rigid rules
2. **Context Matters**: Some violations may be justified (document with comments)
3. **Knowledge-Based**: All feedback grounded in proven patterns
4. **Actionable**: Every issue includes recommended fix with code example
5. **Quality Score**: Use as indicator, not absolute measure
6. **Continuous Improvement**: Review tests periodically as patterns evolve
7. **Learning Tool**: Use reviews to learn best practices, not just find bugs

---

## Knowledge Base References

This workflow automatically consults:

- **test-quality.md** - Definition of Done (no hard waits, <300 lines, <1.5 min, self-cleaning)
- **fixture-architecture.md** - Pure function → Fixture → mergeTests pattern
- **network-first.md** - Route intercept before navigate (race condition prevention)
- **data-factories.md** - Factory functions with overrides, API-first setup
- **test-levels-framework.md** - E2E vs API vs Component vs Unit appropriateness
- **playwright-config.md** - Environment-based configuration patterns
- **tdd-cycles.md** - Red-Green-Refactor patterns
- **selective-testing.md** - Duplicate coverage detection
- **ci-burn-in.md** - Flakiness detection patterns
- **test-priorities.md** - P0/P1/P2/P3 classification framework
- **traceability.md** - Requirements-to-tests mapping

See `tea-index.csv` for complete knowledge fragment mapping.

---

## Troubleshooting

### Problem: Quality score seems too low

**Solution:**

- Review violation breakdown - focus on critical issues first
- Consider project context - some patterns may be justified
- Check if criteria are appropriate for project type
- Score is indicator, not absolute - focus on actionable feedback

---

### Problem: No test files found

**Solution:**

- Verify test_dir path is correct
- Check test file extensions (`*.spec.ts`, `*.test.js`, etc.)
- Use a glob pattern to discover tests: `tests/**/*.spec.ts`

---

### Problem: Knowledge fragments not loading

**Solution:**

- Verify tea-index.csv exists in testarch/ directory
- Check fragment file paths are correct in tea-index.csv
- Ensure auto_load_knowledge: true in workflow variables

---

### Problem: Too many false positives

**Solution:**

- Add justification comments in code for legitimate violations
- Adjust `check_*` flags to disable specific criteria
- Use strict_mode: false for advisory-only feedback
- Context matters - document why pattern is appropriate

---

## Related Commands

- `bmad tea *atdd` - Generate acceptance tests (review after generation)
- `bmad tea *automate` - Expand regression suite (review new tests)
- `bmad tea *gate` - Quality gate decision (test quality feeds into decision)
- `bmad dev story` - Implement story (review tests after implementation)

---

# Requirements Traceability & Quality Gate Workflow

**Workflow ID:** `testarch-trace`
**Agent:** Test Architect (TEA)
**Command:** `bmad tea *trace`

---

## Overview

The **trace** workflow operates in two sequential phases to validate test coverage and deployment readiness:

**PHASE 1 - REQUIREMENTS TRACEABILITY:** Generates a comprehensive requirements-to-tests traceability matrix that maps acceptance criteria to implemented tests, identifies coverage gaps, and provides actionable recommendations.

**PHASE 2 - QUALITY GATE DECISION:** Makes deterministic release decisions (PASS/CONCERNS/FAIL/WAIVED) based on traceability results, test execution evidence, and non-functional requirements validation.

**Key Features:**

- Maps acceptance criteria to specific test cases across all levels (E2E, API, Component, Unit)
- Classifies coverage status (FULL, PARTIAL, NONE, UNIT-ONLY, INTEGRATION-ONLY)
- Prioritizes gaps by risk level (P0/P1/P2/P3)
- Applies deterministic decision rules for deployment readiness
- Generates gate decisions with evidence and rationale
- Supports waivers for business-approved exceptions
- Updates workflow status and notifies stakeholders
- Creates CI/CD-ready YAML snippets for quality gates
- Detects duplicate coverage across test levels
- Verifies test quality (assertions, structure, performance)

---

## When to Use This Workflow

Use `*trace` when you need to:

### Phase 1 - Traceability

- ✅ Validate that all acceptance criteria have test coverage
- ✅ Identify coverage gaps before release or PR merge
- ✅ Generate traceability documentation for compliance or audits
- ✅ Ensure critical paths (P0/P1) are fully tested
- ✅ Detect duplicate coverage across test levels
- ✅ Assess test quality across your suite

### Phase 2 - Gate Decision (Optional)

- ✅ Make final go/no-go deployment decision
- ✅ Validate test execution results against thresholds
- ✅ Evaluate non-functional requirements (security, performance)
- ✅ Generate audit trail for release approval
- ✅ Handle business waivers for critical deadlines
- ✅ Notify stakeholders of gate decision

**Typical Timing:**

- After tests are implemented (post-ATDD or post-development)
- Before merging a PR (validate P0/P1 coverage)
- Before release (validate full coverage and make gate decision)
- During sprint retrospectives (assess test quality)

---

## Prerequisites

### Phase 1 - Traceability (Required)

- Acceptance criteria (from story file OR inline)
- Implemented test suite (or acknowledged gaps)

### Phase 2 - Gate Decision (Required if `enable_gate_decision: true`)

- Test execution results (CI/CD test reports, pass/fail rates)
- Test design with risk priorities (P0/P1/P2/P3)

### Recommended

- `test-design.md` - Risk assessment and test priorities
- `nfr-assessment.md` - Non-functional requirements validation (for release gates)
- `tech-spec.md` - Technical implementation details
- Test framework configuration (playwright.config.ts, jest.config.js)

**Halt Conditions:**

- Story lacks any tests AND gaps are not acknowledged → Run `*atdd` first
- Acceptance criteria are completely missing → Provide criteria or story file
- Phase 2 enabled but test execution results missing → Warn and skip gate decision

---

## Usage

### Basic Usage (Both Phases)

```bash
bmad tea *trace
```

The workflow will:

1. **Phase 1**: Read story file, extract acceptance criteria, auto-discover tests, generate traceability matrix
2. **Phase 2**: Load test execution results, apply decision rules, generate gate decision document
3. Save traceability matrix to `bmad/output/traceability-matrix.md`
4. Save gate decision to `bmad/output/gate-decision-story-X.X.md`

### Phase 1 Only (Skip Gate Decision)

```bash
bmad tea *trace --enable-gate-decision false
```

### Custom Configuration

```bash
bmad tea *trace \
  --story-file "bmad/output/story-1.3.md" \
  --test-results "ci-artifacts/test-report.xml" \
  --min-p0-coverage 100 \
  --min-p1-coverage 90 \
  --min-p0-pass-rate 100 \
  --min-p1-pass-rate 95
```

### Standalone Mode (No Story File)

```bash
bmad tea *trace --acceptance-criteria "AC-1: User can login with email..."
```

---
|
||||
|
||||
## Workflow Steps
|
||||
|
||||
### PHASE 1: Requirements Traceability
|
||||
|
||||
1. **Load Context** - Read story, test design, tech spec, knowledge base
|
||||
2. **Discover Tests** - Auto-find tests related to story (by ID, describe blocks, file paths)
|
||||
3. **Map Criteria** - Link acceptance criteria to specific test cases
|
||||
4. **Analyze Gaps** - Identify missing coverage and prioritize by risk
|
||||
5. **Verify Quality** - Check test quality (assertions, structure, performance)
|
||||
6. **Generate Deliverables** - Create traceability matrix, gate YAML, coverage badge
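
Step 2's auto-discovery can be sketched as a simple scan. This is an illustrative sketch, not the workflow's actual implementation; it assumes tests embed IDs following the `{STORY_ID}-{LEVEL}-{SEQ}` convention used throughout this guide:

```python
import re
from pathlib import Path

# Illustrative sketch only - assumes test files mention IDs like "1.3-E2E-001".
def discover_tests(test_dir: str, story_id: str) -> dict[str, list[str]]:
    """Map each test file under test_dir to the story test IDs it mentions."""
    id_re = re.compile(rf"{re.escape(story_id)}-(?:E2E|API|COMPONENT|UNIT)-\d{{3}}")
    found: dict[str, list[str]] = {}
    for path in Path(test_dir).rglob("*"):
        if not path.is_file():
            continue
        ids = sorted(set(id_re.findall(path.read_text(errors="ignore"))))
        if ids:
            found[str(path)] = ids
    return found
```

Files with no ID matches fall back to describe-block and filename matching in the workflow itself (see `map_by_describe` and `map_by_filename` in the configuration below).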

### PHASE 2: Quality Gate Decision (if `enable_gate_decision: true`)

7. **Gather Evidence** - Load traceability results, test execution reports, NFR assessments
8. **Apply Decision Rules** - Evaluate against thresholds (PASS/CONCERNS/FAIL/WAIVED)
9. **Document Decision** - Create gate decision document with evidence and rationale
10. **Update Status & Notify** - Append to bmm-workflow-status.md, notify stakeholders

---

## Outputs

### Phase 1: Traceability Matrix (`traceability-matrix.md`)

Comprehensive markdown file with:

- Coverage summary table (by priority)
- Detailed criterion-to-test mapping
- Gap analysis with recommendations
- Quality assessment for each test
- Gate YAML snippet

**Example:**

```markdown
# Traceability Matrix - Story 1.3

## Coverage Summary

| Priority | Total | FULL | Coverage % | Status  |
| -------- | ----- | ---- | ---------- | ------- |
| P0       | 3     | 3    | 100%       | ✅ PASS |
| P1       | 5     | 4    | 80%        | ⚠️ WARN |

Gate Status: CONCERNS ⚠️ (P1 coverage below 90%)
```

### Phase 2: Gate Decision Document (`gate-decision-{type}-{id}.md`)

**Decision Document** with:

- **Decision**: PASS / CONCERNS / FAIL / WAIVED with clear rationale
- **Evidence Summary**: Test results, coverage, NFRs, quality validation
- **Decision Criteria Table**: Each criterion with threshold, actual, status
- **Rationale**: Explanation of decision based on evidence
- **Residual Risks**: Unresolved issues (for CONCERNS/WAIVED)
- **Waiver Details**: Approver, justification, remediation plan (for WAIVED)
- **Next Steps**: Action items for each decision type

**Example:**

```markdown
# Quality Gate Decision: Story 1.3 - User Login

**Decision**: ⚠️ CONCERNS
**Date**: 2025-10-15

## Decision Criteria

| Criterion    | Threshold | Actual | Status  |
| ------------ | --------- | ------ | ------- |
| P0 Coverage  | ≥100%     | 100%   | ✅ PASS |
| P1 Coverage  | ≥90%      | 88%    | ⚠️ FAIL |
| Overall Pass | ≥90%      | 96%    | ✅ PASS |

**Decision**: CONCERNS (P1 coverage 88% below 90% threshold)

## Next Steps

- Deploy with monitoring
- Create follow-up story for AC-5 test
```

### Secondary Outputs

- **Gate YAML**: Machine-readable snippet for CI/CD integration
- **Status Update**: Appends decision to `bmm-workflow-status.md` history
- **Stakeholder Notification**: Auto-generated summary message
- **Updated Story File**: Traceability section added (optional)

---

## Decision Logic (Phase 2)

### PASS Decision ✅

**All criteria met:**

- ✅ P0 coverage ≥ 100%
- ✅ P1 coverage ≥ 90%
- ✅ Overall coverage ≥ 80%
- ✅ P0 test pass rate = 100%
- ✅ P1 test pass rate ≥ 95%
- ✅ Overall test pass rate ≥ 90%
- ✅ Security issues = 0
- ✅ Critical NFR failures = 0

**Action:** Deploy to production with standard monitoring

---

### CONCERNS Decision ⚠️

**P0 criteria met, but P1 criteria degraded:**

- ✅ P0 coverage = 100%
- ⚠️ P1 coverage 80-89% (below 90% threshold)
- ⚠️ P1 test pass rate 90-94% (below 95% threshold)
- ✅ No security issues
- ✅ No critical NFR failures

**Residual Risks:** Minor P1 issues, edge cases, non-critical gaps

**Action:** Deploy with enhanced monitoring, create backlog stories for fixes

**Note:** CONCERNS does NOT block deployment but requires acknowledgment

---

### FAIL Decision ❌

**Any blocking criterion failed:**

- ❌ P0 coverage <100% (missing critical tests)
- OR ❌ P0 test pass rate <100% (failing critical tests)
- OR ❌ P1 coverage <80% (significant gap)
- OR ❌ Security issues >0
- OR ❌ Critical NFR failures >0

**Critical Blockers:** P0 test failures, security vulnerabilities, critical NFRs

**Action:** Block deployment, fix critical issues, re-run gate after fixes

---

### WAIVED Decision 🔓

**FAIL status + business-approved waiver:**

- ❌ Original decision: FAIL
- 🔓 Waiver approved by: {VP Engineering / CTO / Product Owner}
- 📋 Business justification: {regulatory deadline, contractual obligation}
- 📅 Waiver expiry: {date - does NOT apply to future releases}
- 🔧 Remediation plan: {fix in next release, due date}

**Action:** Deploy with business approval, aggressive monitoring, fix ASAP

**Important:** Waivers NEVER apply to P0 security issues or data corruption risks
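
The rules above can be condensed into one deterministic function. A minimal sketch using this guide's default thresholds (hard-coded here for brevity; the workflow reads them from `workflow.yaml`), with WAIVED modeled as an explicit business override rather than something derived from the metrics:

```python
# Sketch of the deterministic gate rules with default thresholds.
def gate_decision(m: dict, waiver_approved: bool = False) -> str:
    fail = (
        m["p0_coverage"] < 100
        or m["p0_pass_rate"] < 100
        or m["p1_coverage"] < 80
        or m["security_issues"] > 0
        or m["critical_nfr_failures"] > 0
    )
    if fail:
        # Security issues are never waivable (see WAIVED rules above).
        waivable = m["security_issues"] == 0
        return "WAIVED" if (waiver_approved and waivable) else "FAIL"
    concerns = (
        m["p1_coverage"] < 90
        or m["p1_pass_rate"] < 95
        or m["overall_coverage"] < 80
        or m["overall_pass_rate"] < 90
    )
    return "CONCERNS" if concerns else "PASS"
```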

---

## Coverage Classifications (Phase 1)

- **FULL** ✅ - All scenarios validated at appropriate level(s)
- **PARTIAL** ⚠️ - Some coverage but missing edge cases or levels
- **NONE** ❌ - No test coverage at any level
- **UNIT-ONLY** ⚠️ - Only unit tests (missing integration/E2E validation)
- **INTEGRATION-ONLY** ⚠️ - Only API/Component tests (missing unit confidence)
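
A sketch of how these classifications can be derived from the set of levels covering a criterion (level names follow the `coverage_levels` configuration: `e2e,api,component,unit`). The `expected` set of "appropriate" levels varies per criterion and is an assumption supplied by the caller:

```python
# Illustrative sketch of the classification rules above.
def classify(covered: set[str], expected: set[str]) -> str:
    if not covered:
        return "NONE"
    if expected <= covered:
        return "FULL"
    if covered == {"unit"}:
        return "UNIT-ONLY"
    if covered <= {"api", "component"}:
        return "INTEGRATION-ONLY"
    return "PARTIAL"
```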

---

## Quality Gates

| Priority | Coverage Requirement | Pass Rate Requirement | Severity | Action             |
| -------- | -------------------- | --------------------- | -------- | ------------------ |
| P0       | 100%                 | 100%                  | BLOCKER  | Do not release     |
| P1       | 90%                  | 95%                   | HIGH     | Block PR merge     |
| P2       | 80% (recommended)    | 85% (recommended)     | MEDIUM   | Address in nightly |
| P3       | No requirement       | No requirement        | LOW      | Optional           |

---

## Configuration

### workflow.yaml Variables

```yaml
variables:
  # Target specification
  story_file: '' # Path to story markdown
  acceptance_criteria: '' # Inline criteria if no story

  # Test discovery
  test_dir: '{project-root}/tests'
  auto_discover_tests: true

  # Traceability configuration
  coverage_levels: 'e2e,api,component,unit'
  map_by_test_id: true
  map_by_describe: true
  map_by_filename: true

  # Gap analysis
  prioritize_by_risk: true
  suggest_missing_tests: true
  check_duplicate_coverage: true

  # Output configuration
  output_file: '{output_folder}/traceability-matrix.md'
  generate_gate_yaml: true
  generate_coverage_badge: true
  update_story_file: true

  # Quality gates (Phase 1 recommendations)
  min_p0_coverage: 100
  min_p1_coverage: 90
  min_overall_coverage: 80

  # PHASE 2: Gate Decision Variables
  enable_gate_decision: true # Run gate decision after traceability

  # Gate target specification
  gate_type: 'story' # story | epic | release | hotfix

  # Gate decision configuration
  decision_mode: 'deterministic' # deterministic | manual
  allow_waivers: true
  require_evidence: true

  # Input sources for gate
  nfr_file: '' # Path to nfr-assessment.md (optional)
  test_results: '' # Path to test execution results (required for Phase 2)

  # Decision criteria thresholds
  min_p0_pass_rate: 100
  min_p1_pass_rate: 95
  min_overall_pass_rate: 90
  max_critical_nfrs_fail: 0
  max_security_issues: 0

  # Risk tolerance
  allow_p2_failures: true
  allow_p3_failures: true
  escalate_p1_failures: true

  # Gate output configuration
  gate_output_file: '{output_folder}/gate-decision-{gate_type}-{story_id}.md'
  append_to_history: true
  notify_stakeholders: true

  # Advanced gate options
  check_all_workflows_complete: true
  validate_evidence_freshness: true
  require_sign_off: false
```

---

## Knowledge Base Integration

This workflow automatically loads relevant knowledge fragments:

**Phase 1 (Traceability):**

- `traceability.md` - Requirements mapping patterns
- `test-priorities.md` - P0/P1/P2/P3 risk framework
- `risk-governance.md` - Risk-based testing approach
- `test-quality.md` - Definition of Done for tests
- `selective-testing.md` - Duplicate coverage patterns

**Phase 2 (Gate Decision):**

- `risk-governance.md` - Quality gate criteria and decision framework
- `probability-impact.md` - Risk scoring for residual risks
- `test-quality.md` - Quality standards validation
- `test-priorities.md` - Priority classification framework

---

## Example Scenarios

### Example 1: Full Coverage with Gate PASS

```bash
# Validate coverage and make gate decision
bmad tea *trace --story-file "bmad/output/story-1.3.md" \
  --test-results "ci-artifacts/test-report.xml"
```

**Phase 1 Output:**

```markdown
# Traceability Matrix - Story 1.3

## Coverage Summary

| Priority | Total | FULL | Coverage % | Status  |
| -------- | ----- | ---- | ---------- | ------- |
| P0       | 3     | 3    | 100%       | ✅ PASS |
| P1       | 5     | 5    | 100%       | ✅ PASS |

Gate Status: Ready for Phase 2 ✅
```

**Phase 2 Output:**

```markdown
# Quality Gate Decision: Story 1.3

**Decision**: ✅ PASS

Evidence:

- P0 Coverage: 100% ✅
- P1 Coverage: 100% ✅
- P0 Pass Rate: 100% (12/12 tests) ✅
- P1 Pass Rate: 98% (45/46 tests) ✅
- Overall Pass Rate: 96% ✅

Next Steps:

1. Deploy to staging
2. Monitor for 24 hours
3. Deploy to production
```

---

### Example 2: Gap Identification with CONCERNS Decision

```bash
# Find gaps and evaluate readiness
bmad tea *trace --story-file "bmad/output/story-2.1.md" \
  --test-results "ci-artifacts/test-report.xml"
```

**Phase 1 Output:**

```markdown
## Gap Analysis

### Critical Gaps (BLOCKER)

- None ✅

### High Priority Gaps (PR BLOCKER)

1. **AC-3: Password reset email edge cases**
   - Recommend: Add 2.1-API-001 (email service integration)
   - Impact: Users may not recover accounts in error scenarios
```

**Phase 2 Output:**

```markdown
# Quality Gate Decision: Story 2.1

**Decision**: ⚠️ CONCERNS

Evidence:

- P0 Coverage: 100% ✅
- P1 Coverage: 88% ⚠️ (below 90%)
- Test Pass Rate: 96% ✅

Residual Risks:

- AC-3 missing integration test for email error handling

Next Steps:

- Deploy with monitoring
- Create follow-up story for AC-3 test
- Monitor production for edge cases
```

---

### Example 3: Critical Blocker with FAIL Decision

```bash
# Critical issues detected
bmad tea *trace --story-file "bmad/output/story-3.2.md" \
  --test-results "ci-artifacts/test-report.xml"
```

**Phase 1 Output:**

```markdown
## Gap Analysis

### Critical Gaps (BLOCKER)

1. **AC-2: Invalid login security validation**
   - Priority: P0
   - Status: NONE (no tests)
   - Impact: Security vulnerability - users can bypass login
```

**Phase 2 Output:**

```markdown
# Quality Gate Decision: Story 3.2

**Decision**: ❌ FAIL

Critical Blockers:

- P0 Coverage: 80% ❌ (AC-2 missing)
- Security Risk: Login bypass vulnerability

Next Steps:

1. BLOCK DEPLOYMENT IMMEDIATELY
2. Add P0 test for AC-2: 3.2-E2E-004
3. Re-run full test suite
4. Re-run gate after fixes verified
```

---

### Example 4: Business Override with WAIVED Decision

```bash
# FAIL with business waiver
bmad tea *trace --story-file "bmad/output/release-2.4.0.md" \
  --test-results "ci-artifacts/test-report.xml" \
  --allow-waivers true
```

**Phase 2 Output:**

```markdown
# Quality Gate Decision: Release 2.4.0

**Original Decision**: ❌ FAIL
**Final Decision**: 🔓 WAIVED

Waiver Details:

- Approver: Jane Doe, VP Engineering
- Reason: GDPR compliance deadline (regulatory, Oct 15)
- Expiry: 2025-10-15 (does NOT apply to v2.5.0)
- Monitoring: Enhanced error tracking
- Remediation: Fix in v2.4.1 hotfix (due Oct 20)

Business Justification:
Release contains critical GDPR features required by law. Failed test affects legacy feature used by <1% of users. Workaround available.

Next Steps:

1. Deploy v2.4.0 with waiver approval
2. Monitor error rates aggressively
3. Fix issue in v2.4.1 (Oct 20)
```

---

## Troubleshooting

### Phase 1 Issues

#### "No tests found for this story"

- Run `*atdd` workflow first to generate failing acceptance tests
- Check test file naming conventions (may not match story ID pattern)
- Verify test directory path is correct (`test_dir` variable)

#### "Cannot determine coverage status"

- Tests may lack explicit mapping (no test IDs, unclear describe blocks)
- Add test IDs: `{STORY_ID}-{LEVEL}-{SEQ}` (e.g., `1.3-E2E-001`)
- Use Given-When-Then narrative in test descriptions
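
For example, a hypothetical pytest test carrying an explicit ID and a Given-When-Then narrative (the same convention applies to Jest/Playwright `describe`/`it` titles); `is_valid_email` is a stand-in for real application code:

```python
import re

def is_valid_email(value: str) -> bool:
    # Stand-in for application code so the example is self-contained.
    return re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", value) is not None

class TestStory13Login:
    def test_1_3_unit_001_rejects_malformed_email(self):
        """1.3-UNIT-001 | Given a malformed email, when validated, then login is rejected."""
        assert is_valid_email("not-an-email") is False
```

The ID in the test name and docstring is what lets auto-discovery map the test back to its acceptance criterion without guesswork.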

#### "P0 coverage below 100%"

- This is a **BLOCKER** - do not release
- Identify missing P0 tests in gap analysis
- Run `*atdd` workflow to generate missing tests
- Verify P0 classification is correct with stakeholders

#### "Duplicate coverage detected"

- Review `selective-testing.md` knowledge fragment
- Determine if overlap is acceptable (defense in depth) or wasteful
- Consolidate tests at appropriate level (logic → unit, journey → E2E)

### Phase 2 Issues

#### "Test execution results missing"

- Phase 2 gate decision requires `test_results` (CI/CD test reports)
- If missing, Phase 2 will be skipped with warning
- Provide JUnit XML, TAP, or JSON test report path via `test_results` variable

#### "Gate decision is FAIL but deployment needed urgently"

- Request business waiver (if `allow_waivers: true`)
- Document approver, justification, mitigation plan
- Create follow-up stories to address gaps
- Use WAIVED decision only for non-P0 gaps
- **Never waive**: Security issues, data corruption risks

#### "Assessments are stale (>7 days old)"

- Re-run `*test-design` workflow
- Re-run traceability (Phase 1)
- Re-run `*nfr-assess` workflow
- Update evidence files before gate decision

#### "Unclear decision (edge case)"

- Switch to manual mode: `decision_mode: manual`
- Document assumptions and rationale clearly
- Escalate to tech lead or architect for guidance
- Consider waiver if business-critical

---

## Integration with Other Workflows

### Before Trace

1. **testarch-test-design** - Define test priorities (P0/P1/P2/P3)
2. **testarch-atdd** - Generate failing acceptance tests
3. **testarch-automate** - Expand regression suite

### After Trace (Phase 2 Decision)

- **PASS**: Proceed to deployment workflow
- **CONCERNS**: Deploy with monitoring, create remediation backlog stories
- **FAIL**: Block deployment, fix issues, re-run trace workflow
- **WAIVED**: Deploy with business approval, escalate monitoring

### Complements

- `*trace` → **testarch-nfr-assess** - Use NFR validation in gate decision
- `*trace` → **testarch-test-review** - Flag quality issues for review
- **CI/CD Pipeline** - Use gate YAML for automated quality gates

---

## Best Practices

### Phase 1 - Traceability

1. **Run Trace After Test Implementation**
   - Don't run `*trace` before tests exist (run `*atdd` first)
   - Trace is most valuable after initial test suite is written

2. **Prioritize by Risk**
   - P0 gaps are BLOCKERS (must fix before release)
   - P1 gaps are HIGH priority (block PR merge)
   - P2 gaps are MEDIUM priority (address in nightly runs)
   - P3 gaps are acceptable (fix if time permits)

3. **Explicit Mapping**
   - Use test IDs (`1.3-E2E-001`) for clear traceability
   - Reference criteria in describe blocks
   - Use Given-When-Then narrative

4. **Avoid Duplicate Coverage**
   - Test each behavior at appropriate level only
   - Unit tests for logic, E2E for journeys
   - Only overlap for defense in depth on critical paths

### Phase 2 - Gate Decision

5. **Evidence is King**
   - Never make gate decisions without fresh test results
   - Validate evidence freshness (<7 days old)
   - Link to all evidence sources (reports, logs, artifacts)
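
The freshness rule can be sketched as a simple mtime check. This is an illustration, not the workflow's implementation; real runs may record timestamps inside the documents instead of relying on the filesystem:

```python
import time
from pathlib import Path

# Sketch: flag evidence files whose modification time is older than max_age_days.
def stale_evidence(paths: list[str], max_age_days: int = 7) -> list[str]:
    cutoff = time.time() - max_age_days * 86400
    return [p for p in paths if Path(p).stat().st_mtime < cutoff]
```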

6. **P0 is Sacred**
   - P0 failures ALWAYS result in FAIL (no exceptions except waivers)
   - P0 = Critical user journeys, security, data integrity
   - Waivers require VP/CTO approval + business justification

7. **Waivers are Temporary**
   - Waiver applies ONLY to specific release
   - Issue must be fixed in next release
   - Never waive: security, data corruption, compliance violations

8. **CONCERNS is Not PASS**
   - CONCERNS means "deploy with monitoring"
   - Create follow-up stories for issues
   - Do not ignore CONCERNS repeatedly

9. **Automate Gate Integration**
   - Enable `generate_gate_yaml` for CI/CD integration
   - Use YAML snippets in pipeline quality gates
   - Export metrics for dashboard visualization
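
In a pipeline, the generated YAML can gate the build. The snippet's exact schema is workflow-defined; the sketch below assumes only a top-level `decision:` key, failing the build on FAIL while letting CONCERNS pass (it should still be surfaced to reviewers):

```python
# Hypothetical CI helper; assumes the gate YAML exposes a "decision:" key.
def ci_exit_code(gate_yaml: str) -> int:
    """Return a nonzero exit code when the gate decision is FAIL."""
    decision = ""
    for line in gate_yaml.splitlines():
        if line.strip().startswith("decision:"):
            decision = line.split(":", 1)[1].strip().upper()
    return 1 if decision == "FAIL" else 0
```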

---

## Configuration Examples

### Strict Gate (Zero Tolerance)

```yaml
min_p0_coverage: 100
min_p1_coverage: 100
min_overall_coverage: 90
min_p0_pass_rate: 100
min_p1_pass_rate: 100
min_overall_pass_rate: 95
allow_waivers: false
max_security_issues: 0
max_critical_nfrs_fail: 0
```

Use for: Financial systems, healthcare, security-critical features

---

### Balanced Gate (Production Standard - Default)

```yaml
min_p0_coverage: 100
min_p1_coverage: 90
min_overall_coverage: 80
min_p0_pass_rate: 100
min_p1_pass_rate: 95
min_overall_pass_rate: 90
allow_waivers: true
max_security_issues: 0
max_critical_nfrs_fail: 0
```

Use for: Most production releases

---

### Relaxed Gate (Early Development)

```yaml
min_p0_coverage: 100
min_p1_coverage: 80
min_overall_coverage: 70
min_p0_pass_rate: 100
min_p1_pass_rate: 85
min_overall_pass_rate: 80
allow_waivers: true
allow_p2_failures: true
allow_p3_failures: true
```

Use for: Alpha/beta releases, internal tools, proof-of-concept

---

## Related Commands

- `bmad tea *test-design` - Define test priorities and risk assessment
- `bmad tea *atdd` - Generate failing acceptance tests for gaps
- `bmad tea *automate` - Expand regression suite based on gaps
- `bmad tea *nfr-assess` - Validate non-functional requirements (for gate)
- `bmad tea *test-review` - Review test quality issues flagged by trace
- `bmad sm story-done` - Mark story as complete (triggers gate)

---

## Resources

- [Instructions](./instructions.md) - Detailed workflow steps (both phases)
- [Checklist](./checklist.md) - Validation checklist
- [Template](./trace-template.md) - Traceability matrix template
- [Knowledge Base](../../testarch/knowledge/) - Testing best practices

---

<!-- Powered by BMAD-CORE™ -->