feat: add universal document sharding support with dual-strategy loading

Implement a document sharding system across all BMM workflows. Selective loading in Phase 4 enables 90%+ token savings on large multi-epic projects.

## Document Sharding System

### Core Features
- **Universal Support**: All 12 BMM workflows (Phase 1-4) handle both whole and sharded documents
- **Dual Loading Strategy**: Full Load (Phase 1-3, plus sprint-planning) vs Selective Load (remaining Phase 4 workflows)
- **Automatic Discovery**: Workflows detect the format transparently; the whole document takes priority over the sharded version
- **Efficiency Optimization**: 90%+ token reduction for 10+ epic projects in Phase 4

### Implementation Details

**Phase 1-3 Workflows (7 workflows) - Full Load Strategy:**
- product-brief, prd, gdd, create-ux-design, tech-spec, architecture, solutioning-gate-check
- Load entire sharded documents when present
- Transparent to user experience
- Better organization for large projects
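
A minimal sketch of the Full Load strategy, assuming the sharded `index.md` lists its section files as relative markdown links. That index layout, the `LINK` pattern, and the function name `load_sharded_document` are illustrative assumptions, not taken from the workflow engine:

```python
import re
from pathlib import Path

# Assumption: index.md references sections as [Title](section-file.md).
LINK = re.compile(r"\[[^\]]*\]\(([^)#]+\.md)\)")


def load_sharded_document(index_path: Path) -> str:
    """Read every section file linked from index.md and join them in order."""
    index_text = index_path.read_text(encoding="utf-8")
    parts = []
    for relative_link in LINK.findall(index_text):
        section = index_path.parent / relative_link
        if section.is_file():  # skip links that point at missing files
            parts.append(section.read_text(encoding="utf-8"))
    # Treat the combined content as a single document.
    return "\n\n".join(parts)
```

Joining the sections in index order keeps the combined text close to what the whole document would have contained.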

**Phase 4 Workflows (5 workflows) - Selective Load Strategy:**
- sprint-planning (Full Load exception - needs all epics)
- epic-tech-context, create-story, story-context, code-review (Selective Load)
- Load ONLY the specific epic needed (e.g., epic-3.md for Epic 3 stories)
- Efficiency: skips loading the 9 other epics in a 10-epic project
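
The Selective Load step can be sketched as follows. The story key format (`"3-2-feature-name"`) and the `epic-{epic_num}.md` file name come from the workflow instructions in this commit; the helper names are hypothetical:

```python
import re
from pathlib import Path


def epic_num_from_story_key(story_key: str) -> int:
    """Extract the epic number from a key such as "3-2-feature-name"."""
    match = re.match(r"(\d+)-(\d+)-", story_key)
    if match is None:
        raise ValueError(f"unrecognized story key: {story_key!r}")
    return int(match.group(1))


def select_epic_file(epics_dir: Path, story_key: str) -> Path:
    """Return only the shard for the story's epic, e.g. epics/epic-3.md."""
    return epics_dir / f"epic-{epic_num_from_story_key(story_key)}.md"
```

For example, `select_epic_file(Path("epics"), "3-2-feature-name")` yields `epics/epic-3.md`, so the other epic shards are never opened.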

### Workflow Enhancements

**Added to all workflows:**
- `input_file_patterns` in workflow.yaml with wildcard discovery
- Document Discovery section in instructions.md
- Support for sharded index + section files
- Brownfield `docs/index.md` support

**Pattern standardization:**
```yaml
input_file_patterns:
  document:
    whole: "{output_folder}/*doc*.md"
    sharded: "{output_folder}/*doc*/index.md"
    sharded_single: "{output_folder}/*doc*/section-{{id}}.md"  # Selective load
```
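
A rough sketch of how these patterns might be resolved with whole-first priority. The function name `resolve_document` and the use of Python's `glob` module are assumptions made for illustration; the actual discovery is performed by the workflow following its instructions:

```python
import glob
import os
from typing import Optional, Tuple


def resolve_document(output_folder: str, name: str) -> Optional[Tuple[str, str]]:
    """Return (kind, path) for the first match, preferring the whole document."""
    patterns = [
        # Mirrors `whole: "{output_folder}/*doc*.md"`
        ("whole", os.path.join(output_folder, f"*{name}*.md")),
        # Mirrors `sharded: "{output_folder}/*doc*/index.md"`
        ("sharded", os.path.join(output_folder, f"*{name}*", "index.md")),
    ]
    for kind, pattern in patterns:
        matches = sorted(glob.glob(pattern))
        if matches:
            return kind, matches[0]
    return None  # nothing found: the caller should ask the user
```

Checking the whole pattern first means existing projects with unsharded documents behave exactly as before.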

### Retrospective Workflow Major Overhaul

Transformed retrospective into immersive, interactive team experience:

**Epic Discovery Priority (Fixed):**
- Priority 1: Check sprint-status.yaml for last completed epic
- Priority 2: Ask user directly
- Priority 3: Scan stories folder (last resort)
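
The priority order above amounts to a fallback chain. A minimal sketch, where each source is passed in as a hypothetical callable (the real workflow reads `sprint-status.yaml`, prompts the user, or scans the stories folder):

```python
from typing import Callable, Optional


def discover_epic(
    from_sprint_status: Callable[[], Optional[int]],
    ask_user: Callable[[], Optional[int]],
    scan_stories: Callable[[], Optional[int]],
) -> Optional[int]:
    """Try each source in priority order and return the first epic number found."""
    for source in (from_sprint_status, ask_user, scan_stories):
        epic_num = source()
        if epic_num is not None:
            return epic_num  # lower-priority sources are never consulted
    return None
```

Because the loop returns on the first hit, the stories folder is scanned only when both earlier sources yield nothing.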

**New Capabilities:**
- Deep story analysis: Extract dev notes, mistakes, review feedback, lessons learned
- Previous retro integration: Track action items, verify lessons applied
- Significant change detection: Alert when discoveries require epic updates
- Intent-based facilitation: Natural conversation vs scripted phrases
- Party mode protocol: Clear speaker identification (Name (Role): dialogue)
- Team dynamics: Drama, disagreements, diverse perspectives, authentic conflict

**Structure:**
- 12 whole-number steps (no decimals)
- Highly interactive with constant user engagement
- Cross-references previous retro for accountability
- Synthesizes patterns across all stories
- Detects architectural assumption changes

## Documentation

**Created:**
- `docs/document-sharding-guide.md` - Comprehensive 300+ line guide
  - What is sharding, when to use it (token thresholds)
  - How sharding works (discovery system, loading strategies)
  - Using shard-doc tool
  - Full Load vs Selective Load patterns
  - Complete examples and troubleshooting
  - Custom workflow integration patterns

**Updated:**
- `README.md` - Added Document Sharding feature section
- `docs/index.md` - Added under Advanced Topics → Optimization
- `src/modules/bmm/workflows/README.md` - Added sharding section with usage
- `src/modules/bmb/workflows/create-workflow/workflow-creation-guide.md` - Added complete implementation patterns for workflow builders

**Documentation levels:**
1. Overview (README.md) - Quick feature highlight
2. User guide (BMM workflows README) - Practical usage
3. Reference (document-sharding-guide.md) - Complete details
4. Builder guide (workflow-creation-guide.md) - Implementation patterns

## Efficiency Gains

**Example: 10-Epic Project**

Before sharding:
- epic-tech-context for Epic 3: Load all 10 epics (~50k tokens)
- create-story for Epic 3: Load all 10 epics (~50k tokens)
- story-context for Epic 3: Load all 10 epics (~50k tokens)

After sharding with selective load:
- epic-tech-context for Epic 3: Load Epic 3 only (~5k tokens) = 90% reduction
- create-story for Epic 3: Load Epic 3 only (~5k tokens) = 90% reduction
- story-context for Epic 3: Load Epic 3 only (~5k tokens) = 90% reduction
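
The 90% figure follows directly from the approximate token counts above. A small worked check (the function name is illustrative):

```python
def token_reduction(before_tokens: int, after_tokens: int) -> float:
    """Fraction of tokens saved by loading fewer documents."""
    return (before_tokens - after_tokens) / before_tokens


# Approximate figures from the example: ~50k tokens before, ~5k after.
print(f"{token_reduction(50_000, 5_000):.0%}")  # prints "90%"
```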

## Breaking Changes

None - fully backward compatible. Workflows work with existing whole documents.

## Files Changed

**Workflows Updated (25 files):**
- 7 Phase 1-3 workflows: Added full load sharding support
- 5 Phase 4 workflows: Added selective load sharding support
- 1 retrospective workflow: Complete overhaul with sharding support

**Documentation (5 files):**
- Created: document-sharding-guide.md
- Updated: README.md, docs/index.md, BMM workflows README, BMB workflow-creation-guide
- Removed: Old conversion report (obsolete)

## Future Extensibility

- BMB workflows now aware of sharding patterns
- Custom modules can easily implement sharding support
- Standard patterns documented for consistency
- No need to re-explain the sharding concept in future development

Brian Madison
2025-11-02 00:13:33 -05:00
parent f77babcd5e
commit 3d4ea5ffd2
32 changed files with 2397 additions and 437 deletions

@@ -16,6 +16,35 @@
<critical>DOCUMENT OUTPUT: Technical review reports. Structured findings with severity levels and action items. User skill level ({user_skill_level}) affects conversation style ONLY, not review content.</critical>
## 📚 Document Discovery - Selective Epic Loading
**Strategy**: This workflow needs only ONE specific epic and its stories for review context, not all epics. This provides huge efficiency gains when epics are sharded.
**Epic Discovery Process (SELECTIVE OPTIMIZATION):**
1. **Determine which epic** you need (epic_num from story being reviewed - e.g., story "3-2-feature-name" needs Epic 3)
2. **Check for sharded version**: Look for `epics/index.md`
3. **If sharded version found**:
   - Read `index.md` to understand structure
   - **Load ONLY `epic-{epic_num}.md`** (e.g., `epics/epic-3.md` for Epic 3)
   - DO NOT load all epic files - only the one needed!
   - This is the key efficiency optimization for large multi-epic projects
4. **If whole document found**: Load the complete `epics.md` file and extract the relevant epic
**Other Documents (architecture, ux-design) - Full Load:**
1. **Search for whole document first** - Use fuzzy file matching
2. **Check for sharded version** - If whole document not found, look for `{doc-name}/index.md`
3. **If sharded version found**:
   - Read `index.md` to understand structure
   - Read ALL section files listed in the index
   - Treat combined content as single document
4. **Brownfield projects**: The `document-project` workflow creates `{output_folder}/docs/index.md`
**Priority**: If both whole and sharded versions exist, use the whole document.
**UX-Heavy Projects**: Always check for ux-design documentation as it provides critical context for reviewing UI-focused stories.
<workflow>
<step n="1" goal="Find story ready for review" tag="sprint-status">

@@ -51,6 +51,26 @@ recommended_inputs:
- tech_spec: "Epic technical specification document (auto-discovered)"
- story_context_file: "Story context file (.context.xml) (auto-discovered)"
# Smart input file references - handles both whole docs and sharded docs
# Priority: Whole document first, then sharded version
# Strategy: SELECTIVE LOAD - only load the specific epic needed for this story review
input_file_patterns:
  architecture:
    whole: "{output_folder}/*architecture*.md"
    sharded: "{output_folder}/*architecture*/index.md"
  ux_design:
    whole: "{output_folder}/*ux*.md"
    sharded: "{output_folder}/*ux*/index.md"
  epics:
    whole: "{output_folder}/*epic*.md"
    sharded_index: "{output_folder}/*epic*/index.md"
    sharded_single: "{output_folder}/*epic*/epic-{{epic_num}}.md"
  document_project:
    sharded: "{output_folder}/docs/index.md"
standalone: true
web_bundle: false

@@ -7,6 +7,35 @@
<critical>This workflow creates or updates the next user story from epics/PRD and architecture context, saving to the configured stories directory and optionally invoking Story Context.</critical>
<critical>DOCUMENT OUTPUT: Concise, technical, actionable story specifications. Use tables/lists for acceptance criteria and tasks.</critical>
## 📚 Document Discovery - Selective Epic Loading
**Strategy**: This workflow needs only ONE specific epic and its stories, not all epics. This provides huge efficiency gains when epics are sharded.
**Epic Discovery Process (SELECTIVE OPTIMIZATION):**
1. **Determine which epic** you need (epic_num from story context - e.g., story "3-2-feature-name" needs Epic 3)
2. **Check for sharded version**: Look for `epics/index.md`
3. **If sharded version found**:
   - Read `index.md` to understand structure
   - **Load ONLY `epic-{epic_num}.md`** (e.g., `epics/epic-3.md` for Epic 3)
   - DO NOT load all epic files - only the one needed!
   - This is the key efficiency optimization for large multi-epic projects
4. **If whole document found**: Load the complete `epics.md` file and extract the relevant epic
**Other Documents (prd, architecture, ux-design) - Full Load:**
1. **Search for whole document first** - Use fuzzy file matching
2. **Check for sharded version** - If whole document not found, look for `{doc-name}/index.md`
3. **If sharded version found**:
   - Read `index.md` to understand structure
   - Read ALL section files listed in the index
   - Treat combined content as single document
4. **Brownfield projects**: The `document-project` workflow creates `{output_folder}/docs/index.md`
**Priority**: If both whole and sharded versions exist, use the whole document.
**UX-Heavy Projects**: Always check for ux-design documentation as it provides critical context for UI-focused stories.
<workflow>
<step n="1" goal="Load config and initialize">

@@ -44,6 +44,30 @@ recommended_inputs:
- prd: "PRD document"
- architecture: "Architecture (optional)"
# Smart input file references - handles both whole docs and sharded docs
# Priority: Whole document first, then sharded version
# Strategy: SELECTIVE LOAD - only load the specific epic needed for this story
input_file_patterns:
  prd:
    whole: "{output_folder}/*prd*.md"
    sharded: "{output_folder}/*prd*/index.md"
  architecture:
    whole: "{output_folder}/*architecture*.md"
    sharded: "{output_folder}/*architecture*/index.md"
  ux_design:
    whole: "{output_folder}/*ux*.md"
    sharded: "{output_folder}/*ux*/index.md"
  epics:
    whole: "{output_folder}/*epic*.md"
    sharded_index: "{output_folder}/*epic*/index.md"
    sharded_single: "{output_folder}/*epic*/epic-{{epic_num}}.md"
  document_project:
    sharded: "{output_folder}/docs/index.md"
standalone: true
web_bundle: false

@@ -7,6 +7,35 @@
<critical>This workflow generates a comprehensive Technical Specification from PRD and Architecture, including detailed design, NFRs, acceptance criteria, and traceability mapping.</critical>
<critical>If required inputs cannot be auto-discovered HALT with a clear message listing missing documents, allow user to provide them to proceed.</critical>
## 📚 Document Discovery - Selective Epic Loading
**Strategy**: This workflow needs only ONE specific epic and its stories, not all epics. This provides huge efficiency gains when epics are sharded.
**Epic Discovery Process (SELECTIVE OPTIMIZATION):**
1. **Determine which epic** you need (epic_num from workflow context or user input)
2. **Check for sharded version**: Look for `epics/index.md`
3. **If sharded version found**:
   - Read `index.md` to understand structure
   - **Load ONLY `epic-{epic_num}.md`** (e.g., `epics/epic-3.md` for Epic 3)
   - DO NOT load all epic files - only the one needed!
   - This is the key efficiency optimization for large multi-epic projects
4. **If whole document found**: Load the complete `epics.md` file and extract the relevant epic
**Other Documents (prd, gdd, architecture, ux-design) - Full Load:**
1. **Search for whole document first** - Use fuzzy file matching
2. **Check for sharded version** - If whole document not found, look for `{doc-name}/index.md`
3. **If sharded version found**:
   - Read `index.md` to understand structure
   - Read ALL section files listed in the index
   - Treat combined content as single document
4. **Brownfield projects**: The `document-project` workflow creates `{output_folder}/docs/index.md`
**Priority**: If both whole and sharded versions exist, use the whole document.
**UX-Heavy Projects**: Always check for ux-design documentation as it provides critical context for UI-focused epics and stories.
<workflow>
<step n="1" goal="Collect inputs and discover next epic" tag="sprint-status">
<action>Identify PRD and Architecture documents from recommended_inputs. Attempt to auto-discover at default paths.</action>

@@ -9,17 +9,43 @@ user_name: "{config_source}:user_name"
communication_language: "{config_source}:communication_language"
date: system-generated
# Inputs expected ( check output_folder or ask user if missing)
# Inputs expected (check output_folder or ask user if missing)
recommended_inputs:
- prd
- gdd
- spec
- architecture
- ux_spec
- ux-design
- if there is an index.md then read the index.md to find other related docs that could be relevant
- ux_design
- epics (only the specific epic needed for this tech spec)
- prior epic tech-specs for model, style and consistency reference
# Smart input file references - handles both whole docs and sharded docs
# Priority: Whole document first, then sharded version
# Strategy: SELECTIVE LOAD - only load the specific epic needed (epic_num from context)
input_file_patterns:
  prd:
    whole: "{output_folder}/*prd*.md"
    sharded: "{output_folder}/*prd*/index.md"
  gdd:
    whole: "{output_folder}/*gdd*.md"
    sharded: "{output_folder}/*gdd*/index.md"
  architecture:
    whole: "{output_folder}/*architecture*.md"
    sharded: "{output_folder}/*architecture*/index.md"
  ux_design:
    whole: "{output_folder}/*ux*.md"
    sharded: "{output_folder}/*ux*/index.md"
  epics:
    whole: "{output_folder}/*epic*.md"
    sharded_index: "{output_folder}/*epic*/index.md"
    sharded_single: "{output_folder}/*epic*/epic-{{epic_num}}.md"
  document_project:
    sharded: "{output_folder}/docs/index.md"
# Workflow components
installed_path: "{project-root}/bmad/bmm/workflows/4-implementation/epic-tech-context"
template: "{installed_path}/template.md"

@@ -21,6 +21,34 @@ trigger: "Run AFTER completing an epic"
required_inputs:
- agent_manifest: "{project-root}/bmad/_cfg/agent-manifest.csv"
# Smart input file references - handles both whole docs and sharded docs
# Priority: Whole document first, then sharded version
# Strategy: SELECTIVE LOAD - only load the completed epic and relevant retrospectives
input_file_patterns:
  epics:
    whole: "{output_folder}/*epic*.md"
    sharded_index: "{output_folder}/*epic*/index.md"
    sharded_single: "{output_folder}/*epic*/epic-{{epic_num}}.md"
  previous_retrospective:
    pattern: "{output_folder}/retrospectives/epic-{{prev_epic_num}}-retro-*.md"
  architecture:
    whole: "{output_folder}/*architecture*.md"
    sharded: "{output_folder}/*architecture*/index.md"
  prd:
    whole: "{output_folder}/*prd*.md"
    sharded: "{output_folder}/*prd*/index.md"
  document_project:
    sharded: "{output_folder}/docs/index.md"
# Required files
sprint_status_file: "{output_folder}/sprint-status.yaml"
story_directory: "{config_source}:dev_story_location"
retrospectives_folder: "{output_folder}/retrospectives"
output_artifacts:
- retrospective_summary: "Comprehensive review of what went well and what could improve"
- lessons_learned: "Key insights for future epics"

@@ -3,6 +3,23 @@
<critical>The workflow execution engine is governed by: {project-root}/bmad/core/tasks/workflow.xml</critical>
<critical>You MUST have already loaded and processed: {project-root}/bmad/bmm/workflows/4-implementation/sprint-planning/workflow.yaml</critical>
## 📚 Document Discovery - Full Epic Loading
**Strategy**: Sprint planning needs ALL epics and stories to build complete status tracking.
**Epic Discovery Process:**
1. **Search for whole document first** - Look for `epics.md`, `bmm-epics.md`, or any `*epic*.md` file
2. **Check for sharded version** - If whole document not found, look for `epics/index.md`
3. **If sharded version found**:
   - Read `index.md` to understand the document structure
   - Read ALL epic section files listed in the index (e.g., `epic-1.md`, `epic-2.md`, etc.)
   - Process all epics and their stories from the combined content
   - This ensures complete sprint status coverage
4. **Priority**: If both whole and sharded versions exist, use the whole document
**Fuzzy matching**: Be flexible with document names - users may use variations like `epics.md`, `bmm-epics.md`, `user-stories.md`, etc.
<workflow>
<step n="1" goal="Parse epic files and extract all work items">

@@ -33,6 +33,14 @@ variables:
# Output configuration
status_file: "{output_folder}/sprint-status.yaml"
# Smart input file references - handles both whole docs and sharded docs
# Priority: Whole document first, then sharded version
# Strategy: FULL LOAD - sprint planning needs ALL epics to build complete status
input_file_patterns:
  epics:
    whole: "{output_folder}/*epic*.md"
    sharded: "{output_folder}/*epic*/index.md"
# Output configuration
default_output_file: "{status_file}"

@@ -11,6 +11,35 @@
<critical>DOCUMENT OUTPUT: Technical context file (.context.xml). Concise, structured, project-relative paths only.</critical>
## 📚 Document Discovery - Selective Epic Loading
**Strategy**: This workflow needs only ONE specific epic and its stories, not all epics. This provides huge efficiency gains when epics are sharded.
**Epic Discovery Process (SELECTIVE OPTIMIZATION):**
1. **Determine which epic** you need (epic_num from story key - e.g., story "3-2-feature-name" needs Epic 3)
2. **Check for sharded version**: Look for `epics/index.md`
3. **If sharded version found**:
   - Read `index.md` to understand structure
   - **Load ONLY `epic-{epic_num}.md`** (e.g., `epics/epic-3.md` for Epic 3)
   - DO NOT load all epic files - only the one needed!
   - This is the key efficiency optimization for large multi-epic projects
4. **If whole document found**: Load the complete `epics.md` file and extract the relevant epic
**Other Documents (prd, architecture, ux-design) - Full Load:**
1. **Search for whole document first** - Use fuzzy file matching
2. **Check for sharded version** - If whole document not found, look for `{doc-name}/index.md`
3. **If sharded version found**:
   - Read `index.md` to understand structure
   - Read ALL section files listed in the index
   - Treat combined content as single document
4. **Brownfield projects**: The `document-project` workflow creates `{output_folder}/docs/index.md`
**Priority**: If both whole and sharded versions exist, use the whole document.
**UX-Heavy Projects**: Always check for ux-design documentation as it provides critical context for UI-focused stories.
<workflow>
<step n="1" goal="Find drafted story and check for existing context" tag="sprint-status">
<check if="{{story_path}} is provided">

@@ -23,6 +23,30 @@ variables:
story_path: "" # Optional: Explicit story path. If not provided, finds first story with status "drafted"
story_dir: "{config_source}:dev_story_location"
# Smart input file references - handles both whole docs and sharded docs
# Priority: Whole document first, then sharded version
# Strategy: SELECTIVE LOAD - only load the specific epic needed for this story
input_file_patterns:
  prd:
    whole: "{output_folder}/*prd*.md"
    sharded: "{output_folder}/*prd*/index.md"
  architecture:
    whole: "{output_folder}/*architecture*.md"
    sharded: "{output_folder}/*architecture*/index.md"
  ux_design:
    whole: "{output_folder}/*ux*.md"
    sharded: "{output_folder}/*ux*/index.md"
  epics:
    whole: "{output_folder}/*epic*.md"
    sharded_index: "{output_folder}/*epic*/index.md"
    sharded_single: "{output_folder}/*epic*/epic-{{epic_num}}.md"
  document_project:
    sharded: "{output_folder}/docs/index.md"
# Output configuration
# Uses story_key from sprint-status.yaml (e.g., "1-2-user-authentication")
default_output_file: "{story_dir}/{{story_key}}.context.xml"