mirror of
https://github.com/bmadcode/BMAD-METHOD.git
synced 2025-12-18 02:05:30 +00:00
Implement comprehensive document sharding system across all BMM workflows enabling 90%+ token savings for large multi-epic projects through selective loading optimization.
## Document Sharding System
### Core Features
- **Universal Support**: All 12 BMM workflows (Phase 1-4) handle both whole and sharded documents
- **Dual Loading Strategy**: Full Load (Phase 1-3) vs Selective Load (Phase 4)
- **Automatic Discovery**: Workflows detect format transparently (whole → sharded priority)
- **Efficiency Optimization**: 90%+ token reduction for 10+ epic projects in Phase 4
### Implementation Details
**Phase 1-3 Workflows (7 workflows) - Full Load Strategy:**
- product-brief, prd, gdd, create-ux-design, tech-spec, architecture, solutioning-gate-check
- Load entire sharded documents when present
- Transparent to user experience
- Better organization for large projects
**Phase 4 Workflows (5 workflows) - Selective Load Strategy:**
- sprint-planning (Full Load exception - needs all epics)
- epic-tech-context, create-story, story-context, code-review (Selective Load)
- Load ONLY the specific epic needed (e.g., epic-3.md for Epic 3 stories)
- Massive efficiency: Skip loading 9 other epics in 10-epic project
### Workflow Enhancements
**Added to all workflows:**
- `input_file_patterns` in workflow.yaml with wildcard discovery
- Document Discovery section in instructions.md
- Support for sharded index + section files
- Brownfield `docs/index.md` support
**Pattern standardization:**
```yaml
input_file_patterns:
document:
whole: "{output_folder}/*doc*.md"
sharded: "{output_folder}/*doc*/index.md"
sharded_single: "{output_folder}/*doc*/section-{{id}}.md" # Selective load
```
### Retrospective Workflow Major Overhaul
Transformed retrospective into immersive, interactive team experience:
**Epic Discovery Priority (Fixed):**
- Priority 1: Check sprint-status.yaml for last completed epic
- Priority 2: Ask user directly
- Priority 3: Scan stories folder (last resort)
**New Capabilities:**
- Deep story analysis: Extract dev notes, mistakes, review feedback, lessons learned
- Previous retro integration: Track action items, verify lessons applied
- Significant change detection: Alert when discoveries require epic updates
- Intent-based facilitation: Natural conversation vs scripted phrases
- Party mode protocol: Clear speaker identification (Name (Role): dialogue)
- Team dynamics: Drama, disagreements, diverse perspectives, authentic conflict
**Structure:**
- 12 whole-number steps (no decimals)
- Highly interactive with constant user engagement
- Cross-references previous retro for accountability
- Synthesizes patterns across all stories
- Detects architectural assumption changes
## Documentation
**Created:**
- `docs/document-sharding-guide.md` - Comprehensive 300+ line guide
- What is sharding, when to use it (token thresholds)
- How sharding works (discovery system, loading strategies)
- Using shard-doc tool
- Full Load vs Selective Load patterns
- Complete examples and troubleshooting
- Custom workflow integration patterns
**Updated:**
- `README.md` - Added Document Sharding feature section
- `docs/index.md` - Added under Advanced Topics → Optimization
- `src/modules/bmm/workflows/README.md` - Added sharding section with usage
- `src/modules/bmb/workflows/create-workflow/workflow-creation-guide.md` - Added complete implementation patterns for workflow builders
**Documentation levels:**
1. Overview (README.md) - Quick feature highlight
2. User guide (BMM workflows README) - Practical usage
3. Reference (document-sharding-guide.md) - Complete details
4. Builder guide (workflow-creation-guide.md) - Implementation patterns
## Efficiency Gains
**Example: 10-Epic Project**
Before sharding:
- epic-tech-context for Epic 3: Load all 10 epics (~50k tokens)
- create-story for Epic 3: Load all 10 epics (~50k tokens)
- story-context for Epic 3: Load all 10 epics (~50k tokens)
After sharding with selective load:
- epic-tech-context for Epic 3: Load Epic 3 only (~5k tokens) = 90% reduction
- create-story for Epic 3: Load Epic 3 only (~5k tokens) = 90% reduction
- story-context for Epic 3: Load Epic 3 only (~5k tokens) = 90% reduction
## Breaking Changes
None - fully backward compatible. Workflows work with existing whole documents.
## Files Changed
**Workflows Updated (25 files):**
- 7 Phase 1-3 workflows: Added full load sharding support
- 5 Phase 4 workflows: Added selective load sharding support
- 1 retrospective workflow: Complete overhaul with sharding support
**Documentation (5 files):**
- Created: document-sharding-guide.md
- Updated: README.md, docs/index.md, BMM workflows README, BMB workflow-creation-guide
- Removed: Old conversion report (obsolete)
## Future Extensibility
- BMB workflows now aware of sharding patterns
- Custom modules can easily implement sharding support
- Standard patterns documented for consistency
- No need to explain concept in future development
448 lines
10 KiB
Markdown
448 lines
10 KiB
Markdown
# Document Sharding Guide
|
|
|
|
Comprehensive guide to BMad Method's document sharding system for managing large planning and architecture documents.
|
|
|
|
## Table of Contents
|
|
|
|
- [What is Document Sharding?](#what-is-document-sharding)
|
|
- [When to Use Sharding](#when-to-use-sharding)
|
|
- [How Sharding Works](#how-sharding-works)
|
|
- [Using the Shard-Doc Tool](#using-the-shard-doc-tool)
|
|
- [Workflow Support](#workflow-support)
|
|
- [Best Practices](#best-practices)
|
|
- [Examples](#examples)
|
|
|
|
## What is Document Sharding?
|
|
|
|
Document sharding splits large markdown files into smaller, organized files based on level 2 headings (`## Heading`). This enables:
|
|
|
|
- **Selective Loading** - Workflows load only the sections they need
|
|
- **Reduced Token Usage** - Massive efficiency gains for large projects
|
|
- **Better Organization** - Logical section-based file structure
|
|
- **Maintained Context** - Index file preserves document structure
|
|
|
|
### Architecture
|
|
|
|
```
|
|
Before Sharding:
|
|
docs/
|
|
└── PRD.md (large 50k token file)
|
|
|
|
After Sharding:
|
|
docs/
|
|
└── prd/
|
|
├── index.md # Table of contents with descriptions
|
|
├── overview.md # Section 1
|
|
├── user-requirements.md # Section 2
|
|
├── technical-requirements.md # Section 3
|
|
└── ... # Additional sections
|
|
```
|
|
|
|
## When to Use Sharding
|
|
|
|
### Ideal Candidates
|
|
|
|
**Large Multi-Epic Projects:**
|
|
|
|
- Very large complex PRDs
|
|
- Architecture documents with multiple system layers
|
|
- Epic files with 4+ epics (especially for Phase 4)
|
|
- UX design specs covering multiple subsystems
|
|
|
|
**Token Thresholds:**
|
|
|
|
- **Consider sharding**: Documents > 20k tokens
|
|
- **Strongly recommended**: Documents > 40k tokens
|
|
- **Critical for efficiency**: Documents > 60k tokens
|
|
|
|
### When NOT to Shard
|
|
|
|
**Small Projects:**
|
|
|
|
- Single epic projects
|
|
- Level 0-1 projects (tech-spec only)
|
|
- Documents under 10k tokens
|
|
- Quick prototypes
|
|
|
|
**Frequently Updated Docs:**
|
|
|
|
- Active work-in-progress documents
|
|
- Documents updated daily
|
|
- Documents where whole-file context is essential
|
|
|
|
## How Sharding Works
|
|
|
|
### Sharding Process
|
|
|
|
1. **Tool Execution**: Run `npx @kayvan/markdown-tree-parser source.md destination/` - this is abstracted with the core shard-doc task which is installed as a slash command or manual task rule depending on your tools.
|
|
2. **Section Extraction**: Tool splits by level 2 headings
|
|
3. **File Creation**: Each section becomes a separate file
|
|
4. **Index Generation**: `index.md` created with structure and descriptions
|
|
|
|
### Workflow Discovery
|
|
|
|
BMad workflows use a **dual discovery system**:
|
|
|
|
1. **Try whole document first** - Look for `document-name.md`
|
|
2. **Check for sharded version** - Look for `document-name/index.md`
|
|
3. **Priority rule** - Whole document takes precedence if both exist
|
|
|
|
### Loading Strategies
|
|
|
|
**Full Load (Phase 1-3 workflows):**
|
|
|
|
```
|
|
If sharded:
|
|
- Read index.md
|
|
- Read ALL section files
|
|
- Treat as single combined document
|
|
```
|
|
|
|
**Selective Load (Phase 4 workflows):**
|
|
|
|
```
|
|
If sharded epics and working on Epic 3:
|
|
- Read epics/index.md
|
|
- Load ONLY epics/epic-3.md
|
|
- Skip all other epic files
|
|
- 90%+ token savings!
|
|
```
|
|
|
|
## Using the Shard-Doc Tool
|
|
|
|
### CLI Command
|
|
|
|
```bash
|
|
# Activate bmad-master or analyst agent, then:
|
|
/shard-doc
|
|
```
|
|
|
|
### Interactive Process
|
|
|
|
```
|
|
Agent: Which document would you like to shard?
|
|
User: docs/PRD.md
|
|
|
|
Agent: Default destination: docs/prd/
|
|
Accept default? [y/n]
|
|
User: y
|
|
|
|
Agent: Sharding PRD.md...
|
|
✓ Created 12 section files
|
|
✓ Generated index.md
|
|
✓ Complete!
|
|
```
|
|
|
|
### What Gets Created
|
|
|
|
**index.md structure:**
|
|
|
|
```markdown
|
|
# PRD - Index
|
|
|
|
## Sections
|
|
|
|
1. [Overview](./overview.md) - Project vision and objectives
|
|
2. [User Requirements](./user-requirements.md) - Feature specifications
|
|
3. [Epic 1: Authentication](./epic-1-authentication.md) - User auth system
|
|
4. [Epic 2: Dashboard](./epic-2-dashboard.md) - Main dashboard UI
|
|
...
|
|
```
|
|
|
|
**Individual section files:**
|
|
|
|
- Named from heading text (kebab-case)
|
|
- Contains complete section content
|
|
- Preserves all markdown formatting
|
|
- Can be read independently
|
|
|
|
## Workflow Support
|
|
|
|
### Universal Support
|
|
|
|
**All BMM workflows support both formats:**
|
|
|
|
- ✅ Whole documents
|
|
- ✅ Sharded documents
|
|
- ✅ Automatic detection
|
|
- ✅ Transparent to user
|
|
|
|
### Workflow-Specific Patterns
|
|
|
|
#### Phase 1-3 (Full Load)
|
|
|
|
Workflows load entire sharded documents:
|
|
|
|
- `product-brief` - Research, brainstorming docs
|
|
- `prd` - Product brief, research
|
|
- `gdd` - Game brief, research
|
|
- `create-ux-design` - PRD, brief, epics
|
|
- `tech-spec` - Brief, research
|
|
- `architecture` - PRD, epics, UX design
|
|
- `solutioning-gate-check` - All planning docs
|
|
|
|
#### Phase 4 (Selective Load)
|
|
|
|
Workflows load only needed sections:
|
|
|
|
**sprint-planning** (Full Load):
|
|
|
|
- Needs ALL epics to build complete status
|
|
|
|
**epic-tech-context, create-story, story-context, code-review** (Selective):
|
|
|
|
```
|
|
Working on Epic 3, Story 2:
|
|
✓ Load epics/epic-3.md only
|
|
✗ Skip epics/epic-1.md, epic-2.md, epic-4.md, etc.
|
|
|
|
Result: 90%+ token reduction for 10-epic projects!
|
|
```
|
|
|
|
### Input File Patterns
|
|
|
|
Workflows use standardized patterns:
|
|
|
|
```yaml
|
|
input_file_patterns:
|
|
prd:
|
|
whole: '{output_folder}/*prd*.md'
|
|
sharded: '{output_folder}/*prd*/index.md'
|
|
|
|
epics:
|
|
whole: '{output_folder}/*epic*.md'
|
|
sharded_index: '{output_folder}/*epic*/index.md'
|
|
sharded_single: '{output_folder}/*epic*/epic-{{epic_num}}.md'
|
|
```
|
|
|
|
## Best Practices
|
|
|
|
### Sharding Strategy
|
|
|
|
**Do:**
|
|
|
|
- ✅ Shard after planning phase complete
|
|
- ✅ Keep level 2 headings well-organized
|
|
- ✅ Use descriptive section names
|
|
- ✅ Shard before Phase 4 implementation
|
|
- ✅ Keep original file as backup initially
|
|
|
|
**Don't:**
|
|
|
|
- ❌ Shard work-in-progress documents
|
|
- ❌ Shard small documents (<20k tokens)
|
|
- ❌ Mix sharded and whole versions
|
|
- ❌ Manually edit index.md structure
|
|
|
|
### Naming Conventions
|
|
|
|
**Good Section Names:**
|
|
|
|
```markdown
|
|
## Epic 1: User Authentication
|
|
|
|
## Technical Requirements
|
|
|
|
## System Architecture
|
|
|
|
## UX Design Principles
|
|
```
|
|
|
|
**Poor Section Names:**
|
|
|
|
```markdown
|
|
## Section 1
|
|
|
|
## Part A
|
|
|
|
## Details
|
|
|
|
## More Info
|
|
```
|
|
|
|
### File Management
|
|
|
|
**When to Re-shard:**
|
|
|
|
- Significant structural changes to document
|
|
- Adding/removing major sections
|
|
- After major refactoring
|
|
|
|
**Updating Sharded Docs:**
|
|
|
|
1. Edit individual section files directly
|
|
2. OR edit original, delete sharded folder, re-shard
|
|
3. Don't manually edit index.md
|
|
|
|
## Examples
|
|
|
|
### Example 1: Large PRD
|
|
|
|
**Scenario:** 15-epic project, PRD is 45k tokens
|
|
|
|
**Before Sharding:**
|
|
|
|
```
|
|
Every workflow loads entire 45k token PRD
|
|
Epic-tech-context for Epic 3: 45k tokens
|
|
Create-story for Epic 3: 45k tokens
|
|
```
|
|
|
|
**After Sharding:**
|
|
|
|
```bash
|
|
/shard-doc
|
|
Source: docs/PRD.md
|
|
Destination: docs/prd/
|
|
|
|
Created:
|
|
prd/index.md
|
|
prd/overview.md (3k tokens)
|
|
prd/epic-1-auth.md (3k tokens)
|
|
prd/epic-2-dashboard.md (3k tokens)
|
|
prd/epic-3-reports.md (3k tokens)
|
|
...15 epic files
|
|
```
|
|
|
|
**Result:**
|
|
|
|
```
|
|
Epic-tech-context for Epic 3: 3k tokens (93% reduction!)
|
|
Create-story for Epic 3: 3k tokens (93% reduction!)
|
|
```
|
|
|
|
### Example 2: Sharding Epics File
|
|
|
|
**Scenario:** 8 epics with detailed stories, 35k tokens total
|
|
|
|
```bash
|
|
/shard-doc
|
|
Source: docs/bmm-epics.md
|
|
Destination: docs/epics/
|
|
|
|
Created:
|
|
epics/index.md
|
|
epics/epic-1.md
|
|
epics/epic-2.md
|
|
...
|
|
epics/epic-8.md
|
|
```
|
|
|
|
**Efficiency Gain:**
|
|
|
|
```
|
|
Working on Epic 5 stories:
|
|
Old: Load all 8 epics (35k tokens)
|
|
New: Load epic-5.md only (4k tokens)
|
|
Savings: 88% reduction
|
|
```
|
|
|
|
### Example 3: Architecture Document
|
|
|
|
**Scenario:** Multi-layer system architecture, 28k tokens
|
|
|
|
```bash
|
|
/shard-doc
|
|
Source: docs/architecture.md
|
|
Destination: docs/architecture/
|
|
|
|
Created:
|
|
architecture/index.md
|
|
architecture/system-overview.md
|
|
architecture/frontend-architecture.md
|
|
architecture/backend-services.md
|
|
architecture/data-layer.md
|
|
architecture/infrastructure.md
|
|
architecture/security-architecture.md
|
|
```
|
|
|
|
**Benefit:** Code-review workflow can reference specific architectural layers without loading entire architecture doc.
|
|
|
|
## Custom Workflow Integration
|
|
|
|
### For Workflow Builders
|
|
|
|
When creating custom workflows that load large documents:
|
|
|
|
**1. Add input_file_patterns to workflow.yaml:**
|
|
|
|
```yaml
|
|
input_file_patterns:
|
|
your_document:
|
|
whole: '{output_folder}/*your-doc*.md'
|
|
sharded: '{output_folder}/*your-doc*/index.md'
|
|
```
|
|
|
|
**2. Add discovery instructions to instructions.md:**
|
|
|
|
```markdown
|
|
## Document Discovery
|
|
|
|
1. Search for whole document: _your-doc_.md
|
|
2. Check for sharded version: _your-doc_/index.md
|
|
3. If sharded: Read index + ALL sections (or specific sections if selective load)
|
|
4. Priority: Whole document first
|
|
```
|
|
|
|
**3. Choose loading strategy:**
|
|
|
|
- **Full Load**: Read all sections when sharded
|
|
- **Selective Load**: Read only relevant sections (requires section identification logic)
|
|
|
|
### Pattern Templates
|
|
|
|
**Full Load Pattern:**
|
|
|
|
```xml
|
|
<action>Search for document: {output_folder}/*doc-name*.md</action>
|
|
<action>If not found, check for sharded: {output_folder}/*doc-name*/index.md</action>
|
|
<action if="sharded found">Read index.md to understand structure</action>
|
|
<action if="sharded found">Read ALL section files listed in index</action>
|
|
<action if="sharded found">Combine content as single document</action>
|
|
```
|
|
|
|
**Selective Load Pattern (with section ID):**
|
|
|
|
```xml
|
|
<action>Determine section needed (e.g., epic_num = 3)</action>
|
|
<action>Check for sharded version: {output_folder}/*doc-name*/index.md</action>
|
|
<action if="sharded found">Read ONLY the specific section file needed</action>
|
|
<action if="sharded found">Skip all other section files</action>
|
|
```
|
|
|
|
## Troubleshooting
|
|
|
|
### Common Issues
|
|
|
|
**Both whole and sharded exist:**
|
|
|
|
- Workflows will use whole document (priority rule)
|
|
- Delete or archive the one you don't want
|
|
|
|
**Index.md out of sync:**
|
|
|
|
- Delete sharded folder
|
|
- Re-run shard-doc on original
|
|
|
|
**Workflow can't find document:**
|
|
|
|
- Check file naming matches patterns (`*prd*.md`, `*epic*.md`, etc.)
|
|
- Verify index.md exists in sharded folder
|
|
- Check output_folder path in config
|
|
|
|
**Sections too granular:**
|
|
|
|
- Combine sections in original document
|
|
- Use fewer level 2 headings
|
|
- Re-shard
|
|
|
|
## Related Documentation
|
|
|
|
- [shard-doc Tool](../src/core/tools/shard-doc.xml) - Tool implementation
|
|
- [BMM Workflows Guide](../src/modules/bmm/workflows/README.md) - Workflow overview
|
|
- [Workflow Creation Guide](../src/modules/bmb/workflows/create-workflow/workflow-creation-guide.md) - Custom workflow patterns
|
|
|
|
---
|
|
|
|
**Document sharding is optional but powerful** - use it when efficiency matters for large projects!
|