mirror of
https://github.com/bmadcode/BMAD-METHOD.git
synced 2025-12-29 16:14:59 +00:00
Add Text-to-Speech Integration via TTS_INJECTION System (#934)
* feat: Add provider-agnostic TTS integration via injection point system
Implements comprehensive Text-to-Speech integration for BMAD agents using a generic
TTS_INJECTION marker system. When AgentVibes (or any compatible TTS provider) is
installed, all BMAD agents can speak their responses with unique AI voices.
## Key Features
**Provider-Agnostic Architecture**
- Uses generic `TTS_INJECTION` markers instead of vendor-specific naming
- Future-proof for multiple TTS providers beyond AgentVibes
- Clean separation - BMAD stays TTS-agnostic, providers handle injection
**Installation Flow**
- BMAD → AgentVibes: TTS instructions injected when AgentVibes detects existing BMAD installation
- AgentVibes → BMAD: TTS instructions injected during BMAD installation when AgentVibes detected
- User must manually create voice assignment file when AgentVibes installs first (documented limitation)
**Party Mode Voice Support**
- Each agent speaks with unique assigned voice in multi-agent discussions
- PM, Architect, Developer, Analyst, UX Designer, etc. - all with distinct voices
**Zero Breaking Changes**
- Fully backward compatible - works without any TTS provider
- `TTS_INJECTION` markers are benign HTML comments if not processed
- No changes to existing agent behavior or non-TTS workflows
## Implementation Details
**Files Modified:**
- `tools/cli/installers/lib/core/installer.js` - TTS injection processing logic
- `tools/cli/lib/ui.js` - AgentVibes detection and installation prompts
- `tools/cli/commands/install.js` - Post-install guidance for AgentVibes setup
- `src/utility/models/fragments/activation-rules.xml` - TTS_INJECTION marker for agents
- `src/core/workflows/party-mode/instructions.md` - TTS_INJECTION marker for party mode
**Injection Point System:**
```xml
<rules>
- ALWAYS communicate in {communication_language}
<!-- TTS_INJECTION:agent-tts -->
- Stay in character until exit selected
</rules>
```
When AgentVibes is detected, the installer replaces this marker with:
```
- When responding to user messages, speak your responses using TTS:
Call: `.claude/hooks/bmad-speak.sh '{agent-id}' '{response-text}'` after each response
IMPORTANT: Use single quotes - do NOT escape special characters like ! or $
```
**Special Character Handling:**
- Explicit guidance to use single quotes without escaping
- Prevents "backslash exclamation" artifacts in speech
**User Experience:**
```
User: "How should we architect this feature?"
Architect: [Text response] + 🔊 [Professional voice explains architecture]
```
Party Mode:
```
PM (John): "I'll focus on user value..." 🔊 [Male professional voice]
UX Designer (Sara): "From a user perspective..." 🔊 [Female voice]
Architect (Marcus): "The technical approach..." 🔊 [Male technical voice]
```
## Testing
**Unit Tests:** ✅ 62/62 passing
- 49/49 schema validation tests
- 13/13 installation component tests
**Integration Testing:**
- ✅ BMAD → AgentVibes (automatic injection)
- ✅ AgentVibes → BMAD (automatic injection)
- ✅ No TTS provider (markers remain as comments)
## Documentation
Comprehensive testing guide created with:
- Both installation scenario walkthroughs
- Verification commands and expected outputs
- Troubleshooting guidance
## Known Limitations
**AgentVibes → BMAD Installation Order:**
When AgentVibes installs first, voice assignment file must be created manually:
```bash
mkdir -p .bmad/_cfg
cat > .bmad/_cfg/agent-voice-map.csv << 'EOF'
agent_id,voice_name
pm,en_US-ryan-high
architect,en_US-danny-low
dev,en_US-joe-medium
EOF
```
This limitation exists to prevent false legacy v4 detection warnings from BMAD installer.
**Recommended:** Install BMAD first, then AgentVibes for automatic voice assignment.
## Related Work
**Companion Implementation:**
- Repository: paulpreibisch/AgentVibes
- Commits: 6 commits implementing injection processing and voice routing
- Features: Retroactive injection, file path extraction, escape stripping
**GitHub Issues:**
- paulpreibisch/AgentVibes#36 - BMAD agent ID support
## Breaking Changes
None. Feature is opt-in and requires separate TTS provider installation.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: Enforce project hooks over global hooks in party mode
before, claude would sometimes favor global agent vibes hooks over project specific
* feat: Automate AgentVibes installer invocation after BMAD install
Instead of showing manual installation instructions, the installer now:
- Prompts "Press Enter to start AgentVibes installer..."
- Automatically runs npx agentvibes@latest install
- Handles errors gracefully with fallback instructions
This provides a seamless installation flow matching the test script's
interactive approach.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* docs: Add automated testing script and guide for PR #934
Added comprehensive testing tools for AgentVibes party mode integration:
- test-bmad-pr.sh: Fully automated installation and verification script
- Interactive mode selection (official PR or custom fork)
- Automatic BMAD CLI setup and linking
- AgentVibes installation with guided prompts
- Built-in verification checks for voice maps and hooks
- Saved configuration for quick re-testing
- TESTING.md: Complete testing documentation
- Quick start with one-line npx command
- Manual installation alternative
- Troubleshooting guide
- Cleanup instructions
Testers can now run a single command to test the full AgentVibes integration
without needing to understand the complex setup process.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: Add shell: true to npx execSync to prevent permission denied error
The execSync call for 'npx agentvibes@latest install' was failing with
'Permission denied' because the shell was trying to execute 'agentvibes@latest'
directly instead of passing it as an argument to npx.
Adding shell: true ensures the command runs in a proper shell context
where npx can correctly interpret the @latest version syntax.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: Remove duplicate AgentVibes installation step from test script
The test script was calling AgentVibes installer twice:
1. BMAD installer now automatically runs AgentVibes (new feature)
2. Test script had a separate Step 6 that also ran AgentVibes
This caused the installer to run twice, with the second call failing
because it was already installed.
Changes:
- Removed redundant Step 6 (AgentVibes installation)
- Updated Step 5 to indicate it includes AgentVibes
- Updated step numbers from 7 to 6 throughout
- Added guidance that AgentVibes runs automatically
Now the flow is cleaner: BMAD installer handles everything!
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: Address bmadcode review - preserve variables and move TTS logic to injection
Fixes requested changes from PR review:
1. Preserve {bmad_folder} variable placeholder
- Changed: {project_root}/.bmad/core/tasks/workflow.xml
- To: {project_root}/{bmad_folder}/core/tasks/workflow.xml
- Allows users to choose custom BMAD folder names during installation
2. Move TTS-specific hook guidance to injection system
- Removed hardcoded hook enforcement from source files
- Added hook guidance to processTTSInjectionPoints() in installer.js
- Now only appears when AgentVibes is installed (via TTS_INJECTION)
3. Maintain TTS-agnostic source architecture
- Source files remain clean of TTS-specific instructions
- TTS details injected at install-time only when needed
- Preserves provider-agnostic design principle
Changes made:
- src/core/workflows/party-mode/instructions.md
- Reverted .bmad to {bmad_folder} variable
- Replaced hardcoded hook guidance with <!-- TTS_INJECTION:party-mode -->
- Removed <note> about play-tts.sh hook location
- tools/cli/installers/lib/core/installer.js
- Added hook enforcement to party-mode injection replacement
- Guidance now injected only when enableAgentVibes is true
Addresses review comments from bmadcode:
- "needs to remain the variable. it will get set in the file at the install destination."
- "items like this we will need to inject if user is using claude and TTS"
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: Change 'claude-code' to 'claude' in test script instructions
The correct command to start Claude is 'claude', not 'claude-code'.
Updated line 362-363 in test-bmad-pr.sh to show the correct command.
* fix: Remove npm link from test script to avoid global namespace pollution
- Removed 'npm link' command that was installing BMAD globally
- Changed 'bmad install' to direct node execution using local clone
- Updated success message to reflect no global installation
This keeps testing fully isolated and prevents conflicts with:
- Existing BMAD installations
- Future official BMAD installs
- Orphaned symlinks when test directory is deleted
The test script now runs completely self-contained without modifying
the user's global npm environment.
---------
Co-authored-by: Claude Code <claude@anthropic.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Paul Preibisch <paul@paulpreibisch.com>
Co-authored-by: Brian <bmadcode@gmail.com>
This commit is contained in:
@@ -2,6 +2,7 @@
|
||||
|
||||
<critical>The workflow execution engine is governed by: {project_root}/{bmad_folder}/core/tasks/workflow.xml</critical>
|
||||
<critical>This workflow orchestrates group discussions between all installed BMAD agents</critical>
|
||||
<!-- TTS_INJECTION:party-mode -->
|
||||
|
||||
<workflow>
|
||||
|
||||
@@ -94,17 +95,36 @@
|
||||
</substep>
|
||||
|
||||
<substep n="3d" goal="Format and Present Responses">
|
||||
<action>Present each agent's contribution clearly:</action>
|
||||
<action>For each agent response, output text THEN trigger their voice:</action>
|
||||
|
||||
<procedure>
|
||||
1. Output the agent's text in format: [Icon Emoji] [Agent Name]: [dialogue]
|
||||
2. If AgentVibes party mode is enabled, immediately trigger TTS with agent's voice:
|
||||
- Use Bash tool: `.claude/hooks/bmad-speak.sh "[Agent Name]" "[dialogue]"`
|
||||
- This speaks the dialogue with the agent's unique voice
|
||||
- Run in background (&) to not block next agent
|
||||
3. Repeat for each agent in the response
|
||||
</procedure>
|
||||
|
||||
<format>
|
||||
[Agent Name]: [Their response in their voice/style]
|
||||
[Icon Emoji] [Agent Name]: [Their response in their voice/style]
|
||||
|
||||
[Another Agent]: [Their response, potentially referencing the first]
|
||||
[Icon Emoji] [Another Agent]: [Their response, potentially referencing the first]
|
||||
|
||||
[Third Agent if selected]: [Their contribution]
|
||||
[Icon Emoji] [Third Agent if selected]: [Their contribution]
|
||||
</format>
|
||||
<example>
|
||||
🏗️ [Winston]: I recommend using microservices for better scalability.
|
||||
[Bash: .claude/hooks/bmad-speak.sh "Winston" "I recommend using microservices for better scalability."]
|
||||
|
||||
📋 [John]: But a monolith would get us to market faster for MVP.
|
||||
[Bash: .claude/hooks/bmad-speak.sh "John" "But a monolith would get us to market faster for MVP."]
|
||||
</example>
|
||||
|
||||
<action>Maintain spacing between agents for readability</action>
|
||||
<action>Preserve each agent's unique voice throughout</action>
|
||||
<action>Always include the agent's icon emoji from the manifest before their name</action>
|
||||
<action>Trigger TTS for each agent immediately after outputting their text</action>
|
||||
|
||||
</substep>
|
||||
|
||||
|
||||
@@ -1,5 +1,6 @@
|
||||
<rules>
|
||||
- ALWAYS communicate in {communication_language} UNLESS contradicted by communication_style
|
||||
<!-- TTS_INJECTION:agent-tts -->
|
||||
- Stay in character until exit selected
|
||||
- Menu triggers use asterisk (*) - NOT markdown, display exactly as shown
|
||||
- Number all lists, use letters for sub-options
|
||||
|
||||
Reference in New Issue
Block a user