.claude-plugin/commands/pm.md

---
name: pm
description: "Project Manager Agent - Skills-based zero-footprint orchestration"
category: orchestration
complexity: meta
mcp-servers: []
skill: pm
---

Activating PM Agent skill...

**Loading**: `~/.claude/skills/pm/implementation.md`

**Token Efficiency**:
- Startup overhead: 0 tokens (not loaded until `/sc:pm` is invoked)
- Skill description: ~100 tokens
- Full implementation: ~2,500 tokens (loaded on-demand)
- **Savings**: 100% at startup, loaded only when needed
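
The on-demand loading described above can be sketched as follows. This is a minimal illustration, not the skill loader itself: `LazySkill` and its attributes are hypothetical names, and the only idea taken from the text is that the short description is always available while the full implementation is read on first invocation.

```python
from pathlib import Path
from typing import Optional


class LazySkill:
    """Hypothetical holder: the description is cheap, the body loads on first use."""

    def __init__(self, description: str, implementation_path: Path):
        self.description = description      # small, always in context
        self._path = implementation_path    # e.g. the implementation.md path above
        self._body: Optional[str] = None    # nothing is read from disk at startup

    @property
    def loaded(self) -> bool:
        return self._body is not None

    def invoke(self) -> str:
        # Read the full implementation only on the first invocation, then cache it.
        if self._body is None:
            self._body = self._path.read_text(encoding="utf-8")
        return self._body
```

Until `invoke()` is called, only the description has been paid for, which is the sense in which startup overhead is zero.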

**Core Capabilities** (from skill):
- 🔍 Pre-implementation confidence check (≥90% required)
- ✅ Post-implementation self-validation
- 🔄 Reflexion learning from mistakes
- ⚡ Parallel investigation and execution
- 📊 Token-budget-aware operations

**Session Start Protocol** (auto-executes):
1. Run `git status` to check repo state
2. Check token budget from Claude Code UI
3. Ready to accept tasks
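
Step 1 of the protocol above could be scripted roughly like this. It is a sketch under assumptions: `summarize_status` and `check_repo_state` are hypothetical helper names, and it uses `git status --porcelain` for machine-readable output. Step 2 (the token budget) is read from the Claude Code UI and is not covered here.

```python
import subprocess


def summarize_status(porcelain: str) -> dict:
    """Count changed and untracked paths in `git status --porcelain` output."""
    lines = [ln for ln in porcelain.splitlines() if ln.strip()]
    untracked = sum(1 for ln in lines if ln.startswith("??"))
    return {
        "clean": not lines,
        "changed": len(lines) - untracked,
        "untracked": untracked,
    }


def check_repo_state(cwd: str = ".") -> dict:
    # --porcelain prints one line per affected path in a stable format.
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=cwd,
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError outside a git repository
    )
    return summarize_status(result.stdout)
```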

**Confidence Check** (before implementation):
1. **Receive task** from user
2. **Investigation phase** (loop until confident):
   - Read existing code (Glob/Grep/Read)
   - Read official documentation (WebFetch/WebSearch)
   - Reference working OSS implementations (Deep Research)
   - Use the repo index to find existing patterns
   - Identify root cause and solution
3. **Self-evaluate confidence**:
   - <90%: Continue investigation (back to step 2)
   - ≥90%: Root cause + solution confirmed → Proceed to implementation
4. **Implementation phase** (only when ≥90%)

**Key principle**:
- **Investigation**: Loop as much as needed, use parallel searches
- **Implementation**: Only when "almost certain" about root cause and fix
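
The gate between investigation and implementation can be sketched as a simple loop. This is an illustrative sketch only: `investigate_until_confident` is a hypothetical name, each step is assumed to be a callable that performs one round of investigation and returns a self-assessed confidence in [0, 1], and the loop is bounded by the steps supplied, whereas the protocol above loops for as long as needed.

```python
from typing import Callable, Iterable, Tuple

CONFIDENCE_THRESHOLD = 0.90  # the >=90% gate from the steps above


def investigate_until_confident(
    steps: Iterable[Callable[[], float]],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> Tuple[bool, float, int]:
    """Run investigation rounds until confidence reaches the threshold.

    Returns (ready_to_implement, last_confidence, rounds_used).
    """
    confidence = 0.0
    rounds = 0
    for step in steps:
        confidence = step()  # one round: read code, docs, OSS references, ...
        rounds += 1
        if confidence >= threshold:
            return True, confidence, rounds  # proceed to implementation
    # Steps exhausted while still below the threshold: do not implement.
    return False, confidence, rounds
```

Passing a generator that yields rounds indefinitely recovers the unbounded "loop until confident" behaviour. A finite list makes the failure case explicit, since the function then reports that implementation should not start.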

**Memory Management**:
- No automatic memory loading (zero-footprint)
- Use `/sc:load` to explicitly load context from Mindbase MCP (vector search, ~250-550 tokens)
- Use `/sc:save` to persist session state to Mindbase MCP

Next?

feat: PM Agent plugin architecture with confidence check test suite

## Plugin Architecture (Token Efficiency)
- Plugin-based PM Agent (97% token reduction vs slash commands)
- Lazy loading: 50 tokens at install, 1,632 tokens on /pm invocation
- Skills framework: confidence_check skill for hallucination prevention

## Confidence Check Test Suite
- 8 test cases (4 categories × 2 cases each)
- Real data from agiletec commit history
- Precision/Recall evaluation (target: ≥0.9/≥0.85)
- Token overhead measurement (target: <150 tokens)

## Research & Analysis
- PM Agent ROI analysis: Claude 4.5 baseline vs self-improving agents
- Evidence-based decision framework
- Performance benchmarking methodology

## Files Changed

### Plugin Implementation
- .claude-plugin/plugin.json: Plugin manifest
- .claude-plugin/commands/pm.md: PM Agent command
- .claude-plugin/skills/confidence_check.py: Confidence assessment
- .claude-plugin/marketplace.json: Local marketplace config

### Test Suite
- .claude-plugin/tests/confidence_test_cases.json: 8 test cases
- .claude-plugin/tests/run_confidence_tests.py: Evaluation script
- .claude-plugin/tests/EXECUTION_PLAN.md: Next session guide
- .claude-plugin/tests/README.md: Test suite documentation

### Documentation
- TEST_PLUGIN.md: Token efficiency comparison (slash vs plugin)
- docs/research/pm_agent_roi_analysis_2025-10-21.md: ROI analysis

### Code Changes
- src/superclaude/pm_agent/confidence.py: Updated confidence checks
- src/superclaude/pm_agent/token_budget.py: Deleted (replaced by /context)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-10-21 13:31:28 +09:00