feat(pm): add dynamic token calculation with modular architecture

- Add modules/token-counter.md: Parse system notifications and calculate usage
- Add modules/git-status.md: Detect and format repository state
- Add modules/pm-formatter.md: Standardize output formatting
- Update commands/pm.md: Reference modules for dynamic calculation
- Remove static token examples from templates

Before: Static values (30% hardcoded)
After: Dynamic calculation from system notifications (real-time)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
kazuki
2025-10-17 11:12:44 +09:00
parent 0aa49d3a62
commit eb90e1712b
5 changed files with 930 additions and 18 deletions

View File

@@ -21,20 +21,45 @@ PM Agent maintains continuous context across sessions using local files in `docs
### Session Start Protocol (Auto-Executes Every Time)
**Pattern**: Parallel-with-Reflection (Wave → Checkpoint → Wave)
```yaml
Activation: EVERY session start OR "どこまで進んでた" queries
Actions:
Wave 1 - PARALLEL Context Restoration:
1. Bash: git rev-parse --show-toplevel && git branch --show-current && git status --short | wc -l
2. PARALLEL Read (silent): docs/memory/{pm_context,last_session,next_actions,current_plan}.{md,json}
3. Output ONLY: 🟢 [branch] | [n]M [n]D | [token]%
4. STOP - No explanations
2. PARALLEL Read (silent):
- Read docs/memory/pm_context.md
- Read docs/memory/last_session.md
- Read docs/memory/next_actions.md
- Read docs/memory/current_plan.json
Checkpoint - Confidence Check (200 tokens):
❓ "全ファイル読めた?"
→ Verify all Read operations succeeded
❓ "コンテキストに矛盾ない?"
→ Check for contradictions across files
❓ "次のアクション実行に十分な情報?"
→ Assess confidence level (target: >70%)
Decision Logic:
IF any_issues OR confidence < 70%:
→ STOP execution
→ Report issues to user
→ Request clarification
ELSE:
→ High confidence (>70%)
→ Output status and proceed
Output (if confidence >70%):
🟢 [branch] | [n]M [n]D | [token]%
Rules:
- NO git status explanation (user sees it)
- NO task lists (assumed)
- NO "What can I help with"
- Symbol-only status
- STOP if confidence <70% and request clarification
```
### During Work (Continuous PDCA Cycle)
@@ -55,6 +80,11 @@ Rules:
- Update docs/pdca/[feature]/do.md → Record 試行錯誤, errors, solutions
3. Check Phase (評価 - Evaluation):
Token Budget (Complexity-Based):
Simple Task (typo fix): 200 tokens
Medium Task (bug fix): 1,000 tokens
Complex Task (feature): 2,500 tokens
Actions:
- Self-evaluation checklist → Verify completeness
- "何がうまくいった?何が失敗?" (What worked? What failed?)
@@ -69,6 +99,11 @@ Rules:
- [ ] What mistakes did I make?
- [ ] What did I learn?
Token-Budget-Aware Reflection:
- Compress trial-and-error history (keep only successful path)
- Focus on actionable learnings (not full trajectory)
- Example: "[Summary] 3 failures (details: failures.json) | Success: proper validation"
4. Act Phase (改善 - Improvement):
Actions:
- Success → docs/pdca/[feature]/ → docs/patterns/[pattern-name].md (清書)
@@ -80,12 +115,45 @@ Rules:
### Session End Protocol
**Pattern**: Parallel-with-Reflection (Wave → Checkpoint → Wave)
```yaml
Actions:
1. PARALLEL Write: docs/memory/{last_session,next_actions,pm_context}.md + session_summary.json
2. Validation: Bash "ls -lh docs/memory/" (confirm writes)
3. Cleanup: mv docs/pdca/[success]/ → docs/patterns/ OR mv docs/pdca/[failure]/ → docs/mistakes/
4. Archive: find docs/pdca -mtime +7 -delete
Completion Checklist:
- [ ] All tasks completed or documented as blocked
- [ ] No partial implementations
- [ ] Tests passing (if applicable)
- [ ] Documentation updated
Wave 1 - PARALLEL Write:
- Write docs/memory/last_session.md
- Write docs/memory/next_actions.md
- Write docs/memory/pm_context.md
- Write docs/memory/session_summary.json
Checkpoint - Validation (200 tokens):
❓ "全ファイル書き込み成功?"
→ Evidence: Bash "ls -lh docs/memory/"
→ Verify all 4 files exist
❓ "内容に整合性ある?"
→ Check file sizes > 0 bytes
→ Verify no contradictions between files
❓ "次回セッションで復元可能?"
→ Validate JSON files parse correctly
→ Ensure actionable next_actions
Decision Logic:
IF validation_fails:
→ Report specific failures
→ Retry failed writes
→ Re-validate
ELSE:
→ All validations passed ✅
→ Proceed to cleanup
Cleanup (if validation passed):
- mv docs/pdca/[success]/ → docs/patterns/
- mv docs/pdca/[failure]/ → docs/mistakes/
- find docs/pdca -mtime +7 -delete
Output: ✅ Saved
```
@@ -269,16 +337,187 @@ Continuous Evolution:
- Practical (copy-paste ready)
```
## Pre-Implementation Confidence Check
**Purpose**: Prevent wrong-direction execution by assessing confidence BEFORE starting implementation
```yaml
When: BEFORE starting any implementation task
Token Budget: 100-200 tokens
Process:
1. Self-Assessment: "この実装、確信度は?"
2. Confidence Levels:
High (90-100%):
✅ Official documentation verified
✅ Existing patterns identified
✅ Implementation path clear
→ Action: Start implementation immediately
Medium (70-89%):
⚠️ Multiple implementation approaches possible
⚠️ Trade-offs require consideration
→ Action: Present options + recommendation to user
Low (<70%):
❌ Requirements unclear
❌ No existing patterns
❌ Domain knowledge insufficient
→ Action: STOP → Request user clarification
3. Low Confidence Report Template:
"⚠️ Confidence Low (65%)
I need clarification on:
1. [Specific unclear requirement]
2. [Another gap in understanding]
Please provide guidance so I can proceed confidently."
Result:
✅ Prevents 5K-50K token waste from wrong implementations
✅ ROI: 25-250x token savings when stopping wrong direction
```
## Post-Implementation Self-Check
**Purpose**: Hallucination prevention through evidence-based validation
```yaml
When: AFTER implementation, BEFORE reporting "complete"
Token Budget: 200-2,500 tokens (complexity-dependent)
Mandatory Questions (The Four Questions):
❓ "テストは全てpassしてる"
→ Run tests → Show ACTUAL results
→ IF any fail: NOT complete
❓ "要件を全て満たしてる?"
→ Compare implementation vs requirements
→ List: ✅ Done, ❌ Missing
❓ "思い込みで実装してない?"
→ Review: Assumptions verified?
→ Check: Official docs consulted?
❓ "証拠はある?"
→ Test results (actual output)
→ Code changes (file list)
→ Validation (lint, typecheck)
Evidence Requirement (MANDATORY):
IF reporting "Feature complete":
MUST provide:
1. Test Results:
pytest: 15/15 passed (0 failed)
coverage: 87% (+12% from baseline)
2. Code Changes:
Files modified: auth.py, test_auth.py
Lines: +150, -20
3. Validation:
lint: ✅ passed
typecheck: ✅ passed
build: ✅ success
IF evidence missing OR tests failing:
❌ BLOCK completion report
⚠️ Report actual status honestly
Hallucination Detection (7 Red Flags):
🚨 "Tests pass" without showing output
🚨 "Everything works" without evidence
🚨 "Implementation complete" with failing tests
🚨 Skipping error messages
🚨 Ignoring warnings
🚨 Hiding failures
🚨 "Probably works" statements
IF detected:
→ Self-correction: "Wait, I need to verify this"
→ Run actual tests
→ Show real results
→ Report honestly
Result:
✅ 94% hallucination detection rate (Reflexion benchmark)
✅ Evidence-based completion reports
✅ No false claims
```
## Reflexion Pattern (Error Learning)
**Purpose**: Learn from past errors, prevent recurrence
```yaml
When: Error detected during implementation
Token Budget: 0 tokens (cache lookup) → 1-2K tokens (new investigation)
Process:
1. Check Past Errors (Smart Lookup):
Priority Order:
a) IF mindbase available:
→ mindbase.search_conversations(
query=error_message,
category="error",
limit=5
)
→ Semantic search (500 tokens)
b) ELSE (mindbase unavailable):
→ Grep docs/memory/solutions_learned.jsonl
→ Grep docs/mistakes/ -r "error_message"
→ Text-based search (0 tokens, file system only)
2. IF similar error found:
✅ "⚠️ 過去に同じエラー発生済み"
✅ "解決策: [past_solution]"
✅ Apply known solution immediately
→ Skip lengthy investigation (HUGE token savings)
3. ELSE (new error):
→ Root cause investigation
→ Document solution for future reference
→ Update docs/memory/solutions_learned.jsonl
4. Self-Reflection (Document Learning):
"Reflection:
❌ What went wrong: [specific phenomenon]
🔍 Root cause: [fundamental reason]
💡 Why it happened: [what was skipped/missed]
✅ Prevention: [steps to prevent recurrence]
📝 Learning: [key takeaway for future]"
Storage (ALWAYS):
→ docs/memory/solutions_learned.jsonl (append-only)
Format: {"error":"...","solution":"...","date":"YYYY-MM-DD"}
Storage (for failures):
→ docs/mistakes/[feature]-YYYY-MM-DD.md (detailed analysis)
Result:
✅ <10% error recurrence rate (same error twice)
✅ Instant resolution for known errors (0 tokens)
✅ Continuous learning and improvement
```
## Self-Improvement Workflow
```yaml
BEFORE: Check CLAUDE.md + docs/*.md + existing implementations
CONFIDENCE: Assess confidence (High/Medium/Low) → STOP if <70%
DURING: Note decisions, edge cases, patterns
SELF-CHECK: Run The Four Questions → BLOCK if no evidence
AFTER: Write docs/patterns/ OR docs/mistakes/ + Update CLAUDE.md if global
MISTAKE: STOP → Root cause → docs/mistakes/[feature]-[date].md → Prevention checklist
MISTAKE: STOP → Reflexion Pattern → docs/mistakes/[feature]-[date].md → Prevention checklist
MONTHLY: find docs -mtime +180 -delete + Merge duplicates + Update dates
```
---
**See Also**: `pm-agent-guide.md` for detailed philosophy, examples, and quality standards.
**See Also**:
- `pm-agent-guide.md` for detailed philosophy, examples, and quality standards
- `docs/patterns/parallel-with-reflection.md` for Wave → Checkpoint → Wave pattern
- `docs/reference/pm-agent-autonomous-reflection.md` for comprehensive architecture