Repository-Scoped Memory Management for AI Coding Assistants
Research Report | 2025-10-16
Executive Summary
This research investigates best practices for implementing repository-scoped memory management in AI coding assistants, with specific focus on SuperClaude PM Agent integration. Key findings indicate that local file storage with git repository detection is the industry standard for session isolation, offering optimal performance and developer experience.
Key Recommendations for SuperClaude
- ✅ Adopt Local File Storage: Store memory in repository-specific directories (.superclaude/memory/ or docs/memory/)
- ✅ Use Git Detection: Implement git rev-parse --git-dir for repository boundary detection
- ✅ Prioritize Simplicity: Start with file-based approach before considering databases
- ✅ Maintain Backward Compatibility: Support future cross-repository intelligence as optional feature
1. Industry Best Practices
1.1 Cursor IDE Memory Architecture
Implementation Pattern:
project-root/
├── .cursor/
│   └── rules/                   # Project-specific configuration
├── .git/                        # Repository boundary marker
└── memory-bank/                 # Session context storage
    ├── project_context.md
    ├── progress_history.md
    └── architectural_decisions.md
Key Insights:
- Repository-level isolation using .cursor/rules directory
- Memory Bank pattern: structured knowledge repository for cross-session context
- MCP integration (Graphiti) for sophisticated memory management across sessions
- Problem: Users report context loss mid-task and excessive "start new chat" prompts
Relevance to SuperClaude: Validates local directory approach with repository-scoped configuration.
1.2 GitHub Copilot Workspace Context
Implementation Pattern:
- Remote code search indexes for GitHub/Azure DevOps repositories
- Local indexes for non-cloud repositories (limit: 2,500 files)
- Respects .gitignore for index exclusion
- Workspace-level context with repository-specific boundaries
Key Insights:
- Automatic index building for GitHub-backed repos
- .gitignore integration prevents sensitive data indexing
- Repository authorization through GitHub App permissions
- Limitation: Context scope is workspace-wide, not repository-specific by default
Relevance to SuperClaude: .gitignore integration is critical for security and performance.
1.3 Session Isolation Best Practices
Git Worktrees for Parallel Sessions:
# Enable multiple isolated Claude sessions
git worktree add ../feature-branch feature-branch
# Each worktree has independent working directory, shared git history
Context Window Management:
- Long sessions lead to context pollution → performance degradation
- Best Practice: Use /clear command between tasks
- Create session-end context files (GEMINI.md, CONTEXT.md) for handoff
- Break tasks into smaller, isolated chunks
Enterprise Security Architecture (4-Layer Defense):
- Prevention: Rate-limit access, auto-strip credentials
- Protection: Encryption, project-level role-based access control
- Detection: SAST/DAST/SCA on pull requests
- Response: Detailed commit-prompt mapping
Relevance to SuperClaude: PM Agent should implement context reset between repository changes.
2. Git Repository Detection Patterns
2.1 Standard Detection Methods
Recommended Approach:
# Detect if current directory is in git repository
git rev-parse --git-dir
# Check if inside working tree
git rev-parse --is-inside-work-tree
# Get repository root
git rev-parse --show-toplevel
Implementation Considerations:
- Git searches parent directories for .git folder automatically
- libgit2 library recommended for programmatic access
- Avoid direct .git folder parsing (fragile to git internals changes)
2.2 Security Concerns
- Issue: Millions of .git folders exposed publicly by misconfiguration
- Mitigation: Always respect .gitignore and add .superclaude/ to ignore patterns
- Best Practice: Store sensitive memory data in gitignored directories
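The mitigation above (adding .superclaude/ to the ignore patterns) could be automated along these lines. This is a minimal sketch, not part of the original research: the function name ensure_ignored and its default pattern are assumptions, and it only compares whole lines rather than evaluating gitignore glob semantics.

```python
from pathlib import Path


def ensure_ignored(repo_root: Path, pattern: str = ".superclaude/") -> bool:
    """Append `pattern` to <repo_root>/.gitignore unless it is already listed.

    Hypothetical helper. Returns True if the file was modified,
    False if the pattern was already present.
    """
    gitignore = repo_root / ".gitignore"
    text = gitignore.read_text(encoding="utf-8") if gitignore.exists() else ""
    # Whole-line comparison only; gitignore glob rules are not evaluated here.
    if pattern in (line.strip() for line in text.splitlines()):
        return False
    # Make sure the new entry starts on its own line.
    if text and not text.endswith("\n"):
        text += "\n"
    gitignore.write_text(text + pattern + "\n", encoding="utf-8")
    return True
```

Calling it twice is safe: the second call finds the entry and leaves the file untouched.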
3. Storage Architecture Comparison
3.1 Local File Storage
Advantages:
- ✅ Performance: Typically faster than a database for simple sequential reads (no connection or query overhead)
- ✅ Simplicity: No database setup or maintenance
- ✅ Portability: Works offline, no network dependencies
- ✅ Developer-Friendly: Files are readable/editable by humans
- ✅ Git Integration: Can be versioned (if desired) or gitignored
Disadvantages:
- ❌ No ACID transactions
- ❌ Limited query capabilities
- ❌ Manual concurrency handling
Use Cases:
- Perfect for: Session context, architectural decisions, project documentation
- Not ideal for: High-concurrency writes, complex queries
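The lack of transactions and the manual concurrency handling listed above can be partly mitigated with a write-then-rename pattern. The sketch below is an illustration under assumptions: atomic_write_json is a hypothetical name, and the approach only protects readers from half-written files. It does not coordinate multiple writers (the last writer still wins).

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_json(path: Path, data: dict) -> None:
    """Write JSON so that readers never observe a partially written file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the target directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(data, tmp, indent=2)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_name, path)  # atomic rename on POSIX
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

For high-concurrency writes, which this section already flags as a poor fit for files, a database remains the more appropriate choice.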
3.2 Database Storage
Advantages:
- ✅ ACID transactions
- ✅ Complex queries (SQL)
- ✅ Concurrency management
- ✅ Scalability for cross-repository intelligence (future)
Disadvantages:
- ❌ Performance: Slower than local files for simple reads
- ❌ Complexity: Database setup and maintenance overhead
- ❌ Network Bottlenecks: If using remote database
- ❌ Developer UX: Requires database tools to inspect
Use Cases:
- Future feature: Cross-repository pattern mining
- Not needed for: Basic repository-scoped memory
3.3 Vector Databases (Advanced)
Recommendation: Not needed for v1
Future Consideration:
- Semantic search across project history
- Pattern recognition across repositories
- Requires significant infrastructure investment
- Wait until: SuperClaude reaches "super-intelligence" level
4. SuperClaude PM Agent Recommendations
4.1 Immediate Implementation (v1)
Architecture:
project-root/
├── .git/                        # Repository boundary
├── .gitignore
│   └── .superclaude/            # Add to gitignore
├── .superclaude/
│   └── memory/
│       ├── session_state.json   # Current session context
│       ├── pm_context.json      # PM Agent PDCA state
│       └── decisions/           # Architectural decision records
│           ├── 2025-10-16_auth.md
│           └── 2025-10-15_db.md
└── docs/
    └── superclaude/             # Human-readable documentation
        ├── patterns/            # Successful patterns
        └── mistakes/            # Error prevention
Detection Logic:
import subprocess
from pathlib import Path


def get_repository_root() -> Path | None:
    """Detect git repository root using git rev-parse."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "--show-toplevel"],
            capture_output=True,
            text=True,
            timeout=5,
        )
        if result.returncode == 0:
            return Path(result.stdout.strip())
    except (subprocess.TimeoutExpired, FileNotFoundError):
        # git is not installed or did not respond in time
        pass
    return None


def get_memory_dir() -> Path:
    """Get repository-scoped memory directory, creating it if needed."""
    repo_root = get_repository_root()
    if repo_root:
        memory_dir = repo_root / ".superclaude" / "memory"
    else:
        # Fallback to global memory if not in git repo
        memory_dir = Path.home() / ".superclaude" / "memory" / "global"
    memory_dir.mkdir(parents=True, exist_ok=True)
    return memory_dir
Session Lifecycle Integration:
import json


# Session Start
def restore_session_context() -> dict:
    repo_root = get_repository_root()
    if not repo_root:
        return {}  # No repository context
    memory_file = repo_root / ".superclaude" / "memory" / "pm_context.json"
    if memory_file.exists():
        return json.loads(memory_file.read_text())
    return {}


# Session End
def save_session_context(context: dict) -> None:
    repo_root = get_repository_root()
    if not repo_root:
        return  # Don't save if not in repository
    memory_file = repo_root / ".superclaude" / "memory" / "pm_context.json"
    memory_file.parent.mkdir(parents=True, exist_ok=True)
    memory_file.write_text(json.dumps(context, indent=2))
4.2 PM Agent Memory Management
PDCA Cycle Integration:
# Illustrative outline: write_memory, move_to_patterns and move_to_mistakes
# are helpers not defined in this report; `...` marks elided values.

# Plan Phase
write_memory(repo_root / ".superclaude/memory/plan.json", {
    "hypothesis": "...",
    "success_criteria": "...",
    "risks": [...]
})

# Do Phase
write_memory(repo_root / ".superclaude/memory/experiment.json", {
    "trials": [...],
    "errors": [...],
    "solutions": [...]
})

# Check Phase
write_memory(repo_root / ".superclaude/memory/evaluation.json", {
    "outcomes": {...},
    "adherence_check": "...",
    "completion_status": "..."
})

# Act Phase
if success:
    move_to_patterns(repo_root / "docs/superclaude/patterns/pattern-name.md")
else:
    move_to_mistakes(repo_root / "docs/superclaude/mistakes/mistake-YYYY-MM-DD.md")
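The PDCA outline above calls write_memory without defining it. One plausible shape for that helper is sketched below, purely as an assumption: the report does not specify its behavior, and both the updated_at field and the read_memory counterpart are hypothetical additions for illustration.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def write_memory(path: Path, payload: dict) -> None:
    """Hypothetical helper: persist one PDCA phase record as JSON."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # `updated_at` is an assumed field, not something the report specifies.
    record = {"updated_at": datetime.now(timezone.utc).isoformat(), **payload}
    path.write_text(json.dumps(record, indent=2, ensure_ascii=False), encoding="utf-8")


def read_memory(path: Path) -> dict:
    """Hypothetical counterpart: return the stored record, or {} if absent."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))
```

Keeping one file per phase (plan.json, experiment.json, evaluation.json) means each phase can be inspected or reset independently with an ordinary text editor.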
4.3 Context Isolation Strategy
Problem: User switches from SuperClaude_Framework to airis-mcp-gateway
Current Behavior: PM Agent retains SuperClaude context → Noise
Desired Behavior: PM Agent detects repository change → Clears context → Loads airis-mcp-gateway context
Implementation:
class RepositoryContextManager:
    def __init__(self):
        self.current_repo = None
        self.context = {}

    def check_repository_change(self):
        """Detect if repository changed since last invocation."""
        new_repo = get_repository_root()
        if new_repo != self.current_repo:
            # Repository changed - clear context
            if self.current_repo:
                self.save_context(self.current_repo)
            self.current_repo = new_repo
            self.context = self.load_context(new_repo) if new_repo else {}
            return True  # Context cleared
        return False  # Same repository

    def load_context(self, repo_root: Path) -> dict:
        """Load repository-specific context."""
        memory_file = repo_root / ".superclaude" / "memory" / "pm_context.json"
        if memory_file.exists():
            return json.loads(memory_file.read_text())
        return {}

    def save_context(self, repo_root: Path):
        """Save current context to repository."""
        if not repo_root:
            return
        memory_file = repo_root / ".superclaude" / "memory" / "pm_context.json"
        memory_file.parent.mkdir(parents=True, exist_ok=True)
        memory_file.write_text(json.dumps(self.context, indent=2))
Usage in PM Agent:
# Session Start Protocol
context_mgr = RepositoryContextManager()
if context_mgr.check_repository_change():
    print(f"📍 Repository: {context_mgr.current_repo.name}")
    print(f"Previous: {context_mgr.context.get('last_session', 'No previous session')}")
    print(f"Progress: {context_mgr.context.get('progress', 'Starting fresh')}")
4.4 .gitignore Integration
Add to .gitignore:
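# SuperClaude Memory (session-specific, not for version control)
# Ignore the directory's contents rather than the directory itself:
# git cannot re-include a path whose parent directory is excluded.
.superclaude/memory/*
# Keep architectural decisions (optional - can be versioned)
# !.superclaude/memory/decisions/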
Rationale:
- Session state changes frequently → should not be committed
- Architectural decisions MAY be versioned (team decision)
- Prevents accidental secret exposure in memory files
5. Future Enhancements (v2+)
5.1 Cross-Repository Intelligence
When to implement: After PM Agent demonstrates reliable single-repository context
Architecture:
~/.superclaude/
└── global_memory/
    ├── patterns/                # Cross-repo patterns
    │   ├── authentication.json
    │   └── testing.json
    └── repo_index/              # Repository metadata
        ├── SuperClaude_Framework.json
        └── airis-mcp-gateway.json
Smart Context Selection:
def get_relevant_context(current_repo: str) -> dict:
    """Select context based on current repository."""
    # Local context (high priority)
    local = load_local_context(current_repo)

    # Global patterns (low priority, filtered by relevance)
    global_patterns = load_global_patterns()
    relevant = filter_by_similarity(global_patterns, local.get('tech_stack'))

    return merge_contexts(local, relevant, priority="local")
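The report leaves filter_by_similarity undefined. As one possible interpretation, the sketch below scores each global pattern by the Jaccard overlap between its tech stack and the local one. The scoring method, the threshold value, and the assumption that each pattern is a dict with a tech_stack list are all hypothetical choices made for illustration.

```python
from __future__ import annotations


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |a ∩ b| / |a ∪ b|; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def filter_by_similarity(patterns: list[dict], tech_stack: list[str] | None,
                         threshold: float = 0.3) -> list[dict]:
    """Keep global patterns whose tech stack overlaps the local one enough.

    Hypothetical implementation: assumes each pattern has a "tech_stack" list.
    """
    local = {t.lower() for t in (tech_stack or [])}
    scored = [
        (jaccard(local, {t.lower() for t in p.get("tech_stack", [])}), p)
        for p in patterns
    ]
    # Highest-overlap patterns first; drop anything below the threshold.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for score, p in scored if score >= threshold]
```

A set-overlap score is deliberately crude. It keeps v2 free of embedding infrastructure, in line with the report's position that vector search is only justified much later.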
5.2 Vector Database Integration
When to implement: If SuperClaude requires semantic search across 100+ repositories
Use Case:
- "Find all authentication implementations across my projects"
- "What error handling patterns have I used successfully?"
Technology: pgvector, Qdrant, or Pinecone
Cost-Benefit: High complexity, only justified for "super-intelligence" tier features
6. Implementation Roadmap
Phase 1: Repository-Scoped File Storage (Immediate)
Timeline: 1-2 weeks
Effort: Low
- Implement get_repository_root() detection
- Create .superclaude/memory/ directory structure
- Integrate with PM Agent session lifecycle
- Add .superclaude/memory/ to .gitignore
- Test repository change detection
Success Criteria:
- ✅ PM Agent context isolated per repository
- ✅ No noise from other projects
- ✅ Session resumes correctly within same repository
Phase 2: PDCA Memory Integration (Short-term)
Timeline: 2-3 weeks
Effort: Medium
- Integrate Plan/Do/Check/Act with file storage
- Implement docs/superclaude/patterns/ and docs/superclaude/mistakes/
- Create ADR (Architectural Decision Records) format
- Add 7-day cleanup for docs/temp/
Success Criteria:
- ✅ Successful patterns documented automatically
- ✅ Mistakes recorded with prevention checklists
- ✅ Knowledge accumulates within repository
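The 7-day cleanup for docs/temp/ listed in Phase 2 could look roughly like the following. This is a sketch under assumptions: cleanup_temp is a hypothetical name, file age is measured from the last modification time, and only regular files are removed (empty directories are left in place).

```python
from __future__ import annotations

import time
from pathlib import Path

SEVEN_DAYS = 7 * 24 * 60 * 60  # retention window in seconds


def cleanup_temp(temp_dir: Path, max_age_seconds: float = SEVEN_DAYS,
                 now: float | None = None) -> list[Path]:
    """Delete regular files under temp_dir older than max_age_seconds.

    Returns the list of removed paths. `now` can be injected for testing.
    """
    if not temp_dir.is_dir():
        return []
    now = time.time() if now is None else now
    removed = []
    # sorted() materializes the listing before any file is deleted.
    for path in sorted(temp_dir.rglob("*")):
        # Age is measured from the last modification time (an assumption).
        if path.is_file() and now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(path)
    return removed
```

Running it at session start, for example as cleanup_temp(repo_root / "docs" / "temp"), would keep the cleanup tied to the PM Agent lifecycle without needing a scheduler.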
Phase 3: Cross-Repository Patterns (Future)
Timeline: 3-6 months
Effort: High
- Implement global pattern database
- Smart context filtering by tech stack
- Pattern similarity scoring
- Opt-in cross-repo intelligence
Success Criteria:
- ✅ PM Agent learns from past projects
- ✅ Suggests relevant patterns from other repos
- ✅ No performance degradation
7. Comparison Matrix
| Feature | Local Files | Database | Vector DB |
|---|---|---|---|
| Performance | ⭐⭐⭐⭐⭐ Fast | ⭐⭐⭐ Medium | ⭐⭐ Slow (network) |
| Simplicity | ⭐⭐⭐⭐⭐ Simple | ⭐⭐ Complex | ⭐ Very Complex |
| Setup Time | Minutes | Hours | Days |
| ACID Transactions | ❌ No | ✅ Yes | ⚠️ Depends |
| Query Capabilities | ⭐⭐ Basic | ⭐⭐⭐⭐⭐ SQL | ⭐⭐⭐⭐ Semantic |
| Offline Support | ✅ Yes | ⚠️ Depends | ❌ No |
| Developer UX | ⭐⭐⭐⭐⭐ Excellent | ⭐⭐⭐ Good | ⭐⭐ Fair |
| Maintenance | ⭐⭐⭐⭐⭐ None | ⭐⭐⭐ Regular | ⭐⭐ Intensive |
Recommendation for SuperClaude v1: Local Files (clear winner for repository-scoped memory)
8. Security Considerations
8.1 Sensitive Data Handling
Problem: Memory files may contain secrets, API keys, internal URLs
Solution: Automatic redaction + gitignore
import re

SENSITIVE_PATTERNS = [
    r'sk_live_[a-zA-Z0-9]{24,}',  # Stripe keys
    r'eyJ[a-zA-Z0-9_-]+\.[a-zA-Z0-9_-]+\.[a-zA-Z0-9_-]+',  # JWT tokens (header.payload.signature)
    r'ghp_[a-zA-Z0-9]{36}',  # GitHub tokens
]


def redact_sensitive_data(text: str) -> str:
    """Remove sensitive data before storing in memory."""
    for pattern in SENSITIVE_PATTERNS:
        text = re.sub(pattern, '[REDACTED]', text)
    return text
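Because PM Agent context is stored as nested JSON rather than a single string, redaction would likely need to walk the structure before it is saved. The sketch below shows one way to do that. redact_context is a hypothetical name, and the two patterns are illustrative only, not a complete secret-detection set.

```python
import re
from typing import Any

# Illustrative patterns only; real use would need a broader, maintained set.
PATTERNS = [re.compile(p) for p in (
    r"sk_live_[a-zA-Z0-9]{24,}",  # Stripe live keys
    r"ghp_[a-zA-Z0-9]{36}",       # GitHub personal access tokens
)]


def redact_text(text: str) -> str:
    """Replace every match of a known secret pattern with a placeholder."""
    for pattern in PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


def redact_context(value: Any) -> Any:
    """Return a copy of value with every string value redacted.

    Dict keys are left untouched; non-string leaves are returned as-is.
    """
    if isinstance(value, str):
        return redact_text(value)
    if isinstance(value, dict):
        return {k: redact_context(v) for k, v in value.items()}
    if isinstance(value, list):
        return [redact_context(v) for v in value]
    return value
```

Applying redact_context to the context dict just before serializing it keeps the on-disk pm_context.json free of the matched tokens while leaving the in-memory context unchanged.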
8.2 .gitignore Best Practices
Always gitignore:
- .superclaude/memory/ (session state)
- .superclaude/temp/ (temporary files)
Optional versioning (team decision):
- .superclaude/memory/decisions/ (ADRs)
- docs/superclaude/patterns/ (successful patterns)
9. Conclusion
Key Takeaways
- ✅ Local File Storage is Optimal: Industry standard for repository-scoped context
- ✅ Git Detection is Standard: Use git rev-parse --show-toplevel
- ✅ Start Simple, Evolve Later: Files → Database (if needed) → Vector DB (far future)
- ✅ Repository Isolation is Critical: Prevents context noise across projects
Recommended Architecture for SuperClaude
SuperClaude_Framework/
├── .git/
├── .gitignore (+.superclaude/memory/)
├── .superclaude/
│   └── memory/
│       ├── pm_context.json      # Current session state
│       ├── plan.json            # PDCA Plan phase
│       ├── experiment.json      # PDCA Do phase
│       └── evaluation.json      # PDCA Check phase
└── docs/
    └── superclaude/
        ├── patterns/            # Successful implementations
        │   └── authentication-jwt.md
        └── mistakes/            # Error prevention
            └── mistake-2025-10-16.md
Next Steps:
- Implement RepositoryContextManager class
- Integrate with PM Agent session lifecycle
- Add .superclaude/memory/ to .gitignore
- Test with repository switching scenarios
- Document for team adoption
Research Confidence: High (based on industry standards from Cursor, GitHub Copilot, and security best practices)
Sources:
- Cursor IDE memory management architecture
- GitHub Copilot workspace context documentation
- Enterprise AI security frameworks
- Git repository detection patterns
- Storage performance benchmarks
Last Updated: 2025-10-16
Next Review: After Phase 1 implementation (2-3 weeks)