Files
SuperClaude/.claude/commands/shared/error-recovery-enhanced.yml
NomenAK bce31d52a8 Initial commit: SuperClaude v4.0.0 configuration framework
- Core configuration files (CLAUDE.md, RULES.md, PERSONAS.md, MCP.md)
- 17 slash commands for specialized workflows
- 25 shared YAML resources for advanced configurations
- Installation script for global deployment
- 9 cognitive personas for specialized thinking modes
- MCP integration patterns for intelligent tool usage
- Token economy and ultracompressed mode support

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-22 14:02:49 +02:00

224 lines
7.3 KiB
YAML

# Enhanced Error Recovery System
## Legend
| Symbol | Meaning | | Abbrev | Meaning |
|--------|---------|---|--------|---------|
| 🔄 | retry/recovery | | err | error |
| ⚠ | warning/caution | | rec | recovery |
| ✅ | success/fixed | | fail | failure |
| 🔧 | repair/fix | | ctx | context |
## Intelligent Retry Strategies
```yaml
Error_Classification:
Transient_Errors:
Network_Timeouts: "MCP server unreachable, API timeouts"
Resource_Busy: "File locked, system overloaded"
Rate_Limits: "API quota exceeded, temporary blocks"
Permanent_Errors:
Syntax_Errors: "Invalid code, malformed input"
Permission_Denied: "Access restrictions, auth failures"
Not_Found: "Missing files, invalid paths"
Context_Errors:
Configuration: "Invalid settings, missing dependencies"
State_Conflicts: "Dirty working tree, merge conflicts"
Version_Mismatch: "Incompatible versions, deprecated APIs"
Retry_Logic:
Exponential_Backoff:
Base_Delay: "1 second"
Max_Delay: "60 seconds"
Max_Attempts: "3 for transient, 1 for permanent"
Jitter: "±25% randomization to avoid thundering herd"
Adaptive_Strategy:
Network_Errors: "Retry w/ longer timeout"
Rate_Limits: "Wait for reset period + retry"
Resource_Busy: "Short delay + retry w/ alternative"
Permanent: "No retry, immediate fallback"
```
## MCP Server Failover
```yaml
Failover_Hierarchy:
Context7_Failure:
Primary: "C7 documentation lookup"
Fallback_1: "WebSearch official docs"
Fallback_2: "Local cache if available"
Fallback_3: "Continue w/ warning + note limitation"
Sequential_Failure:
Primary: "Sequential thinking server"
Fallback_1: "Native step-by-step analysis"
Fallback_2: "Simplified linear approach"
Fallback_3: "Manual breakdown w/ user input"
Magic_Failure:
Primary: "Magic UI component generation"
Fallback_1: "Search existing components in project"
Fallback_2: "Generate basic template manually"
Fallback_3: "Provide implementation guidance"
Puppeteer_Failure:
Primary: "Browser automation & testing"
Fallback_1: "Manual test instructions"
Fallback_2: "Static analysis alternatives"
Fallback_3: "Skip browser-specific operations"
Server_Health_Monitoring:
Availability_Check:
Frequency: "Every 5 minutes during active use"
Timeout: "3 seconds per check"
Circuit_Breaker: "Disable after 3 consecutive failures"
Recovery_Check: "Re-enable after 5 minutes"
Performance_Degradation:
Slow_Response: ">30s response time"
Action: "Switch to faster alternative if available"
Notification: "Inform user of performance impact"
```
## Context Preservation
```yaml
Operation_Checkpoints:
Before_Risky_Operations:
Create_Checkpoint: "Save current state before destructive ops"
Include: "File states, working directory, command context"
Location: ".claudedocs/checkpoints/checkpoint-<timestamp>"
During_Command_Chains:
Intermediate_Results: "Save results after each successful step"
Context_Handoff: "Pass validated context to next command"
Rollback_Points: "Mark safe restoration points"
Failure_Recovery:
Partial_Completion: "Preserve completed work"
State_Analysis: "Determine safe rollback point"
User_Options: "Present recovery choices"
Context_Resilience:
Session_State:
Persistent_Storage: "Maintain state across interruptions"
Auto_Save: "Periodic context snapshots"
Recovery: "Restore from last known good state"
Command_Chain_Recovery:
Failed_Step_Isolation: "Don't lose previous successful steps"
Alternative_Paths: "Suggest different approaches for failed step"
Partial_Results: "Use completed work in recovery strategy"
```
## Proactive Error Prevention
```yaml
Pre_Execution_Validation:
Environment_Check:
Required_Tools: "Verify dependencies before starting"
Permissions: "Check access rights for planned operations"
Disk_Space: "Ensure adequate space for outputs"
Network: "Verify connectivity for remote operations"
Conflict_Detection:
File_Locks: "Check for locked files before editing"
Git_State: "Verify clean working tree for git ops"
Process_Conflicts: "Detect conflicting background processes"
Resource_Availability:
Memory_Usage: "Ensure adequate RAM for large operations"
CPU_Load: "Warn if system under heavy load"
Token_Budget: "Estimate token usage vs available quota"
Risk_Assessment:
Operation_Scoring:
Data_Loss_Risk: "1-10 scale based on destructiveness"
Reversibility: "Can operation be undone?"
Scope_Impact: "How many files/systems affected?"
Mitigation_Requirements:
High_Risk: "Require explicit confirmation + backup"
Medium_Risk: "Warn user + create checkpoint"
Low_Risk: "Proceed w/ monitoring"
```
## Enhanced Error Reporting
```yaml
Intelligent_Error_Messages:
Root_Cause_Analysis:
Technical_Details: "Specific error codes, stack traces"
User_Context: "What user was trying to accomplish"
Environmental_Factors: "System state, recent changes"
Actionable_Guidance:
Immediate_Steps: "What user can do right now"
Alternative_Approaches: "Different ways to achieve goal"
Prevention: "How to avoid this error in future"
Context_Preservation:
Session_Info: "Command history, current state"
Relevant_Files: "Which files were being processed"
System_State: "Git status, dependency versions"
Error_Learning:
Pattern_Recognition:
Frequent_Issues: "Track commonly occurring errors"
User_Patterns: "Learn user-specific failure modes"
System_Patterns: "Identify environment-specific issues"
Adaptive_Responses:
Personalized_Suggestions: "Based on user's history"
Proactive_Warnings: "Predict likely issues"
Automated_Fixes: "Apply known solutions automatically"
```
## Implementation Integration
```yaml
Command_Wrapper_Enhancement:
Error_Boundary:
Catch_All_Errors: "Wrap every operation in try/catch"
Classify_Error: "Determine error type & appropriate response"
Apply_Strategy: "Retry, failover, or graceful degradation"
Context_Management:
Save_State: "Before each significant operation"
Track_Progress: "Monitor completion of multi-step processes"
Restore_State: "On failure, return to last good state"
Recovery_Commands:
Manual_Recovery: "/user:recover --from-checkpoint"
Status_Check: "/user:recovery-status"
Clear_State: "/user:recovery-clear"
Integration_Points:
All_Commands: "Enhanced error handling built into every command"
MCP_Servers: "Automatic failover & circuit breaker patterns"
User_Experience: "Seamless recovery w/ minimal interruption"
```
## Usage Examples
```yaml
Network_Failure_Scenario:
Error: "Context7 server timeout during docs lookup"
Recovery: "Auto-fallback to WebSearch → local cache"
User_Experience: "⚠ Using cached docs (Context7 unavailable)"
File_Lock_Scenario:
Error: "Cannot edit file (locked by another process)"
Recovery: "Wait 5s → retry → suggest alternatives"
User_Experience: "Retrying in 5s... or try manual edit"
Command_Chain_Failure:
Error: "Step 3 of 5 fails in build workflow"
Recovery: "Preserve steps 1-2, suggest alternatives for 3"
User_Experience: "Build partially complete. Alternative approaches: ..."
```
---
*Enhanced Error Recovery v1.0 - Intelligent resilience for SuperClaude operations*