SuperClaude/.claude/commands/shared/error-recovery.yml
NomenAK bce31d52a8 Initial commit: SuperClaude v4.0.0 configuration framework
- Core configuration files (CLAUDE.md, RULES.md, PERSONAS.md, MCP.md)
- 17 slash commands for specialized workflows
- 25 shared YAML resources for advanced configurations
- Installation script for global deployment
- 9 cognitive personas for specialized thinking modes
- MCP integration patterns for intelligent tool usage
- Token economy and ultracompressed mode support

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-22 14:02:49 +02:00


# Error Recovery & Resilience Patterns
## Error Classification & Response
```yaml
Error Severity Levels:
  CRITICAL [10]: Data loss, security breach, prod down
    Response: Immediate stop, alert, rollback, incident response
    Recovery: Manual intervention required, full investigation
  HIGH [7-9]: Build failure, test failure, deployment issues
    Response: Stop workflow, notify user, suggest fixes
    Recovery: Automated retry w/ backoff, alternative paths
  MEDIUM [4-6]: Warning conditions, perf degradation
    Response: Continue w/ warning, log for later review
    Recovery: Attempt optimization, monitor for escalation
  LOW [1-3]: Info messages, style violations, minor optimizations
    Response: Note in output, continue execution
    Recovery: Background fixes, cleanup on completion

Error Categories:
  Transient: Network timeouts, temp resource unavailability
    Strategy: Exponential backoff retry, circuit breaker pattern
  Config: Missing env vars, incorrect settings, permissions
    Strategy: Validation, helpful error msgs, setup guidance
  Logic: Code bugs, algorithm errors, edge cases
    Strategy: Fallback implementations, graceful degradation
  Resource: Out of memory, disk space, API rate limits
    Strategy: Resource monitoring, cleanup, queue management
```
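The severity bands above can be read as a lookup from a numeric score to a level and a stop/continue decision. A minimal Python sketch, assuming scores are integers from 1 to 10; `Severity`, `classify`, and `should_stop` are hypothetical names for illustration and are not part of SuperClaude:

```python
from enum import Enum


class Severity(Enum):
    """Severity levels with the (low, high) score ranges listed above."""
    LOW = (1, 3)
    MEDIUM = (4, 6)
    HIGH = (7, 9)
    CRITICAL = (10, 10)


def classify(score: int) -> Severity:
    """Map a 1-10 severity score to its level (hypothetical helper)."""
    for level in Severity:
        low, high = level.value
        if low <= score <= high:
            return level
    raise ValueError(f"severity score out of range: {score}")


def should_stop(level: Severity) -> bool:
    """HIGH and CRITICAL stop the workflow; MEDIUM and LOW continue."""
    return level in (Severity.HIGH, Severity.CRITICAL)
```

For example, `should_stop(classify(8))` is true (a build failure halts the workflow), while `should_stop(classify(4))` is false (a warning is logged and execution continues).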
## Recovery Strategies
```yaml
Automatic Recovery:
  Retry Mechanisms:
    Simple: Up to 3 attempts with 1s delay
    Exponential: 1s, 2s, 4s, 8s delays with jitter
    Circuit Breaker: Stop retries after threshold failures
  Fallback Patterns:
    Alternative Commands: Use native tools if MCP fails
    Degraded Functionality: Skip non-essential features
    Cached Results: Use previous successful outputs
  State Management:
    Checkpoints: Save state before risky operations
    Rollback: Automatic revert to last known good state
    Cleanup: Remove partial results on failure

Manual Recovery:
  User Guidance:
    Clear Error Messages: What failed, why, how to fix
    Action Items: Specific steps user can take
    Documentation Links: Relevant help resources
  Intervention Points:
    Confirmation: Ask before destructive operations
    Override: Allow user to skip validation warnings
    Custom: Accept user-provided alternative approaches
  Recovery Tools:
    Diagnostic: Commands to investigate failures
    Repair: Automated fixes for common issues
    Reset: Return to clean state for fresh start
```
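The "Exponential" retry mechanism above (1s, 2s, 4s, 8s delays with jitter) could be sketched in Python as follows. This is an illustration under assumptions, not SuperClaude's implementation: the function names, the 10% jitter fraction, and the injectable `sleep` parameter are all hypothetical choices.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delays(retries: int = 4, base: float = 1.0) -> list[float]:
    """Return the un-jittered delays: base, 2*base, 4*base, ..."""
    return [base * (2 ** i) for i in range(retries)]


def retry_with_backoff(
    operation: Callable[[], T],
    retries: int = 4,
    base: float = 1.0,
    jitter: float = 0.1,  # assumed: add up to 10% of the delay at random
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation; on failure wait with exponential backoff and retry.

    The operation is called at most retries + 1 times. The last error is
    re-raised once all retries are used up.
    """
    delays = backoff_delays(retries, base)
    for attempt in range(retries + 1):
        try:
            return operation()
        except Exception:  # real code should catch only transient errors
            if attempt == retries:
                raise
            delay = delays[attempt]
            sleep(delay + random.uniform(0, jitter * delay))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; a circuit breaker would sit one level above this, refusing to call `retry_with_backoff` at all once a failure threshold is reached.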
## Command-Specific Recovery
```yaml
Build Failures:
  Clean & Retry: Remove artifacts, clear cache, rebuild
  Dependency Issues: Update lockfiles, reinstall packages
  Compilation Errors: Suggest fixes, alternative approaches

Test Failures:
  Flaky Tests: Retry failed tests, identify unstable tests
  Environment Issues: Reset test state, check prerequisites
  Coverage Gaps: Generate missing tests, update thresholds

Deploy Failures:
  Health Check Failures: Rollback, investigate logs
  Resource Constraints: Scale up, optimize deployment
  Configuration Issues: Validate settings, check secrets

Analysis Failures:
  Tool Unavailable: Fall back to alternative analysis tools
  Large Codebase: Reduce scope, batch processing
  Permission Issues: Guide user through access setup

MCP Server Failures:
  Connection Issues: Retry with exponential backoff
  Timeout: Reduce request complexity, use native tools
  Rate Limiting: Queue requests, implement backoff
  Service Unavailable: Fall back to cached results or native tools
```
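The MCP fallback rules above (use native tools or cached results when the server is unavailable) amount to trying an ordered list of strategies. A minimal Python sketch; `first_successful`, `mcp_analysis`, and `native_analysis` are hypothetical stand-ins, and the MCP failure is simulated:

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def first_successful(
    strategies: Sequence[tuple[str, Callable[[], T]]],
) -> tuple[str, T]:
    """Try each (name, callable) in order; return the first that succeeds."""
    errors = []
    for name, strategy in strategies:
        try:
            return name, strategy()
        except Exception as exc:  # collect the failure and try the next one
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all strategies failed: " + "; ".join(errors))


def mcp_analysis() -> str:
    """Hypothetical MCP call; the timeout here is simulated."""
    raise TimeoutError("MCP server timed out")


def native_analysis() -> str:
    """Hypothetical native-tool fallback."""
    return "result from native tools"
```

Calling `first_successful([("mcp", mcp_analysis), ("native", native_analysis)])` returns the native result along with the name of the strategy that produced it, so the caller can tell the user that a degraded path was used.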
## Error Monitoring & Learning
```yaml
Error Tracking:
  Frequency: Count occurrence of specific error types
  Patterns: Identify common failure sequences
  Trends: Monitor error rates over time
  Context: Track environmental factors during failures

Failure Analysis:
  Root Cause: Automated analysis of failure chains
  Prevention: Suggest preventive measures
  Optimization: Identify improvement opportunities
  Documentation: Update guides based on common issues

Learning System:
  Success Patterns: Learn from successful recovery strategies
  User Preferences: Remember user's preferred recovery methods
  Environment Adaptation: Adjust strategies based on project context
  Continuous Improvement: Update recovery logic based on outcomes
```
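The "Frequency" and "Patterns" items under Error Tracking could be approximated with simple counters. The sketch below is one possible shape, assuming errors are recorded as plain type strings; `ErrorTracker` and its methods are hypothetical names:

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ErrorTracker:
    """Counts error types and consecutive failure pairs (hypothetical)."""
    counts: Counter = field(default_factory=Counter)
    history: list = field(default_factory=list)

    def record(self, error_type: str, context: str = "") -> None:
        """Store one failure together with optional environment context."""
        self.counts[error_type] += 1
        self.history.append((error_type, context))

    def most_frequent(self, n: int = 3) -> list:
        """Return the n most frequent error types with their counts."""
        return self.counts.most_common(n)

    def common_sequences(self, n: int = 3) -> list:
        """Return the n most frequent pairs of consecutive error types."""
        types = [error_type for error_type, _ in self.history]
        return Counter(zip(types, types[1:])).most_common(n)
```

Counting adjacent pairs is only the simplest notion of a "failure sequence"; longer chains or time windows would need a richer model.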
## Integration with Commands
```yaml
Pre-Execution Validation:
  Prerequisites: Check required tools, permissions, resources
  Environment: Validate configuration, network connectivity
  State: Ensure clean starting state, no conflicts

During Execution:
  Monitoring: Track progress, resource usage, early warnings
  Checkpointing: Save state at critical milestones
  Health Checks: Validate system state during long operations

Post-Execution:
  Verification: Confirm expected outcomes achieved
  Cleanup: Remove temporary files, release resources
  Reporting: Document successes, failures, lessons learned

Error Reporting Format:
  Structured: Consistent error message format across commands
  Actionable: Include specific steps for resolution
  Contextual: Provide relevant system and environment information
  Traceable: Include operation ID for troubleshooting
```
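The four properties of the Error Reporting Format (structured, actionable, contextual, traceable) map naturally onto a small record type. A Python sketch under assumptions: the field names are hypothetical, and a random UUID stands in for whatever operation ID scheme is actually used:

```python
import json
import uuid
from dataclasses import asdict, dataclass, field


@dataclass
class ErrorReport:
    """One error report; all field names are illustrative assumptions."""
    command: str                      # which command failed
    message: str                      # what failed
    cause: str                        # why it failed
    resolution_steps: list[str]       # actionable: how to fix it
    context: dict[str, str] = field(default_factory=dict)  # environment info
    operation_id: str = field(default_factory=lambda: uuid.uuid4().hex)

    def to_json(self) -> str:
        """Serialize to a consistent JSON shape shared by all commands."""
        return json.dumps(asdict(self), indent=2)
```

A report might be built as `ErrorReport("build", "Compilation failed", "missing dependency", ["Reinstall packages", "Rebuild"])`; the generated `operation_id` can then be quoted in logs and follow-up messages.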
## Configuration & Customization
```yaml
User Preferences:
  Recovery Style: Conservative (safe) vs Aggressive (fast)
  Retry Limits: Maximum attempts for different error types
  Notification: How and when to alert user of issues
  Automation Level: How much recovery to attempt automatically

Project Settings:
  Critical Operations: Commands that require extra safety
  Acceptable Risk: Tolerance for failures in development vs production
  Resource Limits: Maximum time, memory, network usage
  Dependencies: Critical external services that must be available

Environment Adaptation:
  Development: More aggressive retries, helpful error messages
  Staging: Balanced approach, thorough logging
  Production: Conservative recovery, immediate alerting
  CI/CD: Fast failure, detailed diagnostic information
```
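The Environment Adaptation rules could be expressed as a per-environment policy table. In the Python sketch below, the type, the keys, and especially the retry counts are placeholder assumptions chosen only to mirror the relative ordering described above (development most aggressive, production conservative, CI/CD failing fast):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryPolicy:
    """Hypothetical recovery settings for one environment."""
    max_retries: int
    alert_immediately: bool
    fail_fast: bool


# Placeholder values: only their relative ordering reflects the rules above.
POLICIES = {
    "development": RecoveryPolicy(max_retries=5, alert_immediately=False, fail_fast=False),
    "staging": RecoveryPolicy(max_retries=3, alert_immediately=False, fail_fast=False),
    "production": RecoveryPolicy(max_retries=1, alert_immediately=True, fail_fast=False),
    "ci": RecoveryPolicy(max_retries=0, alert_immediately=False, fail_fast=True),
}


def policy_for(environment: str) -> RecoveryPolicy:
    """Look up a policy; unknown environments get the production policy."""
    return POLICIES.get(environment.lower(), POLICIES["production"])
```

Defaulting unknown environments to the production policy is a design choice in this sketch: when in doubt, the most conservative behavior applies.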
---
*Error recovery: Resilient command execution with intelligent failure handling*