SuperClaude/superclaude/mcp/MCP_Tavily.md
Commit: feat: Add Deep Research System v4.2.0 (#380), 2025-09-21 04:54:42 +03:00. Co-authored-by: moshe_anconina <moshe_a@ituran.com>, Claude <noreply@anthropic.com>.
# Tavily MCP Server
**Purpose**: Web search and real-time information retrieval for research and current events
## Triggers
- Web search requirements beyond Claude's knowledge cutoff
- Current events, news, and real-time information needs
- Market research and competitive analysis tasks
- Technical documentation not in training data
- Academic research requiring recent publications
- Fact-checking and verification needs
- Deep research investigations requiring multi-source analysis
- `/sc:research` command activation
## Choose When
- **Over WebSearch**: When you need structured search with advanced filtering
- **Over WebFetch**: When you need multi-source search, not single page extraction
- **For research**: Comprehensive investigations requiring multiple sources
- **For current info**: Events, updates, or changes after knowledge cutoff
- **Not for**: Simple questions answerable from training, code generation, local file operations
## Works Best With
- **Sequential**: Tavily provides raw information → Sequential analyzes and synthesizes
- **Playwright**: Tavily discovers URLs → Playwright extracts complex content
- **Context7**: Tavily searches for updates → Context7 provides stable documentation
- **Serena**: Tavily performs searches → Serena stores research sessions
## Configuration
Requires the `TAVILY_API_KEY` environment variable. Create a key at https://app.tavily.com and export it before starting the server.
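
A minimal sketch of a startup check for this requirement. The helper name `load_tavily_key` is hypothetical and not part of the framework; it only shows one way to fail early with a readable message when the variable is missing.

```python
import os
from typing import Mapping, Optional


def load_tavily_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Tavily API key, or raise a clear error if it is unset.

    `env` defaults to the process environment; passing a dict makes the
    check easy to exercise in tests. (Hypothetical helper, for illustration.)
    """
    env = os.environ if env is None else env
    key = env.get("TAVILY_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "TAVILY_API_KEY is not set; create a key at https://app.tavily.com"
        )
    return key
```

Calling `load_tavily_key()` once at startup surfaces the "API key not configured" error before any search is attempted.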
## Search Capabilities
- **Web Search**: General web search with relevance-ranked results
- **News Search**: Time-filtered news and current events
- **Academic Search**: Scholarly articles and research papers
- **Domain Filtering**: Include/exclude specific domains
- **Content Extraction**: Full-text extraction from search results
- **Freshness Control**: Prioritize recent content
- **Multi-Round Searching**: Iterative refinement based on gaps
## Examples
```
"latest TypeScript features 2024" → Tavily (current technical information)
"OpenAI GPT updates this week" → Tavily (recent news and updates)
"quantum computing breakthroughs 2024" → Tavily (recent research)
"best practices React Server Components" → Tavily (current best practices)
"explain recursion" → Native Claude (general concept explanation)
"write a Python function" → Native Claude (code generation)
```
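
The routing in the examples above could be approximated with a keyword heuristic. This is only a sketch: `choose_tool` and the cue list are hypothetical, and the real decision is made by the model from context rather than by string matching.

```python
import re

# Hypothetical recency cues; tune these for your own workload.
RECENCY_CUES = (
    "latest", "this week", "today", "recent",
    "updates", "news", "best practices",
)
# A four-digit year such as 2024 suggests time-sensitive information.
YEAR_PATTERN = re.compile(r"\b20\d{2}\b")


def choose_tool(query: str) -> str:
    """Return "tavily" for time-sensitive queries, otherwise "native"."""
    text = query.lower()
    if YEAR_PATTERN.search(text) or any(cue in text for cue in RECENCY_CUES):
        return "tavily"
    return "native"
```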
## Search Patterns
### Basic Search
```
Query: "search term"
→ Returns: Ranked results with snippets
```
### Domain-Specific Search
```
Query: "search term"
Domains: ["arxiv.org", "github.com"]
→ Returns: Results from specified domains only
```
### Time-Filtered Search
```
Query: "search term"
Recency: "week" | "month" | "year"
→ Returns: Recent results within timeframe
```
### Deep Content Search
```
Query: "search term"
Extract: true
→ Returns: Full content extraction from top results
```
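
The four patterns above can be folded into one request builder. A sketch, assuming the output keys `include_domains`, `time_range`, and `include_raw_content` correspond to Tavily's search parameters; verify them against the current Tavily API reference before relying on them. The function name `build_search_payload` is hypothetical.

```python
from typing import Any, Dict, Optional, Sequence

ALLOWED_RECENCY = {"week", "month", "year"}


def build_search_payload(
    query: str,
    domains: Optional[Sequence[str]] = None,
    recency: Optional[str] = None,
    extract: bool = False,
) -> Dict[str, Any]:
    """Translate the Query/Domains/Recency/Extract patterns into one dict."""
    query = query.strip()
    if not query:
        raise ValueError("query must not be empty")
    if recency is not None and recency not in ALLOWED_RECENCY:
        raise ValueError(f"recency must be one of {sorted(ALLOWED_RECENCY)}")

    payload: Dict[str, Any] = {"query": query}
    if domains:
        payload["include_domains"] = list(domains)  # assumed field name
    if recency:
        payload["time_range"] = recency             # assumed field name
    if extract:
        payload["include_raw_content"] = True       # assumed field name
    return payload
```

Optional fields are left out when unset, so a basic search sends only `query`.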
## Quality Optimization
- **Query Refinement**: Iterate searches based on initial results
- **Source Diversity**: Ensure multiple perspectives in results
- **Credibility Filtering**: Prioritize authoritative sources
- **Deduplication**: Remove redundant information across sources
- **Relevance Scoring**: Focus on most pertinent results
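
Deduplication and relevance scoring can be sketched as a post-processing step. The result shape (dicts with `url` and `score` keys) is an assumption for illustration, and the URL normalization is deliberately crude: it ignores scheme, a leading `www.`, query strings, and fragments, which may merge pages that differ only by query string.

```python
from typing import Any, Dict, List
from urllib.parse import urlsplit


def normalize_url(url: str) -> str:
    """Reduce a URL to host + path so trivial variants compare equal."""
    parts = urlsplit(url.strip())
    host = parts.hostname or ""
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def dedupe_and_rank(
    results: List[Dict[str, Any]], min_score: float = 0.0
) -> List[Dict[str, Any]]:
    """Keep the best-scoring result per normalized URL, highest score first."""
    best: Dict[str, Dict[str, Any]] = {}
    for result in results:
        score = result.get("score", 0.0)
        if score < min_score:
            continue  # relevance filter
        key = normalize_url(result["url"])
        if key not in best or score > best[key].get("score", 0.0):
            best[key] = result
    return sorted(best.values(), key=lambda r: r.get("score", 0.0), reverse=True)
```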
## Integration Flows
### Research Flow
```
1. Tavily: Initial broad search
2. Sequential: Analyze and identify gaps
3. Tavily: Targeted follow-up searches
4. Sequential: Synthesize findings
5. Serena: Store research session
```
### Fact-Checking Flow
```
1. Tavily: Search for claim verification
2. Tavily: Find contradicting sources
3. Sequential: Analyze evidence
4. Report: Present balanced findings
```
### Competitive Analysis Flow
```
1. Tavily: Search competitor information
2. Tavily: Search market trends
3. Sequential: Comparative analysis
4. Context7: Technical comparisons
5. Report: Strategic insights
```
### Deep Research Flow (DR Agent)
```
1. Planning: Decompose research question
2. Tavily: Execute planned searches
3. Analysis: Assess URL complexity
4. Routing: Simple → Tavily extract | Complex → Playwright
5. Synthesis: Combine all sources
6. Iteration: Refine based on gaps
```
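
Steps 3 and 4 of the Deep Research Flow (assess URL complexity, then route) might look like the sketch below. The host list and the hash-routing check are hypothetical heuristics chosen for illustration; the document does not define how complexity is assessed.

```python
from typing import Collection
from urllib.parse import urlsplit

# Illustrative hosts that usually need a real browser to render content.
DYNAMIC_HOSTS = frozenset({"twitter.com", "x.com", "linkedin.com"})


def route_url(url: str, dynamic_hosts: Collection[str] = DYNAMIC_HOSTS) -> str:
    """Return "playwright" for likely JS-heavy pages, else "tavily_extract"."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host.startswith("www."):
        host = host[4:]
    # Fragments such as "#/dashboard" or "#!/view" hint at client-side routing.
    hash_routed = parts.fragment.startswith(("/", "!"))
    if host in dynamic_hosts or hash_routed:
        return "playwright"
    return "tavily_extract"
```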
## Advanced Search Strategies
### Multi-Hop Research
```yaml
Initial_Search:
  query: "core topic"
  depth: broad
Follow_Up_1:
  query: "entities from initial"
  depth: targeted
Follow_Up_2:
  query: "relationships discovered"
  depth: deep
Synthesis:
  combine: all_findings
  resolve: contradictions
```
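
The multi-hop plan above can be sketched as a bounded loop. Both callables are injected, so the sketch does not assume any particular search client: `search` and `extract_followups` are hypothetical stand-ins for a Tavily call and an entity-extraction step.

```python
from typing import Any, Callable, Dict, List

SearchFn = Callable[[str], List[Dict[str, Any]]]
FollowUpFn = Callable[[List[Dict[str, Any]]], List[str]]


def multi_hop(
    topic: str,
    search: SearchFn,
    extract_followups: FollowUpFn,
    max_hops: int = 3,
) -> List[Dict[str, Any]]:
    """Run up to `max_hops` rounds, feeding each round's findings forward."""
    findings: List[Dict[str, Any]] = []
    frontier = [topic]
    seen = set()
    for hop in range(max_hops):
        next_frontier: List[str] = []
        for query in frontier:
            if query in seen:
                continue  # never repeat a query
            seen.add(query)
            results = search(query)
            # Tag every result with the hop and query that produced it.
            findings.extend({"hop": hop, "query": query, **r} for r in results)
            next_frontier.extend(extract_followups(results))
        if not next_frontier:
            break
        frontier = next_frontier
    return findings
```

The synthesis step (combining findings and resolving contradictions) is left to the caller, which receives every result annotated with its hop number.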
### Adaptive Query Generation
```yaml
Simple_Query:
  - Direct search terms
  - Single concept focus
Complex_Query:
  - Multiple search variations
  - Boolean operators
  - Domain restrictions
  - Time filters
Iterative_Query:
  - Start broad
  - Refine based on results
  - Target specific gaps
```
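
A sketch of the simple and complex cases above: a bare topic yields one direct query, while aspects, domains, and a time filter expand it into several query plans. The function name and the plan keys (`query`, `domains`, `recency`) are hypothetical; Boolean operators are omitted because the document does not say which syntax the backend accepts.

```python
from typing import Any, Dict, List, Optional, Sequence


def generate_queries(
    topic: str,
    aspects: Sequence[str] = (),
    domains: Sequence[str] = (),
    recency: Optional[str] = None,
) -> List[Dict[str, Any]]:
    """Return one plan per aspect, or a single direct plan if none are given."""
    topic = topic.strip()
    phrases = [f"{topic} {aspect}" for aspect in aspects] or [topic]
    plans: List[Dict[str, Any]] = []
    for phrase in phrases:
        plan: Dict[str, Any] = {"query": phrase}
        if domains:
            plan["domains"] = list(domains)   # domain restriction
        if recency:
            plan["recency"] = recency         # time filter
        plans.append(plan)
    return plans
```

The iterative case follows from calling this again with aspects derived from gaps in the previous results.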
### Source Credibility Assessment
```yaml
High_Credibility:
  - Academic institutions
  - Government sources
  - Established media
  - Official documentation
Medium_Credibility:
  - Industry publications
  - Expert blogs
  - Community resources
Low_Credibility:
  - User forums
  - Social media
  - Unverified sources
```
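
One way to apply these tiers is a host-based classifier. This sketch is an assumption-heavy simplification: it treats `.edu` and `.gov` hosts as high credibility, relies on a caller-supplied allowlist for the medium tier, and treats everything else as unverified. Established media and official documentation would need their own allowlists in practice.

```python
from typing import Collection, Sequence
from urllib.parse import urlsplit


def credibility_tier(
    url: str,
    high_suffixes: Sequence[str] = (".edu", ".gov"),
    medium_hosts: Collection[str] = frozenset(),
) -> str:
    """Classify a URL as "high", "medium", or "low" credibility by host."""
    host = urlsplit(url).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    if host.endswith(tuple(high_suffixes)):
        return "high"
    if host in medium_hosts:
        return "medium"
    return "low"  # unverified by default
```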
## Performance Considerations
### Search Optimization
- Batch similar searches together
- Cache search results for reuse
- Prioritize high-value sources
- Limit depth based on confidence
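
Caching search results for reuse can be sketched with a small in-memory cache keyed by the normalized query. The class name, the one-hour default, and the injected `fetch` callable are all hypothetical; a real setup might persist results instead.

```python
import time
from typing import Any, Callable, Dict, Tuple


class SearchCache:
    """In-memory cache that reuses results for repeated queries."""

    def __init__(
        self, ttl_seconds: float = 3600.0, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_fetch(self, query: str, fetch: Callable[[str], Any]) -> Any:
        # Case and whitespace differences should not cause a second search.
        key = " ".join(query.lower().split())
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        value = fetch(query)
        self._store[key] = (now, value)
        return value
```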
### Rate Limiting
- Cap the number of searches per minute
- Track token usage per search
- Set a caching duration for results
- Limit the number of parallel searches
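
A per-minute search cap can be enforced client-side with a sliding window. This is a sketch; the class name and the limits are hypothetical, and the actual quota depends on your Tavily plan.

```python
import time
from collections import deque
from typing import Callable, Deque


class RateLimiter:
    """Allow at most `max_calls` within any `period`-second window."""

    def __init__(
        self,
        max_calls: int,
        period: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._calls: Deque[float] = deque()

    def try_acquire(self) -> bool:
        """Record a call and return True, or return False if over the cap."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        if len(self._calls) >= self.max_calls:
            return False
        self._calls.append(now)
        return True
```

A caller that gets `False` can wait, serve a cached result, or fall back to another tool.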
### Cost Management
- Monitor API usage
- Set budget limits
- Optimize query efficiency
- Use caching effectively
## Integration with DR Agent Architecture
### Planning Strategy Support
```yaml
Planning_Only:
  - Direct query execution
  - No refinement needed
Intent_Planning:
  - Clarify search intent
  - Generate focused queries
Unified:
  - Present search plan
  - Adjust based on feedback
```
### Multi-Hop Execution
```yaml
Hop_Management:
  - Track search genealogy
  - Build on previous results
  - Detect circular references
  - Maintain hop context
```
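
Genealogy tracking and circular-reference detection can be sketched with a parent map. `HopTracker` is a hypothetical name and the default hop limit is illustrative; queries are compared after lowercasing and whitespace normalization.

```python
from typing import Dict, List, Optional


def _norm(query: str) -> str:
    return " ".join(query.lower().split())


class HopTracker:
    """Record which query led to which, refusing repeats and deep chains."""

    def __init__(self, max_hops: int = 5) -> None:
        self.max_hops = max_hops
        self._parent: Dict[str, Optional[str]] = {}

    def add(self, query: str, parent: Optional[str] = None) -> bool:
        """Register `query`; return False if it is circular or too deep."""
        key = _norm(query)
        if key in self._parent:
            return False  # already searched: circular reference or repeat
        parent_key = None
        if parent is not None:
            parent_key = _norm(parent)
            if parent_key not in self._parent:
                raise KeyError(f"unknown parent query: {parent!r}")
            if len(self.lineage(parent)) > self.max_hops:
                return False  # the child would exceed the hop limit
        self._parent[key] = parent_key
        return True

    def lineage(self, query: str) -> List[str]:
        """Return the chain of normalized queries from the root to `query`."""
        chain: List[str] = []
        key: Optional[str] = _norm(query)
        while key is not None:
            chain.append(key)
            key = self._parent[key]
        return chain[::-1]
```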
### Self-Reflection Integration
```yaml
Quality_Check:
  - Assess result relevance
  - Identify coverage gaps
  - Trigger additional searches
  - Calculate confidence scores
```
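
The quality check could be sketched as a single scoring function. The formula here is an assumption made for illustration (an equal-weight blend of mean result score and subtopic coverage), not the framework's actual scoring; the `score` and `content` keys and the 0.7 threshold are likewise hypothetical.

```python
from typing import Any, Dict, List, Sequence


def assess(
    results: List[Dict[str, Any]],
    subtopics: Sequence[str],
    threshold: float = 0.7,
) -> Dict[str, Any]:
    """Score a result set in [0.0, 1.0] and list subtopics it fails to cover."""
    if results:
        relevance = sum(r.get("score", 0.0) for r in results) / len(results)
    else:
        relevance = 0.0
    text = " ".join(r.get("content", "") for r in results).lower()
    # A subtopic counts as covered if it appears anywhere in the content.
    gaps = [s for s in subtopics if s.lower() not in text]
    coverage = 1.0 - len(gaps) / len(subtopics) if subtopics else 1.0
    confidence = round(0.5 * relevance + 0.5 * coverage, 3)
    return {
        "confidence": confidence,
        "gaps": gaps,
        "search_more": confidence < threshold,  # trigger additional searches
    }
```

The `gaps` list gives the follow-up searches a concrete target when `search_more` is true.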
### Case-Based Learning
```yaml
Pattern_Storage:
  - Successful query formulations
  - Effective search strategies
  - Domain preferences
  - Time filter patterns
```
## Error Handling
### Common Issues
- API key not configured
- Rate limit exceeded
- Network timeout
- No results found
- Invalid query format
### Fallback Strategies
- Use native WebSearch
- Try alternative queries
- Expand search scope
- Use cached results
- Simplify search terms
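
The fallback strategies above can be chained so that each provider is tried in order. A sketch with injected callables: the provider names in the usage note are placeholders, and no real client API is assumed.

```python
from typing import Any, Callable, Dict, List, Sequence, Tuple

Provider = Tuple[str, Callable[[str], List[Dict[str, Any]]]]


def search_with_fallback(
    query: str, providers: Sequence[Provider]
) -> Tuple[str, List[Dict[str, Any]]]:
    """Return (provider_name, results) from the first provider that succeeds."""
    errors: List[str] = []
    for name, fn in providers:
        try:
            results = fn(query)
        except Exception as exc:  # e.g. rate limit exceeded, network timeout
            errors.append(f"{name}: {exc}")
            continue
        if results:
            return name, results
        errors.append(f"{name}: no results")
    raise LookupError("all providers failed: " + "; ".join(errors))
```

A typical order would be the Tavily call first, then cached results, then native WebSearch, with the collected error messages reported if every step fails.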
## Best Practices
### Query Formulation
1. Start with clear, specific terms
2. Use quotes for exact phrases
3. Include relevant keywords
4. Specify time ranges when needed
5. Use domain filters strategically
### Result Processing
1. Verify source credibility
2. Cross-reference multiple sources
3. Check publication dates
4. Identify potential biases
5. Extract key information
### Integration Workflow
1. Plan search strategy
2. Execute initial searches
3. Analyze results
4. Identify gaps
5. Refine and iterate
6. Synthesize findings
7. Store valuable patterns