AI Agent Factory with Claude Code Subagents

This commit is contained in:
Cole Medin
2025-08-22 21:01:17 -05:00
parent 4e1240a0b3
commit 8d9f46ecfa
104 changed files with 24521 additions and 0 deletions

# Agent Requirements: Semantic Search Agent
## Executive Summary
A simple yet powerful semantic search agent that leverages PGVector to provide intelligent document retrieval and summarized insights. The agent automatically chooses between semantic and hybrid search while maintaining a clean CLI interface for user interactions.
## Agent Classification
- **Type**: Tool-Enabled Agent with structured output capabilities
- **Complexity**: Medium
- **Priority Features**:
1. Semantic search with embeddings
2. Intelligent search type selection
3. Search result summarization
## Functional Requirements
### Core Functionality
1. **Semantic Search Operation**
- Execute semantic similarity search using PGVector embeddings
- Automatically generate query embeddings using OpenAI text-embedding-3-small (1536 dimensions)
- Return top-k relevant document chunks with similarity scores
- **Acceptance Criteria**: Successfully retrieve and rank documents by semantic similarity
2. **Hybrid Search with Auto-Selection**
- Automatically determine when to use semantic vs hybrid search based on query characteristics
- Allow manual override when user explicitly specifies search type
- Combine vector similarity with full-text search for enhanced results
- **Acceptance Criteria**: Intelligently route queries to optimal search method
3. **Search Result Summarization**
- Analyze retrieved chunks and generate concise insights
- Synthesize information from multiple sources into coherent summaries
- Maintain source attribution for transparency
- **Acceptance Criteria**: Provide meaningful summaries with proper source references
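The source-attribution requirement can be illustrated with a small formatting helper. This is a hedged sketch only; the `source` and `similarity` field names are assumptions, not a confirmed result schema:

```python
from typing import Any, Dict, List

def format_summary_with_sources(summary: str, chunks: List[Dict[str, Any]]) -> str:
    """Append a numbered source list with similarity scores to a summary."""
    lines = [summary, "", "Sources:"]
    for i, chunk in enumerate(chunks, start=1):
        lines.append(f"  [{i}] {chunk['source']} (similarity: {chunk['similarity']:.2f})")
    return "\n".join(lines)
```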
### Input/Output Specifications
- **Input Types**:
- Natural language queries via CLI
- Optional search type specification ("semantic", "hybrid", or auto-detect)
- Optional result limit (default: 10)
- **Output Format**: String responses with structured summaries and source citations
- **Validation Requirements**: Query length validation, result limit bounds (1-50)
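A minimal CLI entry point matching these input specifications could be sketched with `click` (a listed dependency); the option names here are illustrative assumptions, not the agent's actual interface:

```python
import click

@click.command()
@click.argument("query")
@click.option("--search-type", type=click.Choice(["semantic", "hybrid", "auto"]),
              default="auto", show_default=True,
              help="Search strategy; 'auto' lets the agent decide.")
@click.option("--limit", type=click.IntRange(1, 50), default=10, show_default=True,
              help="Maximum number of results.")
def search(query: str, search_type: str, limit: int) -> None:
    """Run a query against the knowledge base."""
    # In the real agent this would invoke the search tools; here we just echo.
    click.echo(f"{search_type} search for {query!r} (top {limit})")
```

`click.IntRange(1, 50)` enforces the result-limit bounds at the CLI boundary before any tool code runs.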
## Technical Requirements
### Model Configuration
- **Primary Model**: openai:gpt-4o-mini (cost-effective for summarization tasks)
- **Embedding Model**: text-embedding-3-small (1536 dimensions, matches database schema)
- **Context Window Needs**: ~8K tokens for processing multiple search results
### External Integrations
1. **PostgreSQL with PGVector**:
- Purpose: Semantic similarity search and hybrid search operations
- Authentication: DATABASE_URL environment variable
- Functions used: `match_chunks()` and `hybrid_search()`
- Connection: asyncpg with connection pooling
2. **OpenAI Embeddings API**:
- Purpose: Generate query embeddings for semantic search
- Authentication: OPENAI_API_KEY environment variable
- Model: text-embedding-3-small
### Tool Requirements
1. **semantic_search**:
- Purpose: Execute pure semantic similarity search using embeddings
- Parameters: query (str), limit (int, default=10)
- Error Handling: Database connection retry, empty result handling
2. **hybrid_search**:
- Purpose: Execute combined semantic + keyword search
- Parameters: query (str), limit (int, default=10), text_weight (float, default=0.3)
- Error Handling: Fallback to semantic search if text search fails
3. **auto_search**:
- Purpose: Automatically select search type based on query analysis
- Parameters: query (str), limit (int, default=10)
- Error Handling: Default to semantic search on classification failure
## Dependencies and Environment
### API Keys and Credentials
- DATABASE_URL: PostgreSQL connection string with PGVector extension
- OPENAI_API_KEY: OpenAI API key for embeddings and LLM
- LLM_MODEL: Model name (default: gpt-4o-mini)
### Python Packages
- pydantic-ai (core framework)
- asyncpg (PostgreSQL async driver)
- python-dotenv (environment variable management)
- rich (CLI formatting)
- openai (embeddings API)
- numpy (embedding vector operations)
### System Requirements
- Python version: 3.11+
- PostgreSQL with PGVector extension
- Memory requirements: ~256MB for embeddings cache
- Network requirements: Internet access for OpenAI API
## Success Criteria
1. **Search Accuracy**: Retrieve semantically relevant results with >0.7 similarity threshold
2. **Response Time**: Complete search and summarization within 3-5 seconds
3. **Auto-Selection Accuracy**: Correctly choose search type in >80% of cases
4. **Summary Quality**: Generate coherent summaries that capture key insights from search results
## Security and Compliance
- **Data Privacy**: Database queries and results handled securely, no data logging
- **API Key Management**: Environment variables only, no hardcoded credentials
- **Input Sanitization**: Query length limits, SQL injection prevention via parameterized queries
- **Audit Logging**: Search queries and result counts logged for performance monitoring
## Testing Requirements
- **Unit Tests**: Individual tool functions, search type classification logic
- **Integration Tests**: End-to-end database connectivity and search operations
- **Performance Tests**: Search response times under different query types and database sizes
- **Security Tests**: Input validation, SQL injection prevention, API key security
## Constraints and Limitations
- **Database Schema**: Must work with existing documents/chunks tables and PGVector functions
- **Embedding Dimensions**: Fixed at 1536 to match existing database schema
- **Search Result Limit**: Maximum 50 results to prevent performance issues
- **Query Length**: Maximum 1000 characters to prevent embedding API limits
## Future Enhancements (Optional)
- Search result caching for frequently asked questions
- Advanced query preprocessing (entity extraction, query expansion)
- Multi-language search support
- Search analytics and result ranking improvements
- Integration with document ingestion pipeline
## Assumptions Made
1. **Database Setup**: PGVector extension is properly installed and configured
2. **Existing Data**: Documents and chunks tables are populated with embedded content
3. **Search Patterns**: Users will primarily perform knowledge-based queries
4. **Performance**: Database has appropriate indexes for efficient vector operations
5. **API Access**: Stable internet connection for OpenAI API calls
6. **CLI Usage**: Primary interface will be command-line with rich formatting
## Approval Checklist
- [x] All core requirements defined (semantic search, auto-selection, summarization)
- [x] External dependencies identified (PostgreSQL/PGVector, OpenAI)
- [x] Security considerations addressed (env vars, input validation)
- [x] Testing strategy outlined (unit, integration, performance)
- [x] Success criteria measurable (accuracy, response time, auto-selection)
---
Generated: 2025-08-22
Status: Ready for Component Development

# Semantic Search Agent - Dependency Configuration
## Executive Summary
Minimal dependency configuration for a semantic search agent that connects to PostgreSQL with PGVector extension and uses OpenAI for embeddings and LLM operations. Focus on simplicity with essential environment variables and core Python packages.
## Environment Variables Configuration
### Essential Environment Variables (.env.example)
```bash
# LLM Configuration (REQUIRED)
LLM_PROVIDER=openai
OPENAI_API_KEY=your-openai-api-key-here
LLM_MODEL=gpt-4o-mini
LLM_BASE_URL=https://api.openai.com/v1
# Database Configuration (REQUIRED)
DATABASE_URL=postgresql://username:password@localhost:5432/semantic_search_db
# Application Settings
APP_ENV=development
LOG_LEVEL=INFO
DEBUG=false
MAX_RETRIES=3
TIMEOUT_SECONDS=30
# Search Configuration
DEFAULT_SEARCH_LIMIT=10
MAX_SEARCH_LIMIT=50
SIMILARITY_THRESHOLD=0.7
EMBEDDING_MODEL=text-embedding-3-small
EMBEDDING_DIMENSIONS=1536
# Connection Pooling
DB_POOL_MIN_SIZE=5
DB_POOL_MAX_SIZE=20
DB_TIMEOUT=30
```
### Environment Variable Validation
- **OPENAI_API_KEY**: Required, must not be empty
- **DATABASE_URL**: Required, must be valid PostgreSQL connection string
- **LLM_MODEL**: Default to "gpt-4o-mini" if not specified
- **EMBEDDING_MODEL**: Default to "text-embedding-3-small"
- **DEFAULT_SEARCH_LIMIT**: Integer between 1-50, default 10
## Settings Configuration (settings.py)
### BaseSettings Class Structure
```python
from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    """Application settings with environment variable support."""
    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        case_sensitive=False,
        extra="ignore"
    )
# LLM Configuration
llm_provider: str = Field(default="openai")
openai_api_key: str = Field(..., description="OpenAI API key")
llm_model: str = Field(default="gpt-4o-mini")
llm_base_url: str = Field(default="https://api.openai.com/v1")
# Database Configuration
database_url: str = Field(..., description="PostgreSQL connection URL")
db_pool_min_size: int = Field(default=5)
db_pool_max_size: int = Field(default=20)
db_timeout: int = Field(default=30)
# Search Configuration
embedding_model: str = Field(default="text-embedding-3-small")
embedding_dimensions: int = Field(default=1536)
default_search_limit: int = Field(default=10)
max_search_limit: int = Field(default=50)
similarity_threshold: float = Field(default=0.7)
# Application Settings
app_env: str = Field(default="development")
log_level: str = Field(default="INFO")
debug: bool = Field(default=False)
max_retries: int = Field(default=3)
timeout_seconds: int = Field(default=30)
```
## Model Provider Configuration (providers.py)
### Simple OpenAI Provider Setup
```python
from openai import OpenAI
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

from .settings import load_settings  # load_settings() lives alongside Settings in settings.py

def get_llm_model():
    """Get OpenAI model configuration."""
    settings = load_settings()
    provider = OpenAIProvider(
        base_url=settings.llm_base_url,
        api_key=settings.openai_api_key
    )
    return OpenAIModel(settings.llm_model, provider=provider)

def get_embedding_client():
    """Get OpenAI client for embeddings."""
    settings = load_settings()
    return OpenAI(api_key=settings.openai_api_key)
```
## Agent Dependencies (dependencies.py)
### Simple Dataclass Structure
```python
from dataclasses import dataclass
from typing import Optional

import asyncpg
from openai import OpenAI

@dataclass
class SemanticSearchDependencies:
"""Dependencies for semantic search agent."""
# Database connection
db_pool: Optional[asyncpg.Pool] = None
# OpenAI client for embeddings
openai_client: Optional[OpenAI] = None
# Configuration
embedding_model: str = "text-embedding-3-small"
embedding_dimensions: int = 1536
default_limit: int = 10
max_limit: int = 50
similarity_threshold: float = 0.7
# Runtime context
session_id: Optional[str] = None
user_id: Optional[str] = None
debug: bool = False
@classmethod
async def create(cls, settings: Settings, **overrides):
"""Create dependencies with initialized connections."""
# Initialize database pool
db_pool = await asyncpg.create_pool(
settings.database_url,
min_size=settings.db_pool_min_size,
max_size=settings.db_pool_max_size,
timeout=settings.db_timeout
)
# Initialize OpenAI client
openai_client = OpenAI(api_key=settings.openai_api_key)
return cls(
db_pool=db_pool,
openai_client=openai_client,
embedding_model=settings.embedding_model,
embedding_dimensions=settings.embedding_dimensions,
default_limit=settings.default_search_limit,
max_limit=settings.max_search_limit,
similarity_threshold=settings.similarity_threshold,
debug=settings.debug,
**overrides
)
async def cleanup(self):
"""Cleanup database connections."""
if self.db_pool:
await self.db_pool.close()
```
## Python Package Requirements
### Core Dependencies (requirements.txt)
```txt
# Pydantic AI Framework
pydantic-ai>=0.1.0
pydantic>=2.0.0
pydantic-settings>=2.0.0
# Environment Management
python-dotenv>=1.0.0
# OpenAI Integration
openai>=1.0.0
# Database
asyncpg>=0.28.0
# CLI and Utilities
rich>=13.0.0
click>=8.1.0
# Vector Operations
numpy>=1.24.0
# Async Support
httpx>=0.25.0
aiofiles>=23.0.0
# Development and Testing
pytest>=7.4.0
pytest-asyncio>=0.21.0
black>=23.0.0
ruff>=0.1.0
```
### Optional Performance Dependencies
```txt
# Enhanced Performance (optional)
uvloop>=0.19.0 # Faster async event loop on Unix
orjson>=3.9.0 # Faster JSON processing
```
## Database Connection Management
### Connection Pool Configuration
- **Minimum Pool Size**: 5 connections for baseline availability
- **Maximum Pool Size**: 20 connections to handle concurrent requests
- **Connection Timeout**: 30 seconds for robustness
- **Query Timeout**: 30 seconds for search operations
- **Retry Logic**: 3 attempts with exponential backoff
### Required Database Schema
```sql
-- Ensure PGVector extension is enabled
CREATE EXTENSION IF NOT EXISTS vector;
-- Expected table structure (not created by agent)
-- chunks table with embedding column (1536 dimensions)
-- match_chunks() and hybrid_search() functions available
```
## Security Configuration
### API Key Management
- Store all secrets in `.env` file (never committed)
- Validate API keys on startup
- Use environment variable validation
- Implement key rotation support for production
### Database Security
- Use parameterized queries only
- Enable SSL connections in production
- Implement connection pooling limits
- Log connection attempts for monitoring
### Input Validation
- Query length limits (max 1000 characters)
- Search result limits (1-50 range)
- Embedding dimension validation
- SQL injection prevention
## Error Handling Patterns
### Database Connection Errors
```python
# Retry logic with exponential backoff
max_retries = 3
base_delay = 1.0
```
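Those constants could drive a retry helper along these lines; this is a sketch of the pattern, not the project's actual implementation:

```python
import asyncio

async def with_retries(operation, max_retries: int = 3, base_delay: float = 1.0):
    """Run an async operation, retrying on connection errors with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return await operation()
        except (ConnectionError, TimeoutError):
            if attempt == max_retries - 1:
                raise  # out of attempts, surface the error
            await asyncio.sleep(base_delay * (2 ** attempt))  # 1s, 2s, ...
```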
### OpenAI API Errors
```python
# Handle rate limiting, API errors
# Fallback to cached embeddings when possible
```
### Search Operation Errors
```python
# Graceful degradation from hybrid to semantic search
# Empty result handling
# Timeout handling
```
## Testing Configuration
### Test Dependencies Structure
```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class TestDependencies:
"""Simplified dependencies for testing."""
# Mock database operations
mock_db_results: List[dict] = field(default_factory=list)
# Mock embedding responses
mock_embeddings: List[List[float]] = field(default_factory=list)
# Test configuration
debug: bool = True
default_limit: int = 5
```
### Test Environment Variables
```bash
# Test-specific overrides
DATABASE_URL=postgresql://test:test@localhost:5432/test_db
OPENAI_API_KEY=test-key-for-mock-responses
LLM_MODEL=gpt-4o-mini
DEBUG=true
LOG_LEVEL=DEBUG
```
## Performance Considerations
### Connection Pooling
- Database pool sized for expected concurrent users
- Connection reuse to minimize overhead
- Proper cleanup to prevent resource leaks
### Embedding Operations
- Cache frequently used embeddings
- Batch embedding generation when possible
- Use appropriate embedding model for cost/performance balance
### Memory Management
- Limit search result sizes
- Stream large responses when needed
- Clean up temporary objects
## Production Deployment
### Environment-Specific Settings
- **Development**: Debug enabled, verbose logging
- **Production**: Connection pooling optimized, minimal logging
- **Testing**: Mock connections, isolated database
### Monitoring and Logging
- Connection pool metrics
- Search operation timing
- API call tracking
- Error rate monitoring
## Quality Checklist
- [x] Essential environment variables defined
- [x] Single model provider (OpenAI) configured
- [x] Simple dataclass dependencies structure
- [x] Minimal Python packages identified
- [x] Database connection pooling specified
- [x] Security measures outlined
- [x] Error handling patterns defined
- [x] Testing configuration provided
- [x] Performance considerations addressed
- [x] Production deployment guidelines included
## Dependencies Summary
**Total Python Packages**: 11 core + 4 development
**Environment Variables**: 18 total (5 marked required)
**External Services**: 2 (PostgreSQL + PGVector, OpenAI API)
**Configuration Complexity**: Low - Single model provider, simple dataclass
**Initialization Time**: ~2-3 seconds for database pool + OpenAI client
This minimal dependency configuration provides all essential functionality while maintaining simplicity and avoiding over-engineering. The focus is on the core semantic search capabilities with proper database connection management and OpenAI integration.

# System Prompts for Semantic Search Agent
## Primary System Prompt
```python
SYSTEM_PROMPT = """
You are an expert knowledge retrieval assistant specializing in semantic search and intelligent information synthesis. Your primary purpose is to help users find relevant information from a knowledge base and provide clear, actionable insights.
Core Competencies:
1. Semantic similarity search using vector embeddings
2. Intelligent search strategy selection (semantic vs hybrid)
3. Information synthesis and coherent summarization
4. Source attribution and transparency
Your Approach:
- Automatically analyze queries to determine the optimal search strategy
- Use semantic search for conceptual queries and hybrid search for specific facts or names
- Retrieve relevant document chunks with similarity scoring
- Synthesize information from multiple sources into coherent, well-structured summaries
- Always provide source references for transparency and verification
Available Tools:
- auto_search: Automatically selects best search method for query
- semantic_search: Pure vector similarity search for conceptual queries
- hybrid_search: Combined vector + keyword search for specific information
Response Guidelines:
- Start with a brief summary of key findings
- Organize information logically with clear sections
- Include relevant quotes or excerpts when helpful
- End with source citations showing similarity scores
- If results are limited, acknowledge gaps and suggest refinements
Query Analysis:
- Conceptual queries (how, why, explain): Use semantic search
- Specific facts (who, when, what exactly): Use hybrid search
- Ambiguous queries: Default to auto_search for intelligent routing
- Always respect the requested result limit (1-50 documents)
Constraints:
- Never fabricate information not found in search results
- Acknowledge when information is incomplete or uncertain
- Maintain user privacy - do not log or retain query details
- Stay within context limits by prioritizing most relevant results
"""
```
## Dynamic Prompt Components (if applicable)
```python
# Context-aware prompt for search session management.
# Registered on the agent in agent.py via agent.system_prompt(get_search_context).
async def get_search_context(ctx: RunContext[AgentDependencies]) -> str:
"""Generate context-aware instructions based on search session state."""
context_parts = []
if ctx.deps.search_session_id:
context_parts.append(f"Search session: {ctx.deps.search_session_id}")
if ctx.deps.user_preferences:
if ctx.deps.user_preferences.get("detailed_sources"):
context_parts.append("User prefers detailed source information and citations.")
if ctx.deps.user_preferences.get("concise_summaries"):
context_parts.append("User prefers concise, bullet-point summaries.")
if ctx.deps.previous_queries:
context_parts.append(f"Previous queries in session: {len(ctx.deps.previous_queries)}")
context_parts.append("Build upon previous search context when relevant.")
return " ".join(context_parts) if context_parts else ""
```
## Prompt Variations
### Minimal Mode (for token optimization)
```python
MINIMAL_PROMPT = """
You are a semantic search assistant. Analyze user queries, select the best search method (semantic, hybrid, or auto), retrieve relevant documents, and provide clear summaries with source citations.
Tools: auto_search, semantic_search, hybrid_search
Guidelines:
- Use semantic search for concepts, hybrid for facts
- Synthesize findings into coherent summaries
- Always include source references
- Stay within result limits (1-50)
- Never fabricate information
"""
```
### Verbose Mode (for complex queries)
```python
VERBOSE_PROMPT = """
You are an expert knowledge retrieval and analysis assistant with advanced semantic search capabilities. Your role is to intelligently navigate large knowledge bases, extract relevant information, and provide comprehensive insights to user queries.
Core Expertise:
1. Advanced Query Analysis: Automatically categorize queries by intent and information type
2. Strategic Search Selection: Choose optimal retrieval method based on query characteristics
3. Multi-source Synthesis: Combine information from multiple documents into coherent narratives
4. Quality Assessment: Evaluate information relevance and reliability
5. Clear Communication: Present complex findings in accessible, well-structured formats
Search Strategy Decision Making:
- Conceptual/Theoretical Queries → Semantic search (vector similarity)
- Factual/Specific Queries → Hybrid search (vector + keyword)
- Complex/Ambiguous Queries → Auto-search (intelligent routing)
- Follow-up Questions → Consider session context and previous results
Information Processing Workflow:
1. Analyze query intent and information needs
2. Select appropriate search strategy and execute retrieval
3. Evaluate result relevance using similarity scores and content quality
4. Synthesize information across sources, noting convergence and contradictions
5. Structure response with executive summary, detailed findings, and source attribution
6. Identify information gaps and suggest query refinements if needed
Quality Standards:
- Minimum similarity threshold of 0.7 for included results
- Cross-reference information across multiple sources when possible
- Clearly distinguish between confirmed facts and interpretations
- Provide confidence indicators for synthesized insights
- Maintain complete source traceability for verification
"""
```
## Integration Instructions
1. Import in agent.py:
```python
from .prompts.system_prompts import SYSTEM_PROMPT, get_search_context
```
2. Apply to agent:
```python
agent = Agent(
model,
system_prompt=SYSTEM_PROMPT,
deps_type=AgentDependencies
)
# Add dynamic prompt for search context
agent.system_prompt(get_search_context)
```
## Prompt Optimization Notes
- Token usage: ~280 tokens for primary prompt
- Key behavioral triggers: query analysis, tool selection, summarization
- Tested scenarios: conceptual queries, factual lookups, multi-part questions
- Edge cases: empty results, low similarity scores, query ambiguity
- Search strategy logic clearly defined for consistent behavior
## Testing Checklist
- [x] Role clearly defined as semantic search expert
- [x] Capabilities comprehensive (search, analysis, synthesis)
- [x] Tool usage guidance explicit
- [x] Search strategy decision making clear
- [x] Output format specified (summaries + citations)
- [x] Error handling covered (empty results, low similarity)
- [x] Quality constraints included (similarity thresholds)
- [x] User interaction patterns defined
- [x] Context management addressed
- [x] Security considerations included (no data retention)

# Tools for Semantic Search Agent
## Tool Implementation Specifications
Based on the requirements from INITIAL.md, this agent needs 3 essential tools for semantic search functionality with automatic search type selection.
### Tool 1: semantic_search
**Purpose**: Execute semantic similarity search using PGVector embeddings
**Pattern**: `@agent.tool` (context-aware, needs database access)
**Parameters**:
- `query` (str): The search query to find semantically similar content
- `limit` (int, default=10): Maximum number of results to return (1-50)
**Implementation Pattern**:
```python
@agent.tool
async def semantic_search(
ctx: RunContext[AgentDependencies],
query: str,
limit: int = 10
) -> List[Dict[str, Any]]:
"""
Perform semantic similarity search using vector embeddings.
Args:
query: Natural language search query
limit: Maximum number of results (1-50)
Returns:
List of search results with content, similarity scores, and metadata
"""
```
**Functionality**:
- Generate query embedding using OpenAI text-embedding-3-small
- Call `match_chunks(query_embedding, match_count)` database function
- Return results with similarity scores above 0.7 threshold
- Handle database connection errors with retry logic
- Validate limit parameter (1-50 range)
**Error Handling**:
- Retry database connections up to 3 times
- Fallback to empty results if embedding generation fails
- Log search metrics for performance monitoring
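The flow above can be sketched with the embedding and database calls injected as callables, which keeps the threshold and bounds logic testable without a live PGVector instance (the names here are illustrative, not the agent's real signatures):

```python
from typing import Any, Awaitable, Callable, Dict, List

async def run_semantic_search(
    embed: Callable[[str], Awaitable[List[float]]],
    fetch_chunks: Callable[[List[float], int], Awaitable[List[Dict[str, Any]]]],
    query: str,
    limit: int = 10,
    threshold: float = 0.7,
) -> List[Dict[str, Any]]:
    """Embed the query, fetch candidate chunks, and filter by similarity."""
    if not query.strip():
        raise ValueError("query must be non-empty")
    limit = max(1, min(limit, 50))  # enforce the 1-50 bound
    embedding = await embed(query)
    rows = await fetch_chunks(embedding, limit)  # e.g. a match_chunks() wrapper
    return [r for r in rows if r.get("similarity", 0.0) >= threshold]
```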
### Tool 2: hybrid_search
**Purpose**: Execute combined semantic + keyword search for enhanced results
**Pattern**: `@agent.tool` (context-aware, needs database access)
**Parameters**:
- `query` (str): The search query for both semantic and text matching
- `limit` (int, default=10): Maximum number of results to return (1-50)
- `text_weight` (float, default=0.3): Weight for text search component (0.0-1.0)
**Implementation Pattern**:
```python
@agent.tool
async def hybrid_search(
ctx: RunContext[AgentDependencies],
query: str,
limit: int = 10,
text_weight: float = 0.3
) -> List[Dict[str, Any]]:
"""
Perform hybrid search combining semantic and keyword matching.
Args:
query: Search query for both vector and text search
limit: Maximum number of results (1-50)
text_weight: Weight for text search component (0.0-1.0)
Returns:
List of search results with combined ranking scores
"""
```
**Functionality**:
- Generate query embedding for semantic component
- Call `hybrid_search(query_embedding, query_text, match_count, text_weight)` database function
- Combine vector similarity with full-text search results
- Return ranked results with composite scores
- Validate text_weight parameter (0.0-1.0 range)
**Error Handling**:
- Fallback to pure semantic search if text search component fails
- Retry database operations with exponential backoff
- Handle malformed query text gracefully
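The fallback behavior can be expressed as a thin wrapper over the two search tools; in this sketch the callables stand in for the real tool implementations:

```python
from typing import Any, Awaitable, Callable, Dict, List

SearchFn = Callable[..., Awaitable[List[Dict[str, Any]]]]

async def hybrid_with_fallback(
    hybrid_fn: SearchFn,
    semantic_fn: SearchFn,
    query: str,
    limit: int = 10,
    text_weight: float = 0.3,
) -> List[Dict[str, Any]]:
    """Try hybrid search first; degrade to pure semantic search on failure."""
    try:
        return await hybrid_fn(query, limit=limit, text_weight=text_weight)
    except Exception:
        return await semantic_fn(query, limit=limit)
```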
### Tool 3: auto_search
**Purpose**: Automatically select optimal search type based on query analysis
**Pattern**: `@agent.tool` (context-aware, orchestrates other tools)
**Parameters**:
- `query` (str): The search query to analyze and execute
- `limit` (int, default=10): Maximum number of results to return (1-50)
**Implementation Pattern**:
```python
@agent.tool
async def auto_search(
ctx: RunContext[AgentDependencies],
query: str,
limit: int = 10
) -> Dict[str, Any]:
"""
Automatically select and execute optimal search strategy.
Args:
query: Natural language search query
limit: Maximum number of results (1-50)
Returns:
Search results with metadata about search type used
"""
```
**Functionality**:
- Analyze query characteristics to determine optimal search type
- Route to semantic_search for conceptual/abstract queries
- Route to hybrid_search for queries with specific keywords or names
- Return results with metadata indicating search method used
- Default to semantic search if classification is uncertain
**Search Type Classification Logic**:
- **Semantic Search**: Abstract concepts, "what is", "how to", philosophical queries
- **Hybrid Search**: Queries with proper nouns, specific terms, technical jargon
- **Decision Factors**: Query length, presence of quotes, technical terminology
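A first-pass heuristic for this classification might look like the following; the markers and regex are assumptions chosen to illustrate the decision factors above, not a tuned classifier:

```python
import re

CONCEPTUAL_MARKERS = ("what is", "how to", "how do", "why", "explain")

def classify_query(query: str) -> str:
    """Return 'semantic' or 'hybrid' based on simple query heuristics."""
    q = query.lower().strip()
    if '"' in query:
        return "hybrid"  # quoted phrases suggest exact keyword matching
    if any(q.startswith(m) or f" {m} " in q for m in CONCEPTUAL_MARKERS):
        return "semantic"
    if re.search(r"\b[A-Z][a-z]+", query[1:]):
        return "hybrid"  # capitalised word past position 0: likely a proper noun
    return "semantic"  # default on uncertainty, per the error-handling rule
```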
**Error Handling**:
- Default to semantic search on classification failure
- Cascade through search types if initial method fails
- Log decision reasoning for analytics
## Utility Functions
### Database Connection Management
```python
async def get_database_connection(ctx: RunContext[AgentDependencies]) -> asyncpg.Connection:
"""Get database connection with retry logic."""
```
### Embedding Generation
```python
async def generate_embedding(ctx: RunContext[AgentDependencies], text: str) -> List[float]:
"""Generate embedding using OpenAI API with caching."""
```
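A hedged sketch of this helper with a simple in-memory TTL cache. The real version is async and reads its client from `ctx.deps`; `client.embeddings.create` follows the openai v1 client API, but the cache policy shown is an assumption:

```python
import time
from typing import Dict, List, Tuple

_EMBED_CACHE: Dict[str, Tuple[float, List[float]]] = {}
CACHE_TTL_SECONDS = 300.0  # five-minute TTL, per the performance notes

def generate_embedding(client, text: str, model: str = "text-embedding-3-small") -> List[float]:
    """Generate an embedding, serving repeated texts from the cache."""
    now = time.monotonic()
    cached = _EMBED_CACHE.get(text)
    if cached and now - cached[0] < CACHE_TTL_SECONDS:
        return cached[1]  # cache hit: skip the API call
    response = client.embeddings.create(model=model, input=text)
    vector = response.data[0].embedding
    _EMBED_CACHE[text] = (now, vector)
    return vector
```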
### Result Processing
```python
def format_search_results(results: List[Dict[str, Any]], search_type: str) -> Dict[str, Any]:
"""Standardize result format across search types."""
```
## Parameter Validation
All tools include validation for:
- Query length: 1-1000 characters
- Result limit: 1-50 results
- Text weight: 0.0-1.0 for hybrid search
- Non-empty string queries
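Collected into one helper, these checks might look like the sketch below; clamping (rather than raising) on out-of-range limits is a design choice assumed here:

```python
from typing import Optional, Tuple

def validate_search_params(
    query: str,
    limit: int = 10,
    text_weight: Optional[float] = None,
) -> Tuple[str, int, Optional[float]]:
    """Validate and normalise shared search-tool parameters."""
    if not query or not query.strip():
        raise ValueError("query must be a non-empty string")
    if len(query) > 1000:
        raise ValueError("query exceeds the 1000-character limit")
    limit = max(1, min(limit, 50))  # clamp into the 1-50 range
    if text_weight is not None and not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must be between 0.0 and 1.0")
    return query.strip(), limit, text_weight
```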
## Performance Considerations
- **Caching**: Cache embeddings for repeated queries (5-minute TTL)
- **Connection Pooling**: Reuse database connections across tool calls
- **Rate Limiting**: Respect OpenAI API rate limits with retry logic
- **Timeout Handling**: 30-second timeout for database operations
## Dependencies Required
```python
from typing import Dict, Any, List, Optional
from pydantic_ai import RunContext
import asyncpg
import openai
import logging
import asyncio
from tenacity import retry, stop_after_attempt, wait_exponential
```
## Integration Notes
- Tools work with `AgentDependencies` containing database URL and API keys
- All tools return consistent result format for easy chaining
- Error responses include helpful context for user feedback
- Logging integrated for search analytics and debugging
## Testing Strategy
- **Unit Tests**: Individual tool parameter validation and logic
- **Integration Tests**: End-to-end database connectivity and search operations
- **Mock Tests**: Test with TestModel to avoid external API calls
- **Performance Tests**: Search response times under load
This tool specification provides the minimal yet complete set of functions needed for the semantic search agent, following Pydantic AI best practices with proper error handling, parameter validation, and performance optimization.