From 1576582ef571eb370bbad9bf29ab124d3f3569fe Mon Sep 17 00:00:00 2001
From: "@aaronjmars" <61592645+aaronjmars@users.noreply.github.com>
Date: Sat, 28 Jun 2025 20:54:13 +0200
Subject: [PATCH] Update README.md

---
 README.md | 324 +++++++++++++++++++++++-------------------------------
 1 file changed, 139 insertions(+), 185 deletions(-)

diff --git a/README.md b/README.md
index 7dac982..bc4248a 100644
--- a/README.md
+++ b/README.md
@@ -1,66 +1,96 @@
 # OpenDia ✳️

 > **The open alternative to Dia**
-> Connect your browser to AI models through MCP. No browser switching needed—works seamlessly with Chrome and Arc.
+> Connect your browser to AI models. No browser switching needed—works seamlessly with any Chromium browser including Chrome & Arc.

 [![npm version](https://badge.fury.io/js/opendia.svg)](https://badge.fury.io/js/opendia)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

+## 📺 See it in Action
+
+> **Note**: You can add images or videos here by placing them in your repository and linking them:
+> ```markdown
+> ![OpenDia Demo](./demo.gif)
+> ```
+> or embed a video:
+> ```markdown
+> [![OpenDia Demo Video](https://img.youtube.com/vi/YOUR_VIDEO_ID/0.jpg)](https://www.youtube.com/watch?v=YOUR_VIDEO_ID)
+> ```
+
 ## 🚀 What is OpenDia?

-OpenDia bridges your browser and AI models through the Model Context Protocol (MCP). It exposes 17 powerful browser automation tools that let AI models interact with web pages, manage tabs, extract content, and automate workflows—all with anti-detection capabilities for major platforms.
+OpenDia lets AI models control your browser automatically. **The key advantage? It leverages everything you already have**—your logged-in accounts, saved passwords, cookies, wallets, and browsing history. No need to start from scratch or switch contexts.
+
+**🔑 Use Your Existing Digital Life:**
+- ✅ **Logged-in accounts**: Post to Twitter, LinkedIn, Facebook with your existing sessions
+- ✅ **Browser data**: Access your bookmarks, history, and saved passwords
+- ✅ **Extensions & wallets**: Use MetaMask, password managers, or any installed extensions
+- ✅ **Cookies & sessions**: Stay authenticated across all your favorite sites
+- ✅ **Local testing**: Perfect for development with Cursor - test with real user sessions

 **✨ Key Benefits:**
-- 🔄 **Universal AI Support**: Works with Sonnet, o3 and even local models
+- 🔄 **Universal AI Support**: Works with Claude, ChatGPT, Cursor, and local models
 - 🎯 **Anti-Detection**: Specialized bypasses for Twitter/X, LinkedIn, Facebook
-- 📱 **Smart Automation**: Two-phase intelligent page analysis
-- 🛡️ **Privacy-First**: Runs locally, your data stays with you
-- ⚡ **Zero Setup**: Get started with `npx opendia`
+- 📱 **Smart Automation**: AI understands your pages and finds the right elements
+- 🛡️ **Privacy-First**: Everything runs locally, your data stays with you
+- ⚡ **Zero Setup**: Get started with one command

-## 🎬 Demo Workflows
+## 🌐 Browser Support

-**Real workflows you can run today:**
+Works with **any Chromium-based browser**:
+- ✅ **Google Chrome**
+- ✅ **Arc Browser**
+- ✅ **Microsoft Edge**
+- ✅ **Brave Browser**
+- ✅ **Opera**
+- ✅ **Vivaldi**
+- ✅ **Any Chromium variant**
+
+Perfect for **Cursor users** who want to automate their local testing and development workflows!
+
+## 🎬 What You Can Do
+
+**Real workflows you can try today:**

 ### 📰 Content & Social Media
-- **Daily Reading Summary**: Summarize articles you've read today & auto-post to Twitter
-- **Smart Recommendations**: Get personalized content suggestions based on your browsing history
-- **Bookmark Analysis**: Browse & summarize all your X/Twitter bookmarks
-- **Article Research**: Ask questions about any webpage and get instant answers
+- **"Summarize all the articles I read today and post a Twitter thread about the key insights"**
+- **"Find interesting articles related to AI from my bookmarks and create a reading list"**
+- **"Read this article and post a thoughtful comment on the LinkedIn version"**
+- **"Check my recent Twitter bookmarks and summarize the main themes"**

-### 📧 Productivity
-- **Email Intelligence**: Browse and analyze your latest emails
-- **Tab Management**: Organize and switch between research sessions
-- **History Mining**: Find that article you read last week with natural language queries
-- **Content Extraction**: Turn any webpage into structured data
+### 📧 Productivity & Research
+- **"Browse my latest emails and tell me what needs urgent attention"**
+- **"Find all the GitHub repos I visited this week and create a summary report"**
+- **"Extract the main points from this research paper and save them to my notes"**
+- **"Search my browsing history for that article about AI safety I read last month"**

-### 🤖 Advanced Automation
-- **Multi-tab Workflows**: Coordinate actions across multiple browser tabs
-- **Form Filling**: Auto-fill forms with anti-detection on social platforms
-- **Content Publishing**: Draft and publish social media posts with AI assistance
-- **Research Automation**: Gather information from multiple sources automatically
+### 🤖 Development & Testing (Perfect for Cursor!)
+- **"Test my web app's signup flow and take screenshots at each step"**
+- **"Fill out this form with test data and check if validation works"**
+- **"Navigate through my app and check if all the buttons work properly"**
+- **"Use my connected wallet to test this DeFi interface"**
+
+### 🔄 Advanced Automation
+- **"Open tabs for all my daily news sources and summarize the top stories"**
+- **"Draft replies to my unread messages based on the context"**
+- **"Monitor this webpage and notify me when the content changes"**
+- **"Automatically bookmark interesting articles I'm reading"**

 ## ⚡ Quick Start

-### Option 1: Instant Setup (Recommended)
+### 1. Start the Server
 ```bash
-# Start the MCP server (no installation required)
 npx opendia
 ```

-### Option 2: Global Installation
-```bash
-npm install -g opendia
-opendia
-```
-
-### Setup the Chrome Extension
+### 2. Install the Browser Extension
 1. Download from [releases](https://github.com/aaronjmars/opendia/releases)
-2. Go to `chrome://extensions/`
+2. Go to `chrome://extensions/` (or your browser's extension page)
 3. Enable "Developer mode"
 4. Click "Load unpacked" and select the extension folder

-### Connect to your AI client
-Add to your MCP configuration:
+### 3. Connect to Your AI
+**For Claude Desktop**, add to your configuration:
 ```json
 {
   "mcpServers": {
@@ -72,190 +102,114 @@ Add to your MCP configuration:
 }
 ```

-## 🛠️ Available Tools (17 Total)
+**For Cursor or other AI tools**, use the same configuration or follow their specific setup instructions.

-### 🌐 Page Analysis & Interaction
-| Tool | Description | Use Case |
-|------|-------------|----------|
-| `page_analyze` | Two-phase intelligent page analysis | Find elements to interact with |
-| `page_extract_content` | Smart content extraction with summarization | Get article text, search results, social posts |
-| `element_click` | Click elements with smart targeting | Press buttons, follow links |
-| `element_fill` | Fill forms with anti-detection bypass | Post tweets, fill login forms |
-| `element_get_state` | Check element properties and states | Verify if buttons are clickable |
-| `page_navigate` | Navigate to URLs with wait conditions | Go to specific pages |
-| `page_wait_for` | Wait for elements or conditions | Ensure page loads before actions |
-| `page_scroll` | Scroll pages in any direction | Navigate long pages, infinite scroll |
+## 🛠️ Capabilities

-### 📑 Tab Management
-| Tool | Description | Use Case |
-|------|-------------|----------|
-| `tab_create` | Create new tabs with options | Open multiple research tabs |
-| `tab_close` | Close specific or current tabs | Clean up workspace |
-| `tab_list` | Get all open tabs with details | See what's currently open |
-| `tab_switch` | Switch between tabs by ID | Navigate between research |
+OpenDia gives AI models **17 powerful browser tools**:

-### 📊 Browser Data Access
-| Tool | Description | Use Case |
-|------|-------------|----------|
-| `get_bookmarks` | Access and search bookmarks | Find saved resources |
-| `add_bookmark` | Create new bookmarks | Save important pages |
-| `get_history` | Search browser history with filters | Find previously visited content |
-| `get_selected_text` | Get currently selected text | Work with highlighted content |
-| `get_page_links` | Extract all links from current page | Analyze page structure, find resources |
+### 🎯 Smart Page Understanding
+- **Analyze any webpage** - AI automatically finds buttons, forms, and interactive elements
+- **Extract content intelligently** - Get clean text from articles, social posts, or search results
+- **Understand context** - AI knows what type of page it's looking at and how to interact with it

-## 🎯 Anti-Detection Features
+### 🖱️ Natural Interactions
+- **Click anything** - Buttons, links, menus - AI finds and clicks the right elements
+- **Fill forms smartly** - Works even on complex sites like Twitter, LinkedIn, Facebook
+- **Navigate seamlessly** - Go to pages, scroll, wait for content to load
+- **Handle modern web apps** - Bypasses detection on social platforms

-OpenDia includes specialized bypasses for platforms that typically block automation:
+### 📑 Tab & Window Management
+- **Multi-tab workflows** - Open, close, switch between tabs automatically
+- **Organize your workspace** - Let AI manage your browser tabs efficiently
+- **Coordinate complex tasks** - Work across multiple sites simultaneously

-- **🐦 Twitter/X**: Direct DOM manipulation bypassing React detection
-- **💼 LinkedIn**: Enhanced focus sequences for professional posting
-- **📘 Facebook**: Event simulation for social interactions
-- **🌐 Universal**: Fallback methods for any website
+### 📊 Access Your Browser Data
+- **Bookmarks & History** - Find that article you read last week
+- **Current page content** - Get selected text, links, or full page content
+- **Real-time information** - Work with whatever's currently on your screen

-## 🏗️ Architecture
+### 🛡️ Anti-Detection Features
+- **Social media posting** - Bypass automation detection on Twitter/X, LinkedIn, Facebook
+- **Natural interactions** - Mimics human behavior to avoid triggering security measures
+- **Reliable automation** - Works consistently even on sites that block typical automation tools
+
+## 💬 Example Prompts to Try
+
+Once everything is set up, try asking your AI:
+
+**Content Creation:**
+> *"Read the article on this page and create a Twitter thread summarizing the main points"*
+
+**Research & Analysis:**
+> *"Look through my browser history from this week and find articles about machine learning. Summarize the key trends."*
+
+**Social Media Management:**
+> *"Check my Twitter bookmarks and organize them into categories. Create a summary of each category."*
+
+**Productivity:**
+> *"Open tabs for my usual morning reading sites and give me a briefing of today's top stories"*
+
+**Development Testing:**
+> *"Fill out this contact form with test data and check if the submission works properly"*
+
+**Personal Assistant:**
+> *"Find that GitHub repo I was looking at yesterday about React components and bookmark it for later"*
+
+## 🏗️ How It Works

 ```mermaid
 graph LR
-    A[AI Model] <--> B[MCP Protocol]
-    B <--> C[OpenDia Server]
-    C <--> D[Chrome Extension]
-    D <--> E[Web Pages]
+    A[AI Model] --> B[OpenDia Server]
+    B --> C[Browser Extension]
+    C --> D[Your Browser]
+    D --> E[Any Website]
 ```

-## 🔧 Configuration
-
-### Environment Variables
-```bash
-PORT=3000 # WebSocket server port
-HEALTH_PORT=3001 # Health check HTTP port
-NODE_ENV=development
-```
-
-### CLI Options
-```bash
-npx opendia --help # Show help
-npx opendia --port 3005 # Custom WebSocket port
-npx opendia --health-port 3002 # Custom health check port
-npx opendia --version # Show version
-```
-
-## 🚦 System Requirements
-
-- **Node.js**: Version 14 or higher
-- **Browser**: Chrome or Arc (Chromium-based)
-- **AI Client**: Any MCP-compatible client (Claude Desktop, custom implementations)
-
-## 📖 Examples
-
-### Extract and Summarize Current Page
-```python
-# Using the MCP tools
-result = await call_tool("page_extract_content", {
-    "content_type": "article",
-    "summarize": True
-})
-print(f"Summary: {result['summary']['preview']}")
-```
-
-### Auto-post to Twitter
-```python
-# Analyze page for tweet composition
-elements = await call_tool("page_analyze", {
-    "intent_hint": "post_tweet",
-    "phase": "discover"
-})
-
-# Fill the tweet
-await call_tool("element_fill", {
-    "element_id": elements['elements'][0]['id'],
-    "value": "Just discovered something amazing! 🚀"
-})
-
-# Click tweet button
-await call_tool("element_click", {
-    "element_id": elements['elements'][1]['id']
-})
-```
-
-### Research Multiple Articles
-```python
-# Get browser history for today
-history = await call_tool("get_history", {
-    "keywords": "AI research",
-    "start_date": "2024-01-01T00:00:00Z",
-    "max_results": 10
-})
-
-# Visit each article and extract content
-for item in history['history_items']:
-    await call_tool("page_navigate", {"url": item['url']})
-    content = await call_tool("page_extract_content", {
-        "content_type": "article",
-        "summarize": True
-    })
-    print(f"Article: {content['summary']['title']}")
-```
+1. **You ask** your AI to do something browser-related
+2. **AI calls** OpenDia tools to understand and interact with pages
+3. **OpenDia controls** your browser through the extension
+4. **You get results** - AI can see what happened and respond intelligently

 ## 🔒 Security & Privacy

-**Important Security Notice**: This extension requires broad browser permissions and establishes localhost connections. Use responsibly:
+**Your data stays private**:
+- ✅ **Everything runs locally** - No cloud processing of your browsing data
+- ✅ **You control access** - Extension only works when you want it to
+- ✅ **Open source** - Full transparency of what the code does
+- ✅ **No tracking** - We don't collect or store any of your information

-- ✅ **Local-first**: All data processing happens on your machine
-- ✅ **No tracking**: We don't collect or transmit your browsing data
-- ✅ **Open source**: Full transparency of what the code does
-- ⚠️ **Developer tool**: Intended for technical users who understand the risks
-- ⚠️ **Localhost only**: WebSocket server binds to localhost by default
+**Important**: This tool requires broad browser permissions to function. Only use with AI models you trust, and in environments where you're comfortable with browser automation.

 ## 🤝 Contributing

-We welcome contributions! Here's how you can help:
+Love to have your help making OpenDia better!

-### Adding New Tools
-1. Add tool definition in `background.js` → `getAvailableTools()`
-2. Implement handler in `handleMCPRequest()`
-3. Test with the extension
-4. Submit a PR
-
-### Development Setup
+### Quick Development Setup
 ```bash
-git clone https://github.com/yourusername/opendia.git
+git clone https://github.com/aaronjmars/opendia.git
 cd opendia

-# Start MCP server
+# Start the server
 cd opendia-mcp
 npm install
 npm start

-# Load extension in Chrome
-# Go to chrome://extensions/
-# Enable Developer mode
-# Load unpacked: ./opendia-extension
+# Load extension in your browser
+# Go to chrome://extensions/ → Developer mode → Load unpacked: ./opendia-extension
 ```

-### Roadmap
-- [ ] Firefox extension support
-- [ ] Additional social platform bypasses
-- [ ] Visual element selector
-- [ ] Workflow recorder/playback
-- [ ] Enterprise authentication support
+### Ways to Contribute
+- 🐛 **Report bugs** via [GitHub Issues](https://github.com/aaronjmars/opendia/issues)
+- 💡 **Suggest features** in [Discussions](https://github.com/aaronjmars/opendia/discussions)
+- 🔧 **Add new browser capabilities**
+- 📖 **Improve documentation**
+- 🧪 **Test with different AI models**

 ## 📝 License

 MIT License - see [LICENSE](LICENSE) for details.

-## 🆘 Support
-
-- 📖 **Documentation**: [GitHub Wiki](https://github.com/aaronjmars/opendia/wiki)
-- 🐛 **Issues**: [GitHub Issues](https://github.com/aaronjmars/opendia/issues)
-- 💬 **Discussions**: [GitHub Discussions](https://github.com/aaronjmars/opendia/discussions)
-
-## 🙏 Acknowledgments
-
-Built with:
-- [Model Context Protocol](https://modelcontextprotocol.io/) by Anthropic
-- Chrome Extensions API
-- WebSocket for real-time communication
-
 ---

-**Made with ✨ by developers who believe AI should seamlessly integrate with your workflow.**
\ No newline at end of file
+**Ready to supercharge your browser with AI? Get started with `npx opendia`! 🚀**