A browser assistant with embedded AI capabilities. This project combines a React frontend, a Tauri (Rust) backend, and a Node.js agent to create a browser that can answer questions about web pages, interact with page content, and extract information.
- Embedded browser with navigation controls
- AI assistant that answers questions about web content
- DOM interaction tools (click buttons, search content, extract data)
- Chat interface with syntax highlighting for agent responses
- Responsive UI with collapsible sidebar
- ReAct agent pattern for step-by-step reasoning
Before you begin, ensure you have the following installed:
- Node.js (v16 or later)
- Rust
- Tauri CLI (`npm install -g @tauri-apps/cli`)
- An OpenAI API Key (for the agent functionality)
git clone https://github.com/AIAnytime/agent-browser.git
cd agent-browser
# Install frontend dependencies
npm install
# Install backend dependencies
cd backend
npm install
cd ..
Create a `.env` file in the project root with the following content:
OPENAI_API_KEY=your_openai_api_key_here
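A missing or placeholder key is a common reason for the agent to stay silent, so it can help to validate the file before launching. The following is a minimal sketch, not the project's actual loading code: `parseEnv` and `requireApiKey` are hypothetical helpers, and the parser handles only simple `KEY=value` lines.

```javascript
// Minimal sketch: parse .env-style text and confirm OPENAI_API_KEY is usable.
// parseEnv and requireApiKey are hypothetical helpers, not part of the project.
function parseEnv(text) {
  const vars = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank lines and comments.
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq === -1) continue; // not a KEY=value line
    const key = line.slice(0, eq).trim();
    // Strip one pair of matching surrounding quotes from the value, if present.
    const value = line.slice(eq + 1).trim().replace(/^(['"])(.*)\1$/, "$2");
    vars[key] = value;
  }
  return vars;
}

function requireApiKey(vars) {
  const key = vars.OPENAI_API_KEY;
  if (!key || key === "your_openai_api_key_here") {
    throw new Error("OPENAI_API_KEY is missing or still set to the placeholder");
  }
  return key;
}
```

To check a real file, read it with `require("node:fs").readFileSync(".env", "utf8")` and pass the text through `parseEnv` and then `requireApiKey`.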
From the project root directory:
# Development mode with hot-reloading
npm run tauri dev
# Or build for production
npm run tauri build
- When the application starts, you'll see an embedded browser and a sidebar.
- Use the navigation bar at the top to enter URLs and browse the web.
- The sidebar contains the AI assistant chat interface.
- While browsing, type your question in the chat input at the bottom of the sidebar.
- The AI will analyze the current webpage and respond with relevant information.
- For specific content, you can also right-click on selected text to ask about it directly.
The AI assistant can use several tools to interact with web pages:
- search_dom: Search the page for specific content
- click_button: Click buttons or links on the page
- scrape_table: Extract table data from the page
- extract_prices: Find price information on the page
- navigate_to: Navigate to a different URL
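One way to picture how the agent reaches these tools is a registry that maps each tool name to a handler. The sketch below is an assumption about the shape, not the code in `agent.js`: the `tools` map and `runTool` are hypothetical, only two tools are stubbed, and the stubs work on a plain string of page text rather than the live DOM.

```javascript
// Hypothetical tool registry: maps tool names to handler functions.
// The handlers are stubs operating on page text; the real tools presumably
// run against the live DOM inside the web view.
const tools = new Map([
  [
    "search_dom",
    (page, query) =>
      page.text
        .split("\n")
        .filter((line) => line.toLowerCase().includes(query.toLowerCase())),
  ],
  // Matches dollar amounts such as $99 or $105.50.
  ["extract_prices", (page) => page.text.match(/\$\d+(?:\.\d{2})?/g) ?? []],
]);

function runTool(name, page, ...args) {
  const handler = tools.get(name);
  if (!handler) {
    // Report the failure as data so the agent can treat it as an observation.
    return { ok: false, error: `Unknown tool: ${name}` };
  }
  return { ok: true, result: handler(page, ...args) };
}
```

Returning an error object instead of throwing lets an unknown tool name flow back to the agent as an ordinary observation, which it can then correct on the next step.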
agent-browser/
├── src/ # React frontend code
│ ├── App.tsx # Main application component
│ └── App.css # Styles
├── src-tauri/ # Rust backend code
│ └── src/
│ └── main.rs # Tauri application entry point
├── backend/ # Node.js agent code
│ ├── agent.js # ReAct agent implementation
│ └── package.json # Node dependencies
└── package.json # Frontend dependencies
- Agent not responding: Ensure your OpenAI API key is set correctly in the `.env` file.
- DOM interaction issues: Check the browser console for errors related to DOM interaction tools.
- Build errors: Verify that you have the correct versions of Node.js, Rust, and Tauri CLI installed.
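For the Node.js part of that check, a short script can compare the running version against the v16 minimum listed in the prerequisites. This is only a sketch: `meetsMinimum` is a hypothetical helper, and it does not cover the Rust or Tauri CLI versions.

```javascript
// Sketch: compare a Node.js version string against a minimum major version.
// meetsMinimum is a hypothetical helper, not part of the project.
function meetsMinimum(version, minMajor) {
  // Accept both "v16.0.0" and "16.0.0".
  const major = Number.parseInt(version.replace(/^v/, "").split(".")[0], 10);
  return Number.isInteger(major) && major >= minMajor;
}

const current = process.versions.node;
console.log(
  meetsMinimum(current, 16)
    ? `Node.js ${current} meets the v16 minimum`
    : `Node.js ${current} is older than v16`
);
```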
This browser uses the ReAct (Reasoning + Acting) pattern for AI agents:
- Thought: The AI analyzes the current context and formulates a plan
- Action: The AI executes a tool to gather information or modify the page
- Observation: The AI receives feedback from the executed tool
- Final Answer: After sufficient reasoning, the AI provides a comprehensive answer
Example interaction flow:
User: Find the cheapest flight on this page.
Thought: I need to read all prices on the page.
Action: extract_prices()
Observation: $99, $105, $129
Thought: $99 is cheapest. I will return it.
Final Answer: The cheapest flight is $99.
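The flow above can be sketched as a small loop that alternates between asking the model for its next step and running the tool it names. This is a minimal illustration under stated assumptions, not the implementation in `agent.js`: `parseStep`, `runAgent`, and `scriptedModel` are hypothetical names, the `Action: tool(input)` line format is taken from the example, and the model is a scripted stub so the sketch runs without an API key.

```javascript
// Minimal ReAct loop sketch. All names here are hypothetical.
const ACTION_RE = /^Action:\s*(\w+)\((.*)\)\s*$/m;
const FINAL_RE = /^Final Answer:\s*(.+)$/m;

// Classify one model reply as a final answer, a tool call, or unparseable.
function parseStep(text) {
  const final = text.match(FINAL_RE);
  if (final) return { type: "final", answer: final[1].trim() };
  const action = text.match(ACTION_RE);
  if (action) return { type: "action", tool: action[1], input: action[2].trim() };
  return { type: "unknown" };
}

// Thought -> Action -> Observation, repeated until a Final Answer or maxSteps.
async function runAgent(question, model, tools, maxSteps = 5) {
  const transcript = [`User: ${question}`];
  for (let step = 0; step < maxSteps; step++) {
    const output = await model(transcript.join("\n"));
    transcript.push(output);
    const parsed = parseStep(output);
    if (parsed.type === "final") return { answer: parsed.answer, transcript };
    if (parsed.type === "action") {
      const tool = tools.get(parsed.tool);
      const observation = tool
        ? await tool(parsed.input)
        : `Unknown tool: ${parsed.tool}`;
      transcript.push(`Observation: ${observation}`);
    } else {
      transcript.push("Observation: Could not parse an Action or Final Answer.");
    }
  }
  return { answer: null, transcript }; // step budget exhausted
}

// Stand-in for the LLM call: replays fixed replies in order.
const scriptedModel = (replies) => {
  let turn = 0;
  return async () => replies[turn++];
};

const replies = [
  "Thought: I need to read all prices on the page.\nAction: extract_prices()",
  "Thought: $99 is cheapest. I will return it.\nFinal Answer: The cheapest flight is $99.",
];
const demoTools = new Map([["extract_prices", async () => "$99, $105, $129"]]);

runAgent("Find the cheapest flight on this page.", scriptedModel(replies), demoTools)
  .then(({ answer }) => console.log(answer)); // prints "The cheapest flight is $99."
```

The `maxSteps` cap is a design choice worth keeping in any real loop: it bounds cost and prevents the agent from cycling indefinitely when the model never produces a final answer.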
| Layer | Technology | Role |
|---|---|---|
| UI Shell | Tauri + React | Browser window and interface |
| Backend | Rust | IPC coordination, command execution |
| Agent | Node.js + OpenAI API | ReAct agent implementation |
| Web View | Tauri WebView | Embedded browser functionality |
| Communication | Tauri IPC Events | Between React UI, Rust backend, and agent |
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add some amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- OpenAI for the GPT models powering the agent
- Tauri for the cross-platform windowing solution
- React for the frontend framework