bigdata5911/AI-Voice-Assitant

AI Voice Assistant

An AI-powered voice assistant that holds natural conversations through speech recognition and synthesis. It integrates tools for productivity tasks: calendar management, contact handling, email composition, web search, and knowledge base access.

Overview

This project demonstrates advanced AI capabilities by combining:

  • Speech-to-Text (STT): Real-time voice input processing
  • Text-to-Speech (TTS): Natural voice response generation
  • Multi-tool Integration: Seamless access to productivity tools
  • Conversational AI: Natural language understanding and response

Core Features

Voice Interaction

  • Real-time speech recognition for natural conversation
  • High-quality text-to-speech synthesis
  • Seamless voice-based interaction with the AI assistant

Productivity Tools

| Tool | Functionality |
|------|---------------|
| Calendar Management | Schedule and manage Google Calendar events |
| Contact Management | Add and retrieve Google Contacts information |
| Email Composition | Send emails via Gmail integration |
| Web Search | Real-time information retrieval via Tavily API |
| Knowledge Base | Access personal documents and saved information |
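
One way the tools listed above could be exposed to the assistant is through a small name-to-handler registry. The sketch below is illustrative only: the `TOOLS` registry, the `tool` decorator, `dispatch`, and both handlers are hypothetical names, not taken from this repository, and the handlers return placeholder strings instead of calling Google Calendar or Tavily.

```python
from typing import Callable, Dict

# Hypothetical registry mapping a tool name to its handler function.
TOOLS: Dict[str, Callable[..., str]] = {}


def tool(name: str):
    """Register the decorated function under the given tool name."""
    def decorator(func: Callable[..., str]) -> Callable[..., str]:
        TOOLS[name] = func
        return func
    return decorator


@tool("calendar")
def schedule_event(title: str, when: str) -> str:
    # Placeholder: a real handler would call the Google Calendar API here.
    return f"Scheduled '{title}' for {when}"


@tool("web_search")
def web_search(query: str) -> str:
    # Placeholder: a real handler would call the Tavily API here.
    return f"Searching the web for: {query}"


def dispatch(name: str, **kwargs) -> str:
    """Run the named tool, or report that no such tool is registered."""
    handler = TOOLS.get(name)
    if handler is None:
        return f"Unknown tool: {name}"
    return handler(**kwargs)


if __name__ == "__main__":
    print(dispatch("calendar", title="Meeting with John", when="tomorrow at 2 PM"))
```

Keeping dispatch separate from the handlers means a new tool only needs one decorated function, whichever component (for example the LLM) ends up choosing the tool name.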

Technical Architecture

Prerequisites

  • Python 3.9+
  • Google API credentials (Calendar, Contacts, Gmail)
  • Tavily API key (web search)
  • Groq API key (LLM processing)
  • Google Gemini API key (alternative LLM)
  • Deepgram API key (voice processing)

Dependencies

All required packages are listed in requirements.txt and include:

  • Speech processing libraries
  • Google API clients
  • AI/ML frameworks
  • Web request handlers

Installation

1. Clone Repository

git clone https://github.com/bigdata5911/AI-Voice-Assistant.git
cd AI-Voice-Assistant

2. Environment Setup

python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt

3. Configuration

Create a .env file in the project root:

GOOGLE_API_KEY=your_google_api_key
DEEPGRAM_API_KEY=your_deepgram_api_key
TAVILY_API_KEY=your_tavily_api_key
GEMINI_API_KEY=your_gemini_api_key
GROQ_API_KEY=your_groq_api_key
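
Before starting the assistant, it can help to confirm that every key is actually set. The following is a minimal, standard-library-only sketch; `parse_env_text` and `missing_keys` are hypothetical helpers, not part of this project, and treating all five keys as required is an assumption (the Gemini key is described above as an alternative LLM and may be optional in practice).

```python
import os
from typing import Dict, List, Mapping

# Key names taken from the .env example above.
REQUIRED_KEYS = [
    "GOOGLE_API_KEY",
    "DEEPGRAM_API_KEY",
    "TAVILY_API_KEY",
    "GEMINI_API_KEY",
    "GROQ_API_KEY",
]


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(env: Mapping[str, str]) -> List[str]:
    """Return the required keys that are absent or empty in env."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]


if __name__ == "__main__":
    env = dict(os.environ)
    if os.path.exists(".env"):
        with open(".env", encoding="utf-8") as fh:
            env.update(parse_env_text(fh.read()))
    missing = missing_keys(env)
    if missing:
        print("Missing keys: " + ", ".join(missing))
    else:
        print("All keys set.")
```

This check does not validate the keys against the services; it only catches entries that were left out or left blank.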

4. Google API Setup

Configure Google API credentials for Calendar, Contacts, and Gmail services following Google's API documentation.

Usage

Starting the Assistant

python main.py

Once started, the assistant listens for voice input and responds to natural-language commands.
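
Conceptually, a voice session is a loop of listen, respond, and speak. The sketch below shows that loop with the speech and LLM parts injected as plain callables so it runs without any API keys. Everything here is an assumption for illustration: `run_conversation`, the `stop_word` parameter, and the stand-in functions are hypothetical and do not describe how `main.py` is actually written.

```python
from typing import Callable, Iterable


def run_conversation(
    utterances: Iterable[str],
    respond: Callable[[str], str],
    speak: Callable[[str], None],
    stop_word: str = "goodbye",
) -> int:
    """Handle transcribed utterances until the stop word; return turns answered."""
    turns = 0
    for text in utterances:
        text = text.strip()
        if not text:
            continue  # skip silence or empty transcripts
        if text.lower() == stop_word:
            speak("Goodbye!")
            break
        speak(respond(text))
        turns += 1
    return turns


if __name__ == "__main__":
    # Stand-ins: a list replaces microphone + speech-to-text,
    # a lambda replaces the LLM, and print replaces text-to-speech.
    fake_transcripts = ["Search for latest AI news", "", "goodbye"]
    run_conversation(fake_transcripts, lambda t: f"You said: {t}", print)
```

Passing the speech and response steps in as functions keeps the loop testable; in a real session they would be backed by the speech and LLM services listed under Prerequisites.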

Example Commands

  • Calendar: "Schedule a meeting with John tomorrow at 2 PM"
  • Contacts: "Add contact Jane Doe, phone 555-1234"
  • Email: "Send email to Bob with subject 'Project Update'"
  • Search: "Search for latest AI news"
  • Knowledge: "What was the chocolate chip cookie recipe I saved?"

Project Structure

AI-Voice-Assistant/
├── main.py                 # Application entry point
├── requirements.txt        # Python dependencies
├── scripts/               # Utility scripts
├── src/
│   ├── agents/           # AI agent implementation
│   ├── prompts/          # Conversation prompts
│   ├── speech_processing/ # Voice processing modules
│   └── tools/            # Productivity tool integrations
└── README.md

Development

This project is actively maintained and welcomes contributions. Please ensure all code follows the established patterns and includes appropriate documentation.

License

This project is open source and available under the MIT License.


Maintained by bigdata5911
