1 Oct 2023

Developing a Voice Assistant with Python and Speech Recognition

Voice assistants have become part of daily life: they set reminders, play music, answer questions, and control smart home devices. Commercial assistants such as Siri, Alexa, and Google Assistant dominate the market, but you can build your own personalized voice assistant using Python and a few speech libraries.

In this post, we'll walk through building a basic voice assistant using Python and speech recognition. By the end of the tutorial, you'll have a small assistant that listens for a handful of spoken commands and replies out loud.

Prerequisites

Before we dive into the development process, make sure you have the following prerequisites installed on your system:

  1. Python: Ensure you have Python 3.x installed on your computer. You can download it from the official Python website (https://www.python.org/downloads/).
  2. pip: pip is the package installer for Python. Most Python installations come with pip pre-installed. You can check if you have pip by running the command pip --version in your terminal or command prompt.
  3. Speech Recognition Library: We'll use the SpeechRecognition library to capture and interpret speech. You can install it using pip with the command: pip install SpeechRecognition.
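  4. PyAudio Library: To capture audio input from your microphone, we'll use the PyAudio library. Install it using pip with the command: pip install pyaudio. PyAudio is a binding for the PortAudio library, so if the install fails on Linux or macOS you may need to install PortAudio through your system package manager first.
  5. gTTS (Google Text-to-Speech): This library converts text responses to speech. It calls Google's online text-to-speech service, so it needs an internet connection. Install it using pip with the command: pip install gtts.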
  6. Playsound Library: To play the voice responses, we'll use the Playsound library. Install it using pip with the command: pip install playsound.
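
Once the packages above are installed, a quick check can confirm that Python can find them. The following is a minimal sketch, not part of the assistant itself; the helper name find_missing is made up for illustration. Note that it uses the import names, which differ from some of the pip package names (for example, SpeechRecognition is imported as speech_recognition).

```python
import importlib.util

# Import names (not pip package names) for the libraries listed above.
REQUIRED_MODULES = ["speech_recognition", "pyaudio", "gtts", "playsound"]


def find_missing(modules):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are installed.")
```

If the script reports a missing module, rerun the matching pip install command from the list above.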

Setting up the Project

Let's start by creating a new directory for our project. You can name it anything you like. Within the project directory, create a new Python script file named voice_assistant.py.

Importing Required Libraries

Open voice_assistant.py in your favorite code editor and import the necessary libraries:

import speech_recognition as sr
from gtts import gTTS
import playsound
import os

Initializing the Speech Recognizer

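Next, we'll define a function that captures audio input from the user. Inside it, we create a Recognizer object from the SpeechRecognition library and store it in a variable named recognizer. The recognize_google method sends the recorded audio to Google's Web Speech API, so it needs an internet connection. The function returns an empty string when the speech can't be understood or the service can't be reached.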

def listen_to_user():
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        print("Listening...")
        recognizer.pause_threshold = 1
        audio = recognizer.listen(source)

    try:
        print("Recognizing...")
        user_input = recognizer.recognize_google(audio).lower()
        print(f"User: {user_input}")
        return user_input
    except sr.UnknownValueError:
        print("Sorry, I didn't understand that.")
        return ""
    except sr.RequestError:
        print("Sorry, there was an issue connecting to the speech recognition service.")
        return ""
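
In a noisy room, or when nobody speaks, the plain listen call can pick up background noise or wait indefinitely. One possible refinement is sketched below, assuming the recognizer argument is an sr.Recognizer and source is an open sr.Microphone. The helper name capture_audio and the default timings are illustrative choices, not values from this tutorial.

```python
def capture_audio(recognizer, source, timeout=5, phrase_time_limit=10):
    """Calibrate for background noise, then record one phrase.

    `recognizer` is expected to be an sr.Recognizer and `source` an open
    sr.Microphone; the default timings are illustrative guesses.
    """
    # Sample the background noise briefly so the energy threshold
    # suits the current room.
    recognizer.adjust_for_ambient_noise(source, duration=0.5)
    # timeout: seconds to wait for speech to start.
    # phrase_time_limit: maximum seconds to record once speech has started.
    return recognizer.listen(
        source, timeout=timeout, phrase_time_limit=phrase_time_limit
    )
```

Inside listen_to_user, the line that calls recognizer.listen could become audio = capture_audio(recognizer, source). When a timeout is set and no speech starts in time, listen raises sr.WaitTimeoutError, so that exception would need its own except branch.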

Adding Basic Commands

Now, let's implement some basic commands for our voice assistant. We'll start by defining a function that processes user input and generates the appropriate response.

def process_user_input(user_input):
    if "hello" in user_input:
        speak("Hello! How can I assist you today?")
    elif "how are you" in user_input:
        speak("I'm just a program, but thank you for asking.")
    elif "what's your name" in user_input:
        speak("I am your Personal Assistant. You can call me AI.")
    elif "thank you" in user_input:
        speak("You're welcome!")
    elif "goodbye" in user_input or "bye" in user_input:
        speak("Goodbye! Have a great day.")
        raise SystemExit  # end the program; unlike exit(), this does not rely on the site module
    else:
        speak("I'm sorry, I didn't catch that. Can you please repeat?")
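
As the list of commands grows, a long if/elif chain gets hard to maintain. One alternative is a lookup table that maps trigger phrases to replies. The sketch below is one way to do that, not the tutorial's own method. The names COMMANDS, FALLBACK, and find_response are made up for illustration, and the extra "what is your name" phrase is an assumption about how the recognizer might transcribe the contraction.

```python
# Each entry: (trigger phrases, reply, whether the assistant should stop).
# Order matters: the first entry with a matching phrase wins.
COMMANDS = [
    (("hello",), "Hello! How can I assist you today?", False),
    (("how are you",), "I'm just a program, but thank you for asking.", False),
    (
        ("what's your name", "what is your name"),  # second phrase is an assumption
        "I am your Personal Assistant. You can call me AI.",
        False,
    ),
    (("thank you",), "You're welcome!", False),
    (("goodbye", "bye"), "Goodbye! Have a great day.", True),
]
FALLBACK = "I'm sorry, I didn't catch that. Can you please repeat?"


def find_response(user_input):
    """Return (reply, should_exit) for the first command found in user_input."""
    text = user_input.lower()
    for phrases, reply, should_exit in COMMANDS:
        if any(phrase in text for phrase in phrases):
            return reply, should_exit
    return FALLBACK, False
```

Because find_response only returns data, it can be tested without a microphone or speakers. The caller decides what to do with the reply, for example passing it to speak and stopping when should_exit is true.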

Implementing Text-to-Speech

To enable our voice assistant to speak, we'll define a function `speak` that takes the response text as input and converts it to speech using the gTTS library. We'll also use the Playsound library to play the generated speech.

def speak(response_text):
    print(f"AI: {response_text}")
    tts = gTTS(text=response_text, lang='en')
    tts.save("response.mp3")
    playsound.playsound("response.mp3", True)
    os.remove("response.mp3")
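
The speak function writes to a fixed file name in the current directory, and the file is left behind if playback fails. A possible variation, sketched here under stated assumptions, uses a temporary file and always cleans it up. The helper speak_via_temp_file and its synthesize and play parameters are hypothetical names: synthesize(text, path) is assumed to write the audio file, and play(path) is assumed to block until playback finishes.

```python
import os
import tempfile


def speak_via_temp_file(text, synthesize, play):
    """Write speech for `text` to a temporary MP3, play it, then delete it.

    `synthesize(text, path)` and `play(path)` are caller-supplied callables
    (hypothetical names used for this sketch). Returns the temporary path.
    """
    # mkstemp creates a uniquely named file and returns an open descriptor.
    fd, path = tempfile.mkstemp(suffix=".mp3")
    os.close(fd)  # close our handle so other code can write to the path
    try:
        synthesize(text, path)
        play(path)
    finally:
        # Runs whether or not synthesis or playback raised an error.
        os.remove(path)
    return path
```

With the libraries from this tutorial, the two callables could be lambda t, p: gTTS(text=t, lang='en').save(p) and playsound.playsound.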

Putting It All Together

Now, let's put all the pieces together in the main part of our script. We'll create a loop that continuously listens to the user's input, processes it, and generates a response.

if __name__ == "__main__":
    speak("Hello! I am your Personal Assistant. How can I assist you today?")
    while True:
        user_input = listen_to_user()
        if user_input:
            process_user_input(user_input)
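
The loop above can also be wrapped in a function that takes the listening and handling steps as arguments, which makes it possible to try the assistant with typed input instead of a microphone. This is a minimal sketch. The name run_assistant, the max_attempts parameter, and the convention that a handler returns False to stop are all assumptions made for illustration.

```python
def run_assistant(listen, handle, max_attempts=None):
    """Repeat listen -> handle and return how many inputs were handled.

    Stops when `handle` returns False, when `max_attempts` listen calls
    have been made, or when the user presses Ctrl+C.
    """
    attempts = 0
    handled = 0
    try:
        while max_attempts is None or attempts < max_attempts:
            attempts += 1
            user_input = listen()
            if not user_input:
                continue  # nothing was recognized; listen again
            handled += 1
            if handle(user_input) is False:
                break
    except KeyboardInterrupt:
        pass  # Ctrl+C ends the loop quietly
    return handled
```

Calling run_assistant(listen_to_user, process_user_input) behaves like the loop above, since a handler that returns None keeps the loop going. Passing input as the listen argument lets you type commands while testing.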

Testing the Voice Assistant

Save the changes to voice_assistant.py and open your terminal or command prompt. Navigate to the project directory and run the script with the command: python voice_assistant.py.

Now, your voice assistant should be up and running! Start by saying "Hello" to trigger the greeting message, and then try other commands like "How are you?", "What's your name?", "Thank you", and "Goodbye" to see how your personal AI responds.

Conclusion

Congratulations! You've successfully developed a basic voice assistant using Python and speech recognition libraries. You can further enhance your assistant by adding more commands, integrating it with APIs, and exploring natural language processing techniques.
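
As one example of an extra command, the assistant could answer a request for the current time. The helper below is a small sketch. The name time_reply and the wording of the reply are illustrative choices.

```python
from datetime import datetime


def time_reply(now=None):
    """Build a reply that states the time in 24-hour HH:MM format."""
    # Accept an explicit datetime so the function is easy to test;
    # fall back to the current local time otherwise.
    now = now or datetime.now()
    return f"It is {now.strftime('%H:%M')}."
```

To wire it in, process_user_input could gain a branch such as elif "time" in user_input: speak(time_reply()).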