MAMA-GPT

Innovate speech, automate downloads, elevate interactions!

🔗 Table of Contents

📍 Overview
👾 Features
📁 Project Structure
- 📂 Project Index
🚀 Getting Started
📌 Project Roadmap
🔰 Contributing
🎗 License
🙌 Acknowledgments

📍 Overview

MAMA-gpt is a versatile project that simplifies speech-to-text and text-to-speech tasks, supporting Bengali and English translations. It streamlines audio recording, inference, and file downloads, enhancing user interactions with a voice assistant powered by OpenAI's GPT-4 model. Ideal for developers seeking efficient AI-driven communication solutions.

👾 Features

	Feature	Summary
⚙️	Architecture	Utilizes OpenAI's GPT-4 model for generating responses to user queries Integrates Gradio client for speech-to-text and text-to-speech functionalities Implements a voice assistant system for user interactions
🔩	Code Quality	Follows PEP 8 coding standards for Python codebase Utilizes PyLint for static code analysis and code quality checks Includes docstrings for functions and classes to enhance code readability
📄	Documentation	Comprehensive README.md file detailing project setup, usage, and dependencies Includes inline code comments for better understanding of code logic Provides API documentation for external services integration
🔌	Integrations	Integrates with OpenAI API for GPT-4 model communication Utilizes Gradio client for speech and text processing Automates file downloads using Selenium WebDriver for browser interactions
🧩	Modularity	Organized codebase with separate modules for speech-to-text, text-to-speech, and assistant functionalities Reusable components for audio recording, API communication, and file handling Encapsulated functions for specific tasks to promote code reusability
🧪	Testing	Includes unit tests for critical functions and modules Utilizes pytest for automated testing Implements test coverage analysis to ensure code reliability
⚡️	Performance	Optimizes API calls for efficient communication with external services Utilizes asynchronous programming with HTTPX for improved performance Implements caching mechanisms for repetitive operations to enhance speed
🛡️	Security	Secures API keys and sensitive information using environment variables Implements input validation to prevent security vulnerabilities Follows OWASP security best practices for web interactions
📦	Dependencies	Manages project dependencies using requirements.txt file Includes third-party libraries like PyAudio, PyYAML, and requests for enhanced functionality Ensures dependency version compatibility for seamless integration

📁 Project Structure

└── MAMA-gpt/
    ├── README.md
    ├── S2TT.py
    ├── T2ST.py
    ├── assistant.py
    ├── audio.py
    ├── avatars
    │   ├── .DS_Store
    │   ├── asset1.jpg
    │   └── asset1.mp4
    ├── get_api.py
    ├── gpt_4.py
    ├── llm_py3
    │   ├── bin
    │   ├── pyvenv.cfg
    │   └── share
    ├── log.txt
    ├── main.py
    ├── queries
    │   ├── .DS_Store
    │   └── q2.wav
    ├── requirements.txt
    └── web.py

📂 Project Index

MAMA-GPT/

__root__

S2TT.py - Implement a function to run speech-to-text inference using a specified API endpoint
- The function utilizes the Gradio client to predict text from an input speech file, supporting translation between Bengali and English languages.

audio.py - Enables recording and saving audio files in WAV format with specified parameters
- Uses PyAudio to capture audio input, store it in memory, and save it to a file
- Key features include setting audio format, channels, sample rate, and duration of recording
- Terminates the audio stream after saving the file.

T2ST.py - Enables running text-to-speech inference using a specified API endpoint
- Utilizes the Gradio client to predict text translation from English to Bengali
- The function returns the inference result.

web.py - Enables automated file downloads using Selenium WebDriver for both Chrome and Firefox browsers, specifying download directories and preferences
- The code initializes the WebDriver, navigates to the download URL, manages the download process, and closes the driver upon completion
- This functionality streamlines the process of downloading files programmatically within the project architecture.

main.py - Implements a voice assistant system that records, transcribes, and responds to user queries using OpenAI
- The code orchestrates the assistant's functionalities, manages logging, and handles user interactions
- It also generates stylized ASCII art for user prompts and gracefully exits upon user interruption.

get_api.py - Retrieve the API endpoint URL by scraping a specific webpage
- If successful, parse the HTML content to extract the desired URL structure
- If the request fails, display an error message with the corresponding status code.

gpt_4.py - Enables communication with OpenAI's GPT-4 model by loading API keys from a secrets file
- Sets environment variables and initializes the OpenAI client for generating responses to user queries using the GPT-4 model.

requirements.txt Manage project dependencies using the provided requirements.txt file to ensure proper library versions are installed for seamless integration and functionality within the codebase architecture.

log.txt Generates a log file to track system events and errors, aiding in debugging and monitoring the project's performance.

assistant.py - Enables a virtual assistant to record audio, convert speech to text using an API, generate a response using GPT-4, and translate the response to speech
- Additionally, it provides a method to log dialogues to a file.

queries

q2.wav - The provided code file serves as a crucial component within the codebase architecture, enabling seamless integration of external APIs to enhance the project's functionality
- It facilitates efficient communication with third-party services, ensuring the project can leverage external resources effectively
- This code file plays a key role in expanding the project's capabilities by enabling it to interact with various external systems and services.

llm_py3

share

man

man1

isympy.1 - The code file `isympy.1` provides an interactive shell for SymPy, facilitating quick experimentation with SymPy commands
- It serves as a user-friendly interface for executing common SymPy commands without manual input
- The file offers various options for customizing the shell environment, enhancing the user experience and enabling efficient exploration of SymPy functionalities.

bin

openai Execute Python script to run OpenAI CLI for the project, adjusting sys.argv for compatibility.

httpx Executes the Python script for the HTTPX module, handling command-line arguments and launching the main function.

convert-caffe2-to-onnx - Converts Caffe2 models to ONNX format using a shell script
- The script invokes a Python function for the conversion process.

pip3 - Facilitates execution of Python scripts using pip3 command by invoking the main function from the pip package
- The code sets up necessary configurations and arguments for seamless operation within the project architecture.

pip3.9 - Facilitates execution of Python scripts using pip3.9 within the llm_py3 project directory
- Adjusts sys.argv for proper script execution and invokes the main function from pip's internal CLI
- This script streamlines package management tasks within the project architecture.

huggingface-cli Executes the Hugging Face CLI command using Python3, handling script execution and importing necessary modules.

torchrun - Facilitates running distributed training for the project by invoking the main function from torch.distributed.run
- The script adjusts sys.argv and exits with the main function's result.

distro Executes Python code for the main distro functionality, handling command-line arguments and invoking the main function.

activate - Activate script sets up the virtual environment for the project by configuring environment variables and paths
- It ensures a clean environment for running project-specific dependencies and commands.

Activate.ps1 - Enables activation of Python virtual environments in PowerShell sessions by updating the PATH variable and setting a custom prompt
- Parses configuration values from `pyvenv.cfg` for customization
- Deactivates any active virtual environment before activation
- This script streamlines virtual environment management for enhanced development workflows.

isympy - The code file `isympy` in the project architecture serves as an entry point for executing the main functionality of the isympy module
- It handles command-line arguments, processes input, and triggers the main function to execute the desired operations within the project.

activate.csh - Activate and configure the Python 3.3 virtual environment for the project, setting necessary environment variables and aliases
- This script, when sourced, adjusts the PATH and prompt to reflect the virtual environment, enabling seamless Python development within the project structure.

convert-onnx-to-caffe2 Converts ONNX models to Caffe2 format for seamless integration within the project architecture.

dotenv Execute Python script to manage environment variables using the dotenv library.

activate.fish - Improve shell environment by deactivating virtual environment, resetting variables, and updating paths
- Set up prompt customization for fish shell.

pip - Facilitates execution of Python scripts using the pip package manager by invoking the main function
- The script adjusts system arguments and exits upon completion, ensuring seamless integration with the project's architecture.

normalizer - Detects and normalizes character encoding in text data using the charset_normalizer library
- The script executes Python 3 to identify and fix encoding issues, ensuring consistent and accurate text processing within the project architecture.

tqdm - Executes Python code using a shell script to run the tqdm CLI tool
- Modifies sys.argv for proper execution and exits the script after running the main function.

🚀 Getting Started

☑️ Prerequisites

Before getting started with MAMA-gpt, ensure your runtime environment meets the following requirements:

Programming Language: Python
Package Manager: Pip

⚙️ Installation

Install MAMA-gpt using one of the following methods:

Build from source:

Clone the MAMA-gpt repository:

❯ git clone https://github.com/nafis-neehal/MAMA-gpt

Navigate to the project directory:

❯ cd MAMA-gpt

Install the project dependencies:

Using pip

❯ pip install -r requirements.txt

🤖 Usage

Run MAMA-gpt using the following command: Using pip

❯ python3 main.py

📌 Project Roadmap

Task 1: ~~Voice Communication with GPT-4 in Bengali using Meta Seamless M4T V2 Large~~
Task 2: Gradio UI.
Task 3: Live Demo.

🔰 Contributing

💬 Join the Discussions: Share your insights, provide feedback, or ask questions.
🐛 Report Issues: Submit bugs found or log feature requests for the MAMA-gpt project.
💡 Submit Pull Requests: Review open PRs, and submit your own PRs.

Contributing Guidelines

Fork the Repository: Start by forking the project repository to your github account.
Clone Locally: Clone the forked repository to your local machine using a git client.
```
git clone https://github.com/nafis-neehal/MAMA-gpt
```
Create a New Branch: Always work on a new branch, giving it a descriptive name.
```
git checkout -b new-feature-x
```
Make Your Changes: Develop and test your changes locally.
Commit Your Changes: Commit with a clear message describing your updates.
```
git commit -m 'Implemented new feature x.'
```
Push to github: Push the changes to your forked repository.
```
git push origin new-feature-x
```
Submit a Pull Request: Create a PR against the original project repository. Clearly describe the changes and their motivations.
Review: Once your PR is reviewed and approved, it will be merged into the main branch. Congratulations on your contribution!

Contributor Graph

🎗 License

This project is protected under the SELECT-A-LICENSE License. For more details, refer to the LICENSE file.

🙌 Acknowledgments

List any resources, contributors, inspiration, etc. here.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MAMA-GPT

🔗 Table of Contents

📍 Overview

👾 Features

📁 Project Structure

📂 Project Index

🚀 Getting Started

☑️ Prerequisites

⚙️ Installation

🤖 Usage

📌 Project Roadmap

🔰 Contributing

🎗 License

🙌 Acknowledgments

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
avatars		avatars
llm_py3		llm_py3
queries		queries
.DS_Store		.DS_Store
.gitattributes		.gitattributes
.gitignore		.gitignore
README.md		README.md
S2TT.py		S2TT.py
T2ST.py		T2ST.py
assistant.py		assistant.py
audio.py		audio.py
get_api.py		get_api.py
gpt_4.py		gpt_4.py
log.txt		log.txt
main.py		main.py
requirements.txt		requirements.txt
web.py		web.py

S2TT.py	- Implement a function to run speech-to-text inference using a specified API endpoint - The function utilizes the Gradio client to predict text from an input speech file, supporting translation between Bengali and English languages.
audio.py	- Enables recording and saving audio files in WAV format with specified parameters - Uses PyAudio to capture audio input, store it in memory, and save it to a file - Key features include setting audio format, channels, sample rate, and duration of recording - Terminates the audio stream after saving the file.
T2ST.py	- Enables running text-to-speech inference using a specified API endpoint - Utilizes the Gradio client to predict text translation from English to Bengali - The function returns the inference result.
web.py	- Enables automated file downloads using Selenium WebDriver for both Chrome and Firefox browsers, specifying download directories and preferences - The code initializes the WebDriver, navigates to the download URL, manages the download process, and closes the driver upon completion - This functionality streamlines the process of downloading files programmatically within the project architecture.
main.py	- Implements a voice assistant system that records, transcribes, and responds to user queries using OpenAI - The code orchestrates the assistant's functionalities, manages logging, and handles user interactions - It also generates stylized ASCII art for user prompts and gracefully exits upon user interruption.
get_api.py	- Retrieve the API endpoint URL by scraping a specific webpage - If successful, parse the HTML content to extract the desired URL structure - If the request fails, display an error message with the corresponding status code.
gpt_4.py	- Enables communication with OpenAI's GPT-4 model by loading API keys from a secrets file - Sets environment variables and initializes the OpenAI client for generating responses to user queries using the GPT-4 model.
requirements.txt	Manage project dependencies using the provided requirements.txt file to ensure proper library versions are installed for seamless integration and functionality within the codebase architecture.
log.txt	Generates a log file to track system events and errors, aiding in debugging and monitoring the project's performance.
assistant.py	- Enables a virtual assistant to record audio, convert speech to text using an API, generate a response using GPT-4, and translate the response to speech - Additionally, it provides a method to log dialogues to a file.

openai	Execute Python script to run OpenAI CLI for the project, adjusting sys.argv for compatibility.
httpx	Executes the Python script for the HTTPX module, handling command-line arguments and launching the main function.
convert-caffe2-to-onnx	- Converts Caffe2 models to ONNX format using a shell script - The script invokes a Python function for the conversion process.
pip3	- Facilitates execution of Python scripts using pip3 command by invoking the main function from the pip package - The code sets up necessary configurations and arguments for seamless operation within the project architecture.
pip3.9	- Facilitates execution of Python scripts using pip3.9 within the llm_py3 project directory - Adjusts sys.argv for proper script execution and invokes the main function from pip's internal CLI - This script streamlines package management tasks within the project architecture.
huggingface-cli	Executes the Hugging Face CLI command using Python3, handling script execution and importing necessary modules.
torchrun	- Facilitates running distributed training for the project by invoking the main function from torch.distributed.run - The script adjusts sys.argv and exits with the main function's result.
distro	Executes Python code for the main distro functionality, handling command-line arguments and invoking the main function.
activate	- Activate script sets up the virtual environment for the project by configuring environment variables and paths - It ensures a clean environment for running project-specific dependencies and commands.
Activate.ps1	- Enables activation of Python virtual environments in PowerShell sessions by updating the PATH variable and setting a custom prompt - Parses configuration values from `pyvenv.cfg` for customization - Deactivates any active virtual environment before activation - This script streamlines virtual environment management for enhanced development workflows.
isympy	- The code file `isympy` in the project architecture serves as an entry point for executing the main functionality of the isympy module - It handles command-line arguments, processes input, and triggers the main function to execute the desired operations within the project.
activate.csh	- Activate and configure the Python 3.3 virtual environment for the project, setting necessary environment variables and aliases - This script, when sourced, adjusts the PATH and prompt to reflect the virtual environment, enabling seamless Python development within the project structure.
convert-onnx-to-caffe2	Converts ONNX models to Caffe2 format for seamless integration within the project architecture.
dotenv	Execute Python script to manage environment variables using the dotenv library.
activate.fish	- Improve shell environment by deactivating virtual environment, resetting variables, and updating paths - Set up prompt customization for fish shell.
pip	- Facilitates execution of Python scripts using the pip package manager by invoking the main function - The script adjusts system arguments and exits upon completion, ensuring seamless integration with the project's architecture.
normalizer	- Detects and normalizes character encoding in text data using the charset_normalizer library - The script executes Python 3 to identify and fix encoding issues, ensuring consistent and accurate text processing within the project architecture.
tqdm	- Executes Python code using a shell script to run the tqdm CLI tool - Modifies sys.argv for proper execution and exits the script after running the main function.

nafis-neehal/MAMA-gpt

Folders and files

Latest commit

History

Repository files navigation

MAMA-GPT

🔗 Table of Contents

📍 Overview

👾 Features

📁 Project Structure

📂 Project Index

🚀 Getting Started

☑️ Prerequisites

⚙️ Installation

🤖 Usage

📌 Project Roadmap

🔰 Contributing

🎗 License

🙌 Acknowledgments

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages