Innovate speech, automate downloads, elevate interactions!
- Overview
- Features
- Project Structure
- Getting Started
- Project Roadmap
- Contributing
- License
- Acknowledgments
MAMA-gpt is a versatile project that simplifies speech-to-text and text-to-speech tasks, supporting Bengali and English translations. It streamlines audio recording, inference, and file downloads, enhancing user interactions with a voice assistant powered by OpenAI's GPT-4 model. Ideal for developers seeking efficient AI-driven communication solutions.
Feature | Summary
---|---
Architecture |
Code Quality |
Documentation |
Integrations |
Modularity |
Testing |
Performance |
Security |
Dependencies |
└── MAMA-gpt/
    ├── README.md
    ├── S2TT.py
    ├── T2ST.py
    ├── assistant.py
    ├── audio.py
    ├── avatars
    │   ├── .DS_Store
    │   ├── asset1.jpg
    │   └── asset1.mp4
    ├── get_api.py
    ├── gpt_4.py
    ├── llm_py3
    │   ├── bin
    │   ├── pyvenv.cfg
    │   └── share
    ├── log.txt
    ├── main.py
    ├── queries
    │   ├── .DS_Store
    │   └── q2.wav
    ├── requirements.txt
    └── web.py
MAMA-GPT/
__root__
- S2TT.py: Implements a function that runs speech-to-text inference against a specified API endpoint. It uses the Gradio client to predict text from an input speech file and supports translation between Bengali and English.
- audio.py: Records audio and saves it in WAV format with specified parameters. It uses PyAudio to capture input, buffers it in memory, and writes it to a file. Configurable settings include audio format, channels, sample rate, and recording duration. The audio stream is terminated after the file is saved.
- T2ST.py: Runs text-to-speech inference against a specified API endpoint. It uses the Gradio client to translate text from English to Bengali and returns the inference result.
- web.py: Automates file downloads with Selenium WebDriver for Chrome and Firefox, with configurable download directories and preferences. The code initializes the WebDriver, navigates to the download URL, manages the download, and closes the driver on completion.
- main.py: Implements the voice assistant loop that records, transcribes, and responds to user queries using OpenAI. It orchestrates the assistant's functions, manages logging, prints stylized ASCII art for user prompts, and exits gracefully on user interruption.
- get_api.py: Retrieves the API endpoint URL by scraping a specific webpage. On success, it parses the HTML to extract the URL. If the request fails, it displays an error message with the status code.
- gpt_4.py: Communicates with OpenAI's GPT-4 model. It loads API keys from a secrets file, sets environment variables, and initializes the OpenAI client to generate responses to user queries.
- requirements.txt: Lists the project dependencies so that the required library versions can be installed.
- log.txt: Log file that tracks system events and errors for debugging and monitoring.
- assistant.py: Defines a virtual assistant that records audio, converts speech to text through an API, generates a response with GPT-4, and converts the response to speech. It also provides a method to log dialogues to a file.
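The flow described for assistant.py (record, transcribe, respond, synthesize, log) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class name `AssistantPipeline`, its field names, and the log line format are all hypothetical, and each stage is passed in as a plain callable so the sketch runs without PyAudio, Gradio, or an OpenAI key.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable


@dataclass
class AssistantPipeline:
    """Hypothetical stand-in for the assistant; every name here is an assumption."""

    record: Callable[[], str]             # returns the path of a recorded WAV file
    speech_to_text: Callable[[str], str]  # WAV path -> transcribed text
    respond: Callable[[str], str]         # user text -> model reply
    text_to_speech: Callable[[str], str]  # reply text -> path of synthesized audio
    log_path: Path = Path("log.txt")

    def log_dialogue(self, user_text: str, reply: str) -> None:
        # Append one exchange per turn so earlier dialogue is preserved.
        with self.log_path.open("a", encoding="utf-8") as fh:
            fh.write(f"User: {user_text}\nAssistant: {reply}\n")

    def run_once(self) -> str:
        # One full turn: record -> transcribe -> respond -> synthesize -> log.
        wav_path = self.record()
        user_text = self.speech_to_text(wav_path)
        reply = self.respond(user_text)
        audio_out = self.text_to_speech(reply)
        self.log_dialogue(user_text, reply)
        return audio_out
```

Passing the stages in as callables keeps the orchestration testable on its own; in the real project, those slots would presumably be filled by the functions in audio.py, S2TT.py, gpt_4.py, and T2ST.py.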
queries
- q2.wav: A WAV audio recording, presumably a sample user query captured by the assistant.
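Files in the queries directory are WAV recordings of the kind audio.py is described as producing. As a rough sketch of the save step, the example below writes and inspects a WAV file with Python's standard `wave` module. It substitutes a synthesized sine tone for PyAudio microphone frames, and the parameter values (16-bit samples, mono, 16 kHz) and the function names are assumptions rather than the project's settings.

```python
import math
import struct
import wave


def write_tone_wav(path, seconds=1.0, sample_rate=16000, channels=1, freq_hz=440.0):
    """Write a sine tone as 16-bit PCM WAV (stands in for recorded frames)."""
    n_frames = int(seconds * sample_rate)
    frames = bytearray()
    for i in range(n_frames):
        sample = int(0.3 * 32767 * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        # Repeat the same sample for every channel of this frame.
        frames += struct.pack("<h", sample) * channels
    with wave.open(path, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wf.setframerate(sample_rate)
        wf.writeframes(bytes(frames))


def describe_wav(path):
    """Read back the header fields that the recording parameters control."""
    with wave.open(path, "rb") as wf:
        return {
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "sample_rate": wf.getframerate(),
            "duration_s": wf.getnframes() / wf.getframerate(),
        }
```

Calling `describe_wav` on an existing recording such as `queries/q2.wav` is a quick way to check which sample rate and channel count a file actually uses before sending it to a speech-to-text endpoint.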
llm_py3/share/man/man1
- isympy.1: Manual page for isympy, the interactive shell for SymPy. It documents the options available for customizing the shell environment.
llm_py3/bin
- openai: Runs the OpenAI CLI, adjusting sys.argv for compatibility.
- httpx: Runs the HTTPX command-line client, handling command-line arguments and invoking the main function.
- convert-caffe2-to-onnx: Converts Caffe2 models to ONNX format by invoking a Python conversion function.
- pip3: Entry-point script for the pip package manager. It invokes the main function from the pip package.
- pip3.9: Entry-point script for pip under Python 3.9. It adjusts sys.argv and invokes the main function from pip's internal CLI.
- huggingface-cli: Runs the Hugging Face CLI with Python 3.
- torchrun: Launches distributed training by invoking the main function from torch.distributed.run. The script adjusts sys.argv and exits with the main function's result.
- distro: Runs the distro command-line interface, handling command-line arguments and invoking the main function.
- activate: Sets up the virtual environment by configuring environment variables and paths.
- Activate.ps1: Activates the Python virtual environment in PowerShell sessions by updating the PATH variable and setting a custom prompt. It parses configuration values from `pyvenv.cfg` and deactivates any active virtual environment before activation.
- isympy: Entry point for the isympy module. It handles command-line arguments and runs the main function.
- activate.csh: When sourced, activates the Python virtual environment for csh by setting environment variables and aliases and adjusting the PATH and prompt.
- convert-onnx-to-caffe2: Converts ONNX models to Caffe2 format.
- dotenv: Runs the dotenv CLI to manage environment variables.
- activate.fish: Activates the virtual environment for the fish shell. It deactivates any existing environment, resets variables, updates paths, and customizes the prompt.
- pip: Entry-point script for the pip package manager. It adjusts sys.argv, invokes the main function, and exits on completion.
- normalizer: Runs the charset_normalizer CLI to detect and normalize character encodings in text data.
- tqdm: Runs the tqdm CLI. It modifies sys.argv and exits after running the main function.
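The scraping step described for get_api.py in the module index (fetch a page, extract a URL from the HTML, report the status code on failure) might look roughly like the sketch below. It uses only the standard library, which is a swapped-in choice and may differ from what the project uses. The function names, the `marker` parameter, and the sample URL in the usage note are hypothetical.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_endpoint(html, marker):
    """Return the first link containing `marker`, or None if there is none."""
    parser = LinkCollector()
    parser.feed(html)
    return next((link for link in parser.links if marker in link), None)


def fetch_endpoint(page_url, marker):
    """Download a page and extract the endpoint link (hypothetical helper)."""
    try:
        with urlopen(page_url, timeout=10) as resp:
            html = resp.read().decode("utf-8", errors="replace")
    except HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses; other network
        # errors (URLError, timeouts) are deliberately not handled here.
        print(f"Request failed with status code {err.code}")
        return None
    return extract_endpoint(html, marker)
```

For example, `extract_endpoint('<a href="https://example.gradio.live/">Demo</a>', "gradio.live")` picks out the link containing the marker. Keeping the parsing separate from the network call makes the extraction logic easy to check against saved HTML.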
Before getting started with MAMA-gpt, ensure your runtime environment meets the following requirements:
- Programming Language: Python
- Package Manager: pip
Install MAMA-gpt using one of the following methods:
Build from source:
- Clone the MAMA-gpt repository:
❯ git clone https://github.com/nafis-neehal/MAMA-gpt
- Navigate to the project directory:
❯ cd MAMA-gpt
- Install the project dependencies:
❯ pip install -r requirements.txt
Run MAMA-gpt using the following command:
❯ python3 main.py
- Task 1: Voice communication with GPT-4 in Bengali using Meta Seamless M4T V2 Large.
- Task 2: Gradio UI.
- Task 3: Live demo.
- Join the Discussions: Share your insights, provide feedback, or ask questions.
- Report Issues: Submit bugs found or log feature requests for the MAMA-gpt project.
- Submit Pull Requests: Review open PRs, and submit your own PRs.
Contributing Guidelines
- Fork the Repository: Start by forking the project repository to your GitHub account.
- Clone Locally: Clone the forked repository to your local machine using a git client.
git clone https://github.com/nafis-neehal/MAMA-gpt
- Create a New Branch: Always work on a new branch, giving it a descriptive name.
git checkout -b new-feature-x
- Make Your Changes: Develop and test your changes locally.
- Commit Your Changes: Commit with a clear message describing your updates.
git commit -m 'Implemented new feature x.'
- Push to GitHub: Push the changes to your forked repository.
git push origin new-feature-x
- Submit a Pull Request: Create a PR against the original project repository. Clearly describe the changes and their motivations.
- Review: Once your PR is reviewed and approved, it will be merged into the main branch. Congratulations on your contribution!
This project is protected under the SELECT-A-LICENSE License. For more details, refer to the LICENSE file.
- List any resources, contributors, inspiration, etc. here.