Multimodal AI Assistant is a free AI chatbot crafted to facilitate local interactions with various LLMs, documents, and a range of advanced functionalities. It leverages the Chainlit framework, along with Agents and Chains, to enhance the user experience.
Harnessing models via Langchain and ChatOllama, hosted locally with Ollama, offers numerous benefits in terms of customization. This setup allows for tailored model configurations that align with specific project requirements, enabling developers to fine-tune performance and functionality.
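As an illustration of this setup, a minimal sketch of wiring a locally hosted Ollama model into Langchain via ChatOllama might look like the following (the model name and temperature are example values, not the project's actual configuration, and a running `ollama serve` instance is assumed):

```python
# Sketch: connect Langchain to a model served locally by Ollama.
# Assumes Ollama is running and the model has been pulled
# (e.g. `ollama pull llama3.1`). Requires the `langchain-ollama` package.
from langchain_ollama import ChatOllama

llm = ChatOllama(
    model="llama3.1",    # any model previously pulled with `ollama pull`
    temperature=0.2,     # example value, tune per use case
)

# Invoke the chat model with a single user message.
response = llm.invoke("Summarize what Retrieval Augmented Generation is.")
print(response.content)
```

Because the model runs entirely on your machine, swapping models or adjusting parameters requires no external API changes.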
Literal AI provides seamless integration for data storage and observability, ensuring efficient management and monitoring of your AI model's performance. With built-in support for handling large datasets and real-time analytics, Literal AI lets you track, store, and visualize key metrics, offering deep insight into your models' behavior and outcomes. This robust storage system simplifies debugging and optimizes performance, making it a crucial tool for developing scalable AI applications.
- **Chat with Documents 📄**: Interact with various document formats, including PDF, TXT, .PY, DOCX, image, and CSV files, using a Retrieval Augmented Generation (RAG) approach.
- **Speech to Text and Text to Speech 🗣️**: Use multimodal speech-to-text and text-to-speech capabilities to interact with the assistant via both text and audio.
- **Image Generation 🖼️**: Generate images locally using Stable Diffusion with the `CompVis/stable-diffusion-v1-4` model, downloaded via the `diffusers` library.
- **Wikipedia Search 🔍**: Search Wikipedia by entering keywords after the first word of the command, which must be `Wikipedia` or `wikipedia`, to trigger the functionality.
- **HTML Web Page Scraper 🌐**: Extract the HTML code of a web page using BeautifulSoup.
- **DuckDuckGo Web Search 🔎**: Run searches directly on the DuckDuckGo search engine, returning the first 10 results.
- **Resume Chats 💬🔄**: Effortlessly retrieve and continue previous chat sessions, ensuring a smooth conversation flow.
- **Literal AI Chat Storage ☁️👁️**: Integration with Literal AI enables efficient cloud storage of chat histories, allowing quick access to past interactions.
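At its core, the RAG flow behind "Chat with Documents" retrieves the document chunks most similar to the question and inserts them into the prompt. The project uses Langchain for this; the toy, dependency-free sketch below only illustrates the retrieve-then-prompt idea (function names, the bag-of-words "embedding", and the prompt template are all illustrative, not the project's code):

```python
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; real RAG uses dense vector embeddings.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, chunks: list[str], k: int = 1) -> list[str]:
    # Rank document chunks by similarity to the question, keep the top k.
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

def build_prompt(question: str, chunks: list[str]) -> str:
    # Stuff the retrieved context into the prompt sent to the LLM.
    context = "\n".join(retrieve(question, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

A real pipeline replaces `embed` with a proper embedding model and a vector store, but the overall shape (embed, retrieve, build prompt) stays the same.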
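The Wikipedia trigger described above amounts to simple command parsing: check the first word, treat the rest as keywords. A hedged sketch of that logic (the helper name and exact rules are illustrative, not taken from the project's code) could look like:

```python
def parse_wikipedia_command(message: str) -> "str | None":
    """Return the search keywords if the message's first word is
    'Wikipedia' or 'wikipedia', otherwise None.
    Illustrative helper, not the project's actual implementation."""
    parts = message.strip().split(maxsplit=1)
    if not parts or parts[0] not in ("Wikipedia", "wikipedia"):
        return None
    # Everything after the trigger word becomes the search query.
    return parts[1] if len(parts) > 1 else ""
```

For example, `wikipedia Alan Turing` would yield the query `Alan Turing`, while a message not starting with the trigger word falls through to the normal chat flow.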
To get started, follow these simple steps:
- **Clone the repository**

  Run the following commands to clone the repository and enter the project directory:

  ```shell
  git clone https://github.com/chrisputzu/multimodal-ai-assistant.git
  cd multimodal-ai-assistant
  ```
- **Install dependencies**

  Install the required packages by running:

  ```shell
  pip install -r requirements.txt
  ```
- **Download and Install Ollama**

  Ollama is required for running models locally. On Windows, Linux, or macOS, visit the Ollama Download Page and follow the instructions to install Ollama on your machine.
- **Download Local LLMs**

  After installing Ollama, download the models you want to use:

  ```shell
  ollama pull llama3.1        # Llama 3.1
  ollama pull llama3.2        # Llama 3.2
  ollama pull llava-llama3    # LLaVA
  ollama pull codellama:7b    # Code Llama 7B
  ollama pull mistral-nemo    # Mistral NeMo 12B
  ollama pull gemma2          # Gemma 2
  ollama pull qwen2.5:7b     # Qwen2.5 7B
  ollama pull phi3.5          # Phi-3.5
  ```
- **Create a Literal AI account**

  To create an account on Literal AI, visit the website and click Sign Up. After confirming your email, log in to the dashboard and create a new project by clicking Create Project. Finally, navigate to Settings, locate the API Keys section, and click Generate API Key to obtain your API key.
- **Configure your environment**

  Set up your Chainlit API key by running the following command in your terminal:

  ```shell
  chainlit create-secret
  ```

  Create a `.env` file in the root directory and add your previously created API keys:

  ```shell
  CHAINLIT_API_KEY=<YOUR_CHAINLIT_API_KEY>
  LITERAL_API_KEY=<YOUR_LITERAL_API_KEY>
  ```
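Chainlit and Literal AI read these keys from the environment at startup; the project presumably relies on Chainlit or `python-dotenv` to load them, but as a hedged illustration of what loading a `.env` file amounts to, a stdlib-only sketch looks like:

```python
import os

def load_env(path: str = ".env") -> None:
    """Minimal .env loader: parse KEY=VALUE lines into os.environ.
    Illustrative only; libraries such as python-dotenv handle quoting,
    comments, and edge cases properly."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            # Skip blanks, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # setdefault: values already set in the environment win.
            os.environ.setdefault(key.strip(), value.strip())
```

Existing environment variables take precedence over the file, which is the usual convention for `.env` loaders.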
- **Run the application**

  Start the assistant with:

  ```shell
  chainlit run app.py --port <YOUR_PORT>
  ```

  This will open the app in your browser at `localhost:<YOUR_PORT>`.
For inquiries, feel free to reach out via email:
Email: [email protected]