

Multimodal AI Assistant! 🚀🤖

Multimodal AI Assistant is a free AI chatbot crafted to facilitate local interactions with various LLMs, documents, and a range of advanced functionalities. It leverages the Chainlit Framework, along with Agents and Chains, to enhance the user experience.


Local LLMs 🔒

Models are hosted locally with Ollama and accessed through Langchain's ChatOllama integration. Running them locally gives full control over model choice and configuration, letting developers tailor performance and functionality to the requirements of a specific project.
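
For example, a model pulled with Ollama can be called from Python through Langchain's ChatOllama wrapper. The snippet below is only a minimal sketch: the model name is an example, and depending on your Langchain version the import may live in langchain_community.chat_models instead of langchain_ollama.

    # Minimal sketch: chat with a locally served Ollama model via Langchain.
    from langchain_ollama import ChatOllama  # or: from langchain_community.chat_models import ChatOllama

    llm = ChatOllama(model="llama3.1", temperature=0.2)  # any model you have pulled with `ollama pull`
    response = llm.invoke("Explain what a multimodal assistant is in one sentence.")
    print(response.content)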

Literalai cloud storage and observability ☁️👁️

Literal AI provides integrated data storage and observability for monitoring your AI model's performance. With built-in support for large datasets and real-time analytics, it lets you track, store, and visualize key metrics, giving insight into your models' behavior and outcomes. This simplifies debugging and performance tuning, which is valuable when building scalable AI applications.

Key Features 🌟

  1. Chat with Documents 📄

    Interact with documents in a variety of formats, including PDF, TXT, PY, DOCX, CSV, and image files, using a Retrieval Augmented Generation (RAG) approach.

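    A minimal sketch of this RAG flow, assuming Langchain document loaders, Ollama embeddings, and a Chroma vector store (the loaders and store used by the app may differ):

      # Hypothetical RAG pipeline: load a PDF, split it into chunks, embed the
      # chunks locally, retrieve the most relevant ones, and answer with a local LLM.
      from langchain_community.document_loaders import PyPDFLoader
      from langchain_text_splitters import RecursiveCharacterTextSplitter
      from langchain_community.vectorstores import Chroma
      from langchain_ollama import OllamaEmbeddings, ChatOllama

      chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(
          PyPDFLoader("report.pdf").load())
      store = Chroma.from_documents(chunks, OllamaEmbeddings(model="llama3.1"))

      question = "What are the key findings?"
      context = "\n\n".join(doc.page_content for doc in store.similarity_search(question, k=4))
      answer = ChatOllama(model="llama3.1").invoke(f"Use this context:\n{context}\n\nQuestion: {question}")
      print(answer.content)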

  2. Speech to Text and Text to Speech 🗣️

    Utilize multimodal capabilities of speech-to-text and text-to-speech to interact with the assistant via both text and audio.

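    The exact speech stack is implementation-specific; purely as an illustration, local speech-to-text could be done with the open-source whisper package and text-to-speech with pyttsx3 (the app's actual audio pipeline may use different libraries):

      # Hypothetical example: transcribe a recorded question, then speak the reply.
      import whisper      # openai-whisper, runs locally
      import pyttsx3      # offline text-to-speech

      text = whisper.load_model("base").transcribe("question.wav")["text"]
      print("You said:", text)

      engine = pyttsx3.init()
      engine.say("You asked: " + text)
      engine.runAndWait()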

  3. Image Generation 🖼️

    Generate images locally with Stable Diffusion, using the CompVis/stable-diffusion-v1-4 model downloaded through the diffusers library.

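    A minimal sketch using the diffusers library and the model named above (the prompt and output path are just examples):

      # Download CompVis/stable-diffusion-v1-4 via diffusers and generate one image.
      import torch
      from diffusers import StableDiffusionPipeline

      pipe = StableDiffusionPipeline.from_pretrained(
          "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16)
      pipe = pipe.to("cuda")  # use "cpu" (and drop torch_dtype) if no GPU is available, much slower
      image = pipe("a watercolor lighthouse at dusk").images[0]
      image.save("generated.png")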

  4. Wikipedia Search 🔍

    Search Wikipedia by starting a message with the word Wikipedia (or wikipedia) followed by your keywords; that first word is what triggers the functionality.

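    A sketch of how such a trigger could work, here using the third-party wikipedia package (the app may rely on a Langchain Wikipedia wrapper instead):

      import wikipedia

      def handle_wikipedia_command(message: str) -> str:
          # Only messages whose first word is "Wikipedia"/"wikipedia" trigger the search.
          first, _, query = message.partition(" ")
          if first.lower() != "wikipedia" or not query.strip():
              return "Usage: wikipedia <keywords>"
          return wikipedia.summary(query, sentences=3)

      print(handle_wikipedia_command("wikipedia Alan Turing"))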

  5. HTML Web Page Scraper 🌐

    Extract the HTML of a web page using BeautifulSoup.

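    A minimal sketch with requests and BeautifulSoup (the URL is a placeholder):

      import requests
      from bs4 import BeautifulSoup

      response = requests.get("https://example.com", timeout=10)
      soup = BeautifulSoup(response.text, "html.parser")
      print(soup.prettify())                  # the extracted HTML tree
      print(soup.get_text(" ", strip=True))   # or just the visible text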

  6. DuckDuckGo Web Search 🔎

    Run searches against the DuckDuckGo search engine, returning the first 10 results.

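    As an illustration, the duckduckgo_search package can return the top results (the app may use Langchain's DuckDuckGo tool instead):

      from duckduckgo_search import DDGS

      with DDGS() as ddgs:
          for hit in ddgs.text("chainlit framework", max_results=10):
              print(hit["title"], "-", hit["href"])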

  7. Resume Chats 💬🔄

    Users can effortlessly retrieve and continue previous chat sessions, ensuring a smooth conversation flow.

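    In Chainlit this is typically wired up through the on_chat_resume hook; the sketch below is illustrative only, and the app's own handler will contain its specific restore logic:

      import chainlit as cl

      @cl.on_chat_resume
      async def on_chat_resume(thread):
          # Called when a user reopens a stored conversation; restore any
          # in-memory state (e.g. message history) from the persisted thread here.
          cl.user_session.set("resumed_thread_id", thread.get("id"))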

  8. Literal AI Chat Storage ☁️👁️

    Integration with Literal AI enables efficient cloud storage of chat histories, allowing quick access to past interactions.


Setup Instructions ⚙️

To get started, follow these simple steps:

  1. Clone the repository
    Run the following command to clone the repository:

    git clone https://github.com/chrisputzu/multimodal-ai-assistant.git

    Then navigate into the project directory:

    cd multimodal-ai-assistant
  2. Install dependencies
    Install the required packages by running:

    pip install -r requirements.txt
  3. Download and Install Ollama
    Ollama is required for running models locally. Follow these instructions for your operating system:

    • Windows/Linux/macOS:
      Visit the Ollama Download Page and follow the instructions to install Ollama on your machine.
  4. Download Local LLMs

    After installing Ollama, you will need to download the models. Use the following commands:

    • For Llama3.1:
      ollama pull llama3.1
      
    • For Llama3.2:
      ollama pull llama3.2
      
    • For LLaVA:
      ollama pull llava-llama3
      
    • For CodeLlama:
      ollama pull codellama:7b
      
    • For MistralNemo-12b:
      ollama pull mistral-nemo
      
    • For Gemma-2:
      ollama pull gemma2
      
    • For Qwen2.5-7b:
      ollama pull qwen2.5:7b
      
    • For Phi-3.5:
      ollama pull phi3.5
      
  5. Literal AI

    To create an account on Literal AI, visit the website and click on Sign Up. After confirming your email, log in to the dashboard and create a new project by clicking on Create Project. Finally, navigate to Settings, locate the API Keys section, and click on Generate API Key to obtain your API key.

  6. Configure your environment

    Set up your Chainlit API by running the following command in your terminal:

    chainlit create-secret

    Create a .env file in the root directory and add your previously created API keys:

    CHAINLIT_API_KEY=<YOUR_CHAINLIT_API_KEY>

    LITERAL_API_KEY=<YOUR_LITERAL_API_KEY>

  7. Run the application

    Start the assistant with:

    chainlit run app.py --port <YOUR_PORT>

    This will open the app in your browser at http://localhost:<YOUR_PORT>.

Contact Information 📧

For inquiries, feel free to reach out via email:
Email: [email protected]
