Gradio demo of text-to-image using Stable Diffusion 3.5 Large
Full documentation is available on Hugging Face: Stable Diffusion Text-to-image
Estimated Inference Speed: 7 seconds for Stable Diffusion 3.5 Large on an NVIDIA H100 GPU
- Open a web browser, log in to Hugging Face, and register your name and email to gain access to stable-diffusion-3.5-large
- Create a new Hugging Face user access token, which will record that you completed the registration form
- Clone this repo to your machine and change into the directory for this demo:
cd ./stability-ai-toolkit/sd35-text-to-image-gradio
- Set up the app in a Python virtual environment:
python -m venv <your_environment_name>
source <your_environment_name>/bin/activate
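To confirm that the environment is active before continuing, a quick check along these lines may help. This is a minimal sketch and not part of the repo; the function name `in_virtualenv` is hypothetical. It assumes the environment was created with `venv`, in which case `sys.prefix` differs from `sys.base_prefix`.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-created environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

If this prints `False`, the `source .../bin/activate` step probably did not run in the current shell.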
- Set your HF_TOKEN inside your virtual environment:
export HF_TOKEN=<Hugging Face user access token>
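A missing or blank token is easy to overlook until the model download fails, so a small pre-flight check can be useful. The sketch below is an illustration only: `read_hf_token` is a hypothetical helper, not something defined in this repo, and it only verifies that the variable is set, not that the token is valid.

```python
import os


def read_hf_token(env=None) -> str:
    """Return the HF_TOKEN value, or raise RuntimeError if it is missing or blank."""
    # Allow a mapping to be passed in so the check can be exercised
    # without touching the real environment.
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is not set; export it before starting the app")
    return token


if __name__ == "__main__":
    try:
        token = read_hf_token()
        # Report only the length so the secret never ends up in logs.
        print(f"HF_TOKEN found ({len(token)} characters)")
    except RuntimeError as err:
        print(err)
```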
- Install dependencies:
pip install -r requirements.txt
NOTE: Read requirements.txt for macOS PyTorch installation instructions.
TL;DR:
# Inside your virtual environment
pip install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
- Start the app:
python app.py
- Open the UI in a web browser: http://127.0.0.1:7861
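The app may take a while to start, so the page can be unavailable for a short time after launch. As a rough sketch, the following checks whether anything answers at the address above using only the standard library. `ui_is_up` is a hypothetical helper name, and the default URL simply mirrors the one given in this README.

```python
import urllib.request


def ui_is_up(url: str = "http://127.0.0.1:7861", timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at url with a non-error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:
        # Covers refused connections, timeouts, and urllib's URLError/HTTPError,
        # which are all OSError subclasses: treat them as "not ready".
        return False


if __name__ == "__main__":
    print("UI reachable:", ui_is_up())
```

If this keeps returning `False`, check the terminal running `python app.py` for errors or for a different port in the startup output.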