Releases: mudler/LocalAI
v2.22.1
What's Changed
Bug fixes 🐛
- fix(vllm): images and videos are base64 by default by @mudler in #3867
- fix(dependencies): pin pytorch version by @mudler in #3872
- fix(dependencies): move deps that brings pytorch by @mudler in #3873
- fix(vllm): do not set videos if we don't have any by @mudler in #3885
Exciting New Features 🎉
- feat(templates): extract text from multimodal requests by @mudler in #3866
- feat(templates): add sprig to multimodal templates by @mudler in #3868
🧠 Models
- models(gallery): add llama-3_8b_unaligned_beta by @mudler in #3818
- models(gallery): add llama3.1-flammades-70b by @mudler in #3819
- models(gallery): add llama3.1-gutenberg-doppel-70b by @mudler in #3820
- models(gallery): add llama-3.1-8b-arliai-formax-v1.0-iq-arm-imatrix by @mudler in #3821
- models(gallery): add supernova-medius by @mudler in #3822
- models(gallery): add hermes-3-llama-3.1-70b-lorablated by @mudler in #3823
- models(gallery): add hermes-3-llama-3.1-8b-lorablated by @mudler in #3824
- models(gallery): add eva-qwen2.5-14b-v0.1-i1 by @mudler in #3825
- models(gallery): add cursorcore-qw2.5-7b-i1 by @mudler in #3826
- models(gallery): add cursorcore-qw2.5-1.5b-lc-i1 by @mudler in #3827
- models(gallery): add cursorcore-ds-6.7b-i1 by @mudler in #3828
- models(gallery): add cursorcore-yi-9b by @mudler in #3829
- models(gallery): add edgerunner-command-nested-i1 by @mudler in #3830
- models(gallery): add llama-3.2-chibi-3b by @mudler in #3843
- models(gallery): add llama-3.2-3b-reasoning-time by @mudler in #3844
- models(gallery): add ml-ms-etheris-123b by @mudler in #3845
- models(gallery): add doctoraifinetune-3.1-8b-i1 by @mudler in #3846
- models(gallery): add astral-fusion-neural-happy-l3.1-8b by @mudler in #3848
- models(gallery): add tsunami-0.5x-7b-instruct-i1 by @mudler in #3849
- models(gallery): add mahou-1.5-llama3.1-70b-i1 by @mudler in #3850
- models(gallery): add llama-3.1-nemotron-70b-instruct-hf by @mudler in #3854
- models(gallery): add qevacot-7b-v2 by @mudler in #3855
- models(gallery): add l3.1-etherealrainbow-v1.0-rc1-8b by @mudler in #3856
- models(gallery): add phi-3.5-mini-titanfusion-0.2 by @mudler in #3857
- models(gallery): add mn-lulanum-12b-fix-i1 by @mudler in #3859
- models(gallery): add apollo2-9b by @mudler in #3860
- models(gallery): add theia-llama-3.1-8b-v1 by @mudler in #3861
- models(gallery): add tor-8b by @mudler in #3862
- models(gallery): add darkens-8b by @mudler in #3863
- models(gallery): add baldur-8b by @mudler in #3864
- models(gallery): add meissa-qwen2.5-7b-instruct by @mudler in #3865
- models(gallery): add phi-3 vision by @mudler in #3890
👒 Dependencies
- chore(deps): Bump docs/themes/hugo-theme-relearn from `d5a0ee0` to `e1a1f01` by @dependabot in #3798
- chore(deps): Bump mxschmitt/action-tmate from 3.18 to 3.19 by @dependabot in #3799
- chore(deps): Bump sentence-transformers from 3.1.1 to 3.2.0 in /backend/python/sentencetransformers by @dependabot in #3801
- chore(deps): Bump langchain from 0.3.2 to 0.3.3 in /examples/langchain/langchainpy-localai-example by @dependabot in #3803
- chore(deps): Bump llama-index from 0.11.16 to 0.11.17 in /examples/langchain-chroma by @dependabot in #3804
- chore(deps): Bump python from 3.12-bullseye to 3.13-bullseye in /examples/langchain by @dependabot in #3805
- chore(deps): Bump openai from 1.51.1 to 1.51.2 in /examples/functions by @dependabot in #3806
- chore(deps): Bump llama-index from 0.11.16 to 0.11.17 in /examples/chainlit by @dependabot in #3807
- chore(deps): Bump langchain from 0.3.1 to 0.3.3 in /examples/langchain-chroma by @dependabot in #3809
- chore(deps): Bump openai from 1.51.1 to 1.51.2 in /examples/langchain/langchainpy-localai-example by @dependabot in #3808
- chore(deps): Bump yarl from 1.13.1 to 1.15.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #3816
- chore(deps): Bump chromadb from 0.5.11 to 0.5.13 in /examples/langchain-chroma by @dependabot in #3811
- chore(deps): Bump langchain from 0.3.2 to 0.3.3 in /examples/functions by @dependabot in #3802
- chore(deps): Bump debugpy from 1.8.6 to 1.8.7 in /examples/langchain/langchainpy-localai-example by @dependabot in #3814
- chore(deps): Bump aiohttp from 3.10.9 to 3.10.10 in /examples/langchain/langchainpy-localai-example by @dependabot in #3812
- chore(deps): Bump openai from 1.51.1 to 1.51.2 in /examples/langchain-chroma by @dependabot in #3810
- chore(deps): Bump charset-normalizer from 3.3.2 to 3.4.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3834
- chore(deps): Bump langchain-community from 0.3.1 to 0.3.2 in /examples/langchain/langchainpy-localai-example by @dependabot in #3831
- chore(deps): Bump yarl from 1.15.1 to 1.15.2 in /examples/langchain/langchainpy-localai-example by @dependabot in #3832
- chore(deps): Bump numpy from 2.1.1 to 2.1.2 in /examples/langchain/langchainpy-localai-example by @dependabot in #3833
- chore(deps): Bump docs/themes/hugo-theme-relearn from `e1a1f01` to `007cc20` by @dependabot in #3835
- chore(deps): Bump gradio from 3.48.0 to 5.0.0 in /backend/python/openvoice in the pip group by @dependabot in #3880
- chore(deps): bump llama-cpp to cda0e4b648dde8fac162b3430b14a99597d3d74f by @mudler in #3884
Other Changes
- docs: ⬆️ update docs version mudler/LocalAI by @localai-bot in #3796
- chore: dependabot ignore generated grpc go package by @dave-gray101 in #3795
- chore: ⬆️ Update ggerganov/llama.cpp to `edc265661cd707327297b6ec4d83423c43cb50a5` by @localai-bot in #3797
- chore: ⬆️ Update ggerganov/llama.cpp to `d4c19c0f5cdb1e512573e8c86c79e8d0238c73c4` by @localai-bot in #3817
- chore: ⬆️ Update ggerganov/llama.cpp to `a89f75e1b7b90cb2d4d4c52ca53ef9e9b466aa45` by @localai-bot in #3837
- chore: ⬆️ Update ggerganov/whisper.cpp to `06a1da9daff94c1bf1b1d38950628264fe443f76` by @localai-bot in #3836
- Update integrations.md with LLPhant by @f-lombardo in #3838
- fix(llama.cpp): consider also native builds by @mudler in #3839
- chore: ⬆️ Update ggerganov/whisper.cpp to `b6049060dd2341b7816d2bce7dc7451c1665828e` by @localai-bot in #3842
- chore: ⬆️ Update ggerganov/llama.cpp to `755a9b2bf00fbae988e03a47e852b66eaddd113a` by @localai-bot in #3841
- chore(deps): bump grpcio to 1.67.0 by @mudler in #3851
- chore: ⬆️ Update ggerganov/llama.cpp to `9e041024481f6b249ab8918e18b9477f873b5a5e` by @localai-bot in #3853
- chore: ⬆️ Update ggerganov/whisper.cpp to `d3f7137cc9befa6d74dc4085de2b664b97b7c8bb` by @localai-bot in #3852
- fix(mamba): pin torch version by @mudler in #3871
- chore: ⬆️ Update ggerganov/llama.cpp to `99bd4ac28c32cd17c0e337ff5601393b033dc5fc` by @localai-bot in #3869
- chore: ⬆️ Update ggerganov/whisper.cpp to `a5abfe6a90495f7bf19fe70d016ecc255e97359c` by @localai-bot in #3870
- chore(deps): pin packaging by @mudler ...
v2.22.0
LocalAI v2.22.0 is out 🥳
💡 Highlights
- Image-to-Text and Video-to-Text Support: The vLLM backend now supports both image-to-text and video-to-text processing.
- Enhanced Multimodal Support: Template placeholders are now available, offering more flexibility in multimodal applications.
- Model Management Made Easy: List all your loaded models directly via the `/system` endpoint (see the example after this list).
- Various bugfixes and improvements: Fixed issues with dangling processes to ensure proper resource management and resolved channel closure issues in the base gRPC server.
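For example, you can query the new endpoint with a plain GET request (a minimal sketch: the endpoint path is from these notes, while the port assumes LocalAI's default setup and the exact response shape may vary between versions):

```bash
# List system information, including the currently loaded models
# (default port assumed; inspect the JSON response for the model list)
curl http://localhost:8080/system
```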
🖼️ Multimodal vLLM
To use multimodal models with vLLM, simply specify the model in the YAML config file. Note, however, that models can differ in whether they support multiple images or only a single image, and in how they internally process image placeholders.
Some models and libraries express image, video, or audio placeholders differently: for example, the llama.cpp backend expects images within an `[img-ID]` tag, while other backends/models (e.g. vLLM) use a different notation (`<|image_|>`).
To override the defaults, it is now possible to set the following in the model configuration:
```yaml
template:
  video: "<|video_{{.ID}}|> {{.Text}}"
  image: "<|image_{{.ID}}|> {{.Text}}"
  audio: "<|audio_{{.ID}}|> {{.Text}}"
```
📹 Video and Audio understanding
Some libraries might support both video and audio. Currently only vLLM supports video understanding, which can be used through the API by "extending" the OpenAI API with `audio` and `video` content types alongside images:
```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "What'\''s in this video?"
          },
          {
            "type": "video_url",
            "video_url": {
              "url": "https://video-image-url"
            }
          }
        ]
      }
    ],
    "max_tokens": 300
  }'
```
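By analogy, an audio request would presumably use an `audio_url` content part. The sketch below is a guess at the shape: the `audio_url` field name simply mirrors the `video_url` pattern above and is an assumption, not something these notes confirm:

```bash
# Hedged sketch: "audio_url" mirrors the "video_url" pattern above (assumption)
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "What is said in this audio clip?" },
          { "type": "audio_url", "audio_url": { "url": "https://example.com/clip.wav" } }
        ]
      }
    ],
    "max_tokens": 300
  }'
```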
🧑🏭 Work in progress
- Realtime API is a work in progress, tracked in #3714. Give it a thumbs up if you want to see it supported in LocalAI!
What's Changed
Bug fixes 🐛
- chore: simplify model loading by @mudler in #3715
- fix(initializer): correctly reap dangling processes by @mudler in #3717
- fix(base-grpc): close channel in base grpc server by @mudler in #3734
- fix(vllm): bump cmake - vllm requires it by @mudler in #3744
- fix(llama-cpp): consistently select fallback by @mudler in #3789
- fix(welcome): do not list model twice if we have a config by @mudler in #3790
- fix: listmodelservice / welcome endpoint use LOOSE_ONLY by @dave-gray101 in #3791
Exciting New Features 🎉
- feat(api): list loaded models in `/system` by @mudler in #3661
- feat: Add Get Token Metrics to GRPC server by @siddimore in #3687
- refactor: ListModels Filtering Upgrade by @dave-gray101 in #2773
- feat: track internally started models by ID by @mudler in #3693
- feat: tokenization endpoint by @shraddhazpy in #3710
- feat(multimodal): allow to template placeholders by @mudler in #3728
- feat(vllm): add support for image-to-text and video-to-text by @mudler in #3729
- feat(shutdown): allow force shutdown of backends by @mudler in #3733
- feat(transformers): Use downloaded model for Transformers backend if it already exists. by @joshbtn in #3777
- fix: roll out bluemonday Sanitize more widely by @dave-gray101 in #3794
🧠 Models
- models(gallery): add llama-3.2 3B and 1B by @mudler in #3671
- chore(model-gallery): ⬆️ update checksum by @localai-bot in #3675
- models(gallery): add magnusintellectus-12b-v1-i1 by @mudler in #3678
- models(gallery): add bigqwen2.5-52b-instruct by @mudler in #3679
- feat(api): add correlationID to Track Chat requests by @siddimore in #3668
- models(gallery): add replete-llm-v2.5-qwen-14b by @mudler in #3688
- models(gallery): add replete-llm-v2.5-qwen-7b by @mudler in #3689
- models(gallery): add calme-2.2-qwen2.5-72b-i1 by @mudler in #3691
- models(gallery): add salamandra-7b-instruct by @mudler in #3726
- models(gallery): add mn-backyardai-party-12b-v1-iq-arm-imatrix by @mudler in #3740
- models(gallery): add t.e-8.1-iq-imatrix-request by @mudler in #3741
- models(gallery): add violet_twilight-v0.2-iq-imatrix by @mudler in #3742
- models(gallery): add gemma-2-9b-it-abliterated by @mudler in #3743
- models(gallery): add moe-girl-1ba-7bt-i1 by @mudler in #3766
- models(gallery): add archfunctions models by @mudler in #3767
- models(gallery): add versatillama-llama-3.2-3b-instruct-abliterated by @mudler in #3771
- models(gallery): add llama3.2-3b-enigma by @mudler in #3772
- models(gallery): add llama3.2-3b-esper2 by @mudler in #3773
- models(gallery): add llama-3.1-swallow-70b-v0.1-i1 by @mudler in #3774
- models(gallery): add rombos-llm-v2.5.1-qwen-3b by @mudler in #3778
- models(gallery): add qwen2.5-7b-ins-v3 by @mudler in #3779
- models(gallery): add dans-personalityengine-v1.0.0-8b by @mudler in #3780
- models(gallery): add llama-3.2-3b-agent007 by @mudler in #3781
- models(gallery): add nihappy-l3.1-8b-v0.09 by @mudler in #3782
- models(gallery): add llama-3.2-3b-agent007-coder by @mudler in #3783
- models(gallery): add fireball-meta-llama-3.2-8b-instruct-agent-003-128k-code-dpo by @mudler in #3784
- models(gallery): add gemma-2-ataraxy-v3i-9b by @mudler in #3785
📖 Documentation and examples
👒 Dependencies
- chore: ⬆️ Update ggerganov/llama.cpp to `ea9c32be71b91b42ecc538bd902e93cbb5fb36cb` by @localai-bot in #3667
- chore: ⬆️ Update ggerganov/whisper.cpp to `69339af2d104802f3f201fd419163defba52890e` by @localai-bot in #3666
- chore: ⬆️ Update ggerganov/llama.cpp to `95bc82fbc0df6d48cf66c857a4dda3d044f45ca2` by @localai-bot in #3674
- chore: ⬆️ Update ggerganov/llama.cpp to `b5de3b74a595cbfefab7eeb5a567425c6a9690cf` by @localai-bot in #3681
- chore: ⬆️ Update ggerganov/whisper.cpp to `8feb375fbdf0277ad36958c218c6bf48fa0ba75a` by @localai-bot in #3680
- chore: ⬆️ Update ggerganov/llama.cpp to `c919d5db39c8a7fcb64737f008e4b105ee0acd20` by @localai-bot in #3686
- chore(deps): bump grpcio to 1.66.2 by @mudler in #3690
- chore(deps): Bump openai from 1.47.1 to 1.50.2 in /examples/langchain-chroma by @dependabot in #3697
- chore(deps): Bump chromadb from 0.5.7 to 0.5.11 in /examples/langchain-chroma by @dependabot in #3696
- chore(deps): Bump langchain from 0.3.0 to 0.3.1 in /examples/langchain-chroma by @dependabot in #3694
- chore: ⬆️ Update ggerganov/llama.cpp to `6f1d9d71f4c568778a7637ff6582e6f6ba5fb9d3` by @localai-bot in #3708
- chore(deps): Bump securego/gosec from 2.21.0 to 2.21.4 by @dependabot in #3698
- chore(deps): Bump openai from 1.47.1 to 1.50.2 in /examples/functions by @dependabot in #3699
- chore(deps): Bump langchain from 0.3.0 to 0.3.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #3704
- chore(deps): Bump greenlet from 3.1.0 to 3.1.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #3703
- chore(deps): Bump langchain from 0.3.0 to 0.3.1 in /examples/functions by @dependabot in #3700
- chore(deps): Bump langchain-community from 0.2.16 to 0.3.1 in /examples/langchain/langchainpy-localai-example by ...
v2.21.1
What's Changed
Bug fixes 🐛
👒 Dependencies
- chore(deps): Bump sentence-transformers from 3.1.0 to 3.1.1 in /backend/python/sentencetransformers by @dependabot in #3651
- chore(deps): Bump pydantic from 2.8.2 to 2.9.2 in /examples/langchain/langchainpy-localai-example by @dependabot in #3648
- chore(deps): Bump openai from 1.45.1 to 1.47.1 in /examples/functions by @dependabot in #3645
- chore: ⬆️ Update ggerganov/llama.cpp to `70392f1f81470607ba3afef04aa56c9f65587664` by @localai-bot in #3659
- chore(deps): Bump llama-index from 0.11.7 to 0.11.12 in /examples/langchain-chroma by @dependabot in #3639
- chore(deps): Bump openai from 1.45.1 to 1.47.1 in /examples/langchain-chroma by @dependabot in #3641
- chore(deps): Bump llama-index from 0.11.9 to 0.11.12 in /examples/chainlit by @dependabot in #3642
- chore: ⬆️ Update ggerganov/whisper.cpp to `0d2e2aed80109e8696791083bde3b58e190b7812` by @localai-bot in #3658
- chore(deps): Bump chromadb from 0.5.5 to 0.5.7 in /examples/langchain-chroma by @dependabot in #3640
Other Changes
- docs: ⬆️ update docs version mudler/LocalAI by @localai-bot in #3657
Full Changelog: v2.21.0...v2.21.1
v2.21.0
💡 Highlights!
LocalAI v2.21 release is out!
- Deprecation of the `exllama` backend
- AIO images now have `gpt-4o` instead of `gpt-4-vision-preview` for the Vision API
- vLLM backend now supports embeddings (see the example after this list)
- New endpoint to list system information (`/system`)
- `trust_remote_code` is now respected by the `sentencetransformers` backend
- Auto warm-up and load models on start
- `coqui` backend switched to the community-maintained fork
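Since embeddings are served through the OpenAI-compatible endpoint, a request against a vLLM-backed embedding model would presumably look like the sketch below (hedged: the model name is a placeholder and the request shape follows the OpenAI convention LocalAI implements):

```bash
# Hedged sketch: "my-vllm-embedding-model" is a placeholder model name
curl http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-vllm-embedding-model",
    "input": "A sentence to embed."
  }'
```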
What's Changed
Breaking Changes 🛠
- chore(exllama): drop exllama backend by @mudler in #3536
- chore(aio): rename gpt-4-vision-preview to gpt-4o by @mudler in #3597
Exciting New Features 🎉
- feat: elevenlabs `sound-generation` api by @dave-gray101 in #3355
- feat(vllm): add support for embeddings by @mudler in #3440
- feat: add endpoint to list system informations by @mudler in #3449
- feat: extract output with regexes from LLMs by @mudler in #3491
- feat: allow setting trust_remote_code for sentencetransformers backend by @Nyralei in #3552
- feat(api): allow to pass videos to backends by @mudler in #3601
- feat(api): allow to pass audios to backends by @mudler in #3603
- feat: auto load into memory on startup by @sozercan in #3627
- feat(coqui): switch to maintained community fork by @mudler in #3625
Bug fixes 🐛
- fix(p2p): correctly allow to pass extra args to llama.cpp by @mudler in #3368
- fix(model-loading): keep track of open GRPC Clients by @mudler in #3377
- fix(tts): check error before inspecting result by @mudler in #3415
- fix(shutdown): do not shutdown immediately busy backends by @mudler in #3543
- fix(parler-tts): fix install with sycl by @mudler in #3624
- fix(ci): fixup checksum scanning pipeline by @mudler in #3631
- fix(hipblas): do not push all variants to hipblas builds by @mudler in #3630
🧠 Models
- chore(model-gallery): add more quants for popular models by @mudler in #3365
- models(gallery): add phi-3.5 by @mudler in #3376
- models(gallery): add calme-2.1-phi3.5-4b-i1 by @mudler in #3383
- models(gallery): add magnum-v3-34b by @mudler in #3384
- models(gallery): add phi-3.5-vision by @mudler in #3421
- Revert "models(gallery): add phi-3.5-vision" by @mudler in #3422
- chore(model-gallery): ⬆️ update checksum by @localai-bot in #3425
- feat: Added Piper voice it-paola-medium by @fakezeta in #3434
- chore(model-gallery): ⬆️ update checksum by @localai-bot in #3442
- models(gallery): add hubble-4b-v1 by @mudler in #3444
- chore(model-gallery): ⬆️ update checksum by @localai-bot in #3446
- models(gallery): add yi-coder (and variants) by @mudler in #3482
- chore(model-gallery): ⬆️ update checksum by @localai-bot in #3486
- models(gallery): add reflection-llama-3.1-70b by @mudler in #3487
- models(gallery): add athena-codegemma-2-2b-it by @mudler in #3490
- models(gallery): add azure_dusk-v0.2-iq-imatrix by @mudler in #3538
- models(gallery): add mn-12b-lyra-v4-iq-imatrix by @mudler in #3539
- models(gallery): add datagemma models by @mudler in #3540
- models(gallery): add l3.1-8b-niitama-v1.1-iq-imatrix by @mudler in #3550
- models(gallery): add llama-3.1-8b-stheno-v3.4-iq-imatrix by @mudler in #3551
- fix: `gallery/index.yaml` comment spacing by @dave-gray101 in #3585
- models(gallery): add qwen2.5-14b-instruct by @mudler in #3607
- models(gallery): add qwen2.5-math-7b-instruct by @mudler in #3609
- models(gallery): add qwen2.5-14b_uncencored by @mudler in #3610
- models(gallery): add qwen2.5-coder-7b-instruct by @mudler in #3611
- models(gallery): add qwen2.5-math-72b-instruct by @mudler in #3612
- models(gallery): add qwen2.5-0.5b-instruct, qwen2.5-1.5b-instruct by @mudler in #3613
- models(gallery): add qwen2.5 32B, 72B, 32B Instruct by @mudler in #3614
- models(gallery): add llama-3.1-supernova-lite-reflection-v1.0-i1 by @mudler in #3615
- models(gallery): add llama-3.1-supernova-lite by @mudler in #3616
- models(gallery): add llama3.1-8b-shiningvaliant2 by @mudler in #3617
- models(gallery): add buddy2 by @mudler in #3618
- models(gallery): add llama-3.1-8b-arliai-rpmax-v1.1 by @mudler in #3619
- Fix NeuralDaredevil URL by @nyx4ris in #3621
- models(gallery): add nightygurps-14b-v1.1 by @mudler in #3633
- models(gallery): add gemma-2-9b-arliai-rpmax-v1.1 by @mudler in #3634
- models(gallery): add gemma-2-2b-arliai-rpmax-v1.1 by @mudler in #3635
- models(gallery): add acolyte-22b-i1 by @mudler in #3636
📖 Documentation and examples
- docs: ⬆️ update docs version mudler/LocalAI by @localai-bot in #3366
- chore(docs): add Vulkan images links by @mudler in #3620
👒 Dependencies
- chore: ⬆️ Update ggerganov/llama.cpp to `3ba780e2a8f0ffe13f571b27f0bbf2ca5a199efc` by @localai-bot in #3361
- chore(deps): Bump openai from 1.41.1 to 1.42.0 in /examples/functions by @dependabot in #3390
- chore(deps): Bump docs/themes/hugo-theme-relearn from `82a5e98` to `3a0ae52` by @dependabot in #3391
- chore(deps): Bump idna from 3.7 to 3.8 in /examples/langchain/langchainpy-localai-example by @dependabot in #3399
- chore(deps): Bump llama-index from 0.10.65 to 0.11.1 in /examples/chainlit by @dependabot in #3404
- chore(deps): Bump llama-index from 0.10.67.post1 to 0.11.1 in /examples/langchain-chroma by @dependabot in #3406
- chore(deps): Bump marshmallow from 3.21.3 to 3.22.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3400
- chore(deps): Bump openai from 1.40.5 to 1.42.0 in /examples/langchain-chroma by @dependabot in #3405
- chore(deps): Bump openai from 1.41.1 to 1.42.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3401
- chore(deps): update edgevpn to v0.28 by @mudler in #3412
- chore(deps): Bump langchain from 0.2.14 to 0.2.15 in /examples/functions by @dependabot in #3453
- chore(deps): Bump certifi from 2024.7.4 to 2024.8.30 in /examples/langchain/langchainpy-localai-example by @dependabot in #3457
- chore(deps): Bump yarl from 1.9.4 to 1.9.7 in /examples/langchain/langchainpy-localai-example by @dependabot in #3459
- chore(deps): Bump langchain-community from 0.2.12 to 0.2.15 in /examples/langchain/langchainpy-localai-example by @dependabot in #3461
- chore(deps): Bump llama-index from 0.11.1 to 0.11.4 in /examples/chainlit by @dependabot in #3462
- chore(deps): Bump llama-index from 0.11.1 to 0.11.4 in /examples/langchain-chroma by @dependabot in #3467
- chore(deps): Bump docs/themes/hugo-theme-relearn from `3a0ae52` to `550a6ee` by @dependabot in #3472
- chore(deps): Bump openai from 1.42.0 to 1.43.0 in /examples/functions by @dependabot in #3452
- chore(deps): Bump langchain from 0.2.14 to 0.2.15 in /examples/langchain/langchainpy-localai-example by @dependabot in #3460
- chore(deps): Bump openai from 1.42.0 to 1.43.0 in /examples/langchain-chroma by @dependabot in #3468
- chore(deps): Bump langchain from 0.2.14 to 0.2.15 in /examples/langchain-chroma by ...
v2.20.1
It's that time again—I’m excited (and honestly, a bit proud) to announce the release of LocalAI v2.20! This one’s a biggie, with some of the most requested features and enhancements, all designed to make your self-hosted AI journey even smoother and more powerful.
TL;DR
- 🌍 Explorer & Community: Explore global community pools at explorer.localai.io
- 👀 Demo instance available: Test out LocalAI at demo.localai.io
- 🤗 Integration: Hugging Face Local apps now include LocalAI
- 🐛 Bug Fixes: Diffusers and hipblas issues resolved
- 🎨 New Feature: FLUX-1 image generation support
- 🏎️ Strict Mode: Stay compliant with OpenAI’s latest API changes
- 💪 Multiple P2P Clusters: Run multiple clusters within the same network
- 🧪 Deprecation Notice: `gpt4all.cpp` and `petals` backends deprecated
🌍 Explorer and Global Community Pools
Now you can share your LocalAI instance with the global community or explore available instances by visiting explorer.localai.io. This decentralized network powers our demo instance, creating a truly collaborative AI experience.
How It Works
Using the Explorer, you can easily share or connect to clusters. For detailed instructions on creating new clusters or connecting to existing ones, check out our documentation.
👀 Demo Instance Now Available
Curious about what LocalAI can do? Dive right in with our live demo at demo.localai.io! Thanks to our generous sponsors, this instance is publicly available and configured via peer-to-peer (P2P) networks. If you'd like to connect, follow the instructions here.
🤗 Hugging Face Integration
I am excited to announce that LocalAI is now integrated within Hugging Face’s local apps! This means you can select LocalAI directly within Hugging Face to build and deploy models with the power and flexibility of our platform. Experience seamless integration with a single click!
This integration was made possible through this PR.
🎨 FLUX-1 Image Generation Support
FLUX-1 lands in LocalAI! With this update, LocalAI can now generate stunning images using FLUX-1, even in federated mode. Whether you're experimenting with new designs or creating production-quality visuals, FLUX-1 has you covered.
Try it out at demo.localai.io and see what LocalAI + FLUX-1 can do!
🐛 Diffusers and hipblas Fixes
Great news for AMD users! If you’ve encountered issues with the Diffusers backend or hipblas, those bugs have been resolved. We’ve transitioned to `uv` for managing Python dependencies, ensuring a smoother experience. For more details, check out Issue #1592.
🏎️ Strict Mode for API Compliance
To stay up to date with OpenAI’s latest changes, LocalAI now also supports Strict Mode (https://openai.com/index/introducing-structured-outputs-in-the-api/). This new feature ensures compatibility with the most recent API updates, enforcing stricter JSON outputs using BNF grammar rules.
To activate it, simply set `strict: true` in your API calls, even if it’s disabled in your configuration.
Key Notes:
- Setting `strict: true` enables grammar enforcement, even if disabled in your config.
- If `format_type` is set to `json_schema`, BNF grammars will be automatically generated from the schema.
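As an illustration, a schema-enforced request might look like the sketch below. This follows the OpenAI structured-outputs request shape; the exact field names LocalAI accepts, and where `strict` is placed, are assumptions based on that convention:

```bash
# Hedged sketch following the OpenAI structured-outputs request shape;
# field placement (notably "strict") is an assumption.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Give me a city and its country as JSON."}],
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "city",
        "strict": true,
        "schema": {
          "type": "object",
          "properties": {
            "city": {"type": "string"},
            "country": {"type": "string"}
          },
          "required": ["city", "country"]
        }
      }
    }
  }'
```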
🛑 Disable Gallery
Need to streamline your setup? You can now disable the gallery endpoint using `LOCALAI_DISABLE_GALLERY_ENDPOINT`. For more options, check out the full list of commands with `--help`.
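For instance (a minimal sketch: the environment variable comes from these notes, while the binary name and value format are assumptions):

```bash
# Hedged sketch: start LocalAI with the gallery endpoint disabled
LOCALAI_DISABLE_GALLERY_ENDPOINT=true local-ai
```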
🌞 P2P and Federation Enhancements
Several enhancements have been made to improve your experience with P2P and federated clusters:
- Load Balancing by Default: This feature is now enabled by default (disable it with `LOCALAI_RANDOM_WORKER` if needed).
- Target Specific Workers: Directly target workers in federated mode using `LOCALAI_TARGET_WORKER`.
💪 Run Multiple P2P Clusters in the Same Network
You can now run multiple clusters within the same network by specifying a network ID via the CLI. This allows you to logically separate clusters while using the same shared token. Just set `LOCALAI_P2P_NETWORK_ID` to a UUID that matches across instances.
Please note, while this offers segmentation, it’s not fully secure—anyone with the network token can view available services within the network.
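For example (a hedged sketch: `LOCALAI_P2P_NETWORK_ID` is from these notes, while the binary invocation and the example UUIDs are assumptions):

```bash
# Hedged sketch: two logically separate clusters on the same physical network.
# Instances sharing the same network ID (and token) discover each other.
LOCALAI_P2P_NETWORK_ID=6f9b5f2b-1c0a-4f6e-9e0d-1234567890ab local-ai run --p2p   # cluster A
LOCALAI_P2P_NETWORK_ID=0d7a1c9e-5b3f-4a2d-8c6e-0987654321ba local-ai run --p2p   # cluster B
```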
🧪 Deprecation Notice: `gpt4all.cpp` and `petals` Backends
As we continue to evolve, we are officially deprecating the `gpt4all.cpp` and `petals` backends. The newer `llama.cpp` offers a superior set of features and better performance, making it the preferred choice moving forward.
From this release onward, `gpt4all` models in `ggml` format are no longer compatible. Additionally, the `petals` backend has been deprecated as well. LocalAI’s new P2P capabilities now offer a comprehensive replacement for these features.
What's Changed
Breaking Changes 🛠
Bug fixes 🐛
- fix(ui): do not show duplicate entries if not installed by gallery by @mudler in #3107
- fix: be consistent in downloading files, check for scanner errors by @mudler in #3108
- fix: ensure correct version of torch is always installed based on BUI… by @cryptk in #2890
- fix(python): move accelerate and GPU-specific libs to build-type by @mudler in #3194
- fix(apple): disable BUILD_TYPE metal on fallback by @mudler in #3199
- fix(vall-e-x): pin hipblas deps by @mudler in #3201
- fix(diffusers): use nightly rocm for hipblas builds by @mudler in #3202
- fix(explorer): reset counter when network is active by @mudler in #3213
- fix(p2p): allocate tunnels only when needed by @mudler in #3259
- fix(gallery): be consistent and disable UI routes as well by @mudler in #3262
- fix(parler-tts): bump and require after build type deps by @mudler in #3272
- fix: add llvm to extra images by @mudler in #3321
- fix(p2p): re-use p2p host when running federated mode by @mudler in #3341
- fix(ci): pin to llvmlite 0.43 by @mudler in #3342
- fix(p2p): avoid starting the node twice by @mudler in #3349
- fix(chat): re-generated uuid, created, and text on each request by @mudler in #3359
Exciting New Features 🎉
- feat(guesser): add gemma2 by @sozercan in #3118
- feat(venv): shared env by @mudler in #3195
- feat(openai): add `json_schema` format type and strict mode by @mudler in #3193
- feat(p2p): allow to run multiple clusters in the same p2p network by @mudler in #3128
- feat(p2p): add network explorer and community pools by @mudler in #3125
- feat(explorer): relax token deletion with error threshold by @mudler in #3211
- feat(diffusers): support flux models by @mudler in #3129
- feat(explorer): make possible to run sync in a separate process by @mudler in #3224
- feat(federated): allow to pickup a specific worker, improve loadbalancing by @mudler in #3243
- feat: Initial Version of vscode DevContainer by @dave-gray101 in #3217
- feat(explorer): visual improvements by @mudler in #3247
- feat(gallery): lazy load images by @mudler in #3246
- chore(explorer): add join instructions by @mudler in #3255
- chore: allow to disable gallery endpoints, improve p2p connection handling by @mudler in #3256
- chore(ux): add animated header with anime.js in p2p sections by @mudler in #3271
- chore(p2p): make commands easier to copy-paste by @mudler in #3273
- chore(ux): ...
v2.20.0
TL;DR
- 🌍 Explorer & Community: Explore global community pools at explorer.localai.io
- 👀 Demo instance available: Test out LocalAI at demo.localai.io
- 🤗 Integration: Hugging Face Local apps now include LocalAI
- 🐛 Bug Fixes: Diffusers and hipblas issues resolved
- 🎨 New Feature: FLUX-1 image generation support
- 🏎️ Strict Mode: Stay compliant with OpenAI’s latest API changes
- 💪 Multiple P2P Clusters: Run multiple clusters within the same network
- 🧪 Deprecation Notice: `gpt4all.cpp` and `petals` backends deprecated
🌍 Explorer and Global Community Pools
Now you can share your LocalAI instance with the global community or explore available instances by visiting explorer.localai.io. This decentralized network powers our demo instance, creating a truly collaborative AI experience.
How It Works
Using the Explorer, you can easily share or connect to clusters. For detailed instructions on creating new clusters or connecting to existing ones, check out our documentation.
👀 Demo Instance Now Available
Curious about what LocalAI can do? Dive right in with our live demo at demo.localai.io! Thanks to our generous sponsors, this instance is publicly available and configured via peer-to-peer (P2P) networks. If you'd like to connect, follow the instructions here.
🤗 Hugging Face Integration
I am excited to announce that LocalAI is now integrated within Hugging Face’s local apps! This means you can select LocalAI directly within Hugging Face to build and deploy models with the power and flexibility of our platform. Experience seamless integration with a single click!
This integration was made possible through this PR.
🎨 FLUX-1 Image Generation Support
FLUX-1 lands in LocalAI! With this update, LocalAI can now generate stunning images using FLUX-1, even in federated mode. Whether you're experimenting with new designs or creating production-quality visuals, FLUX-1 has you covered.
Try it out at demo.localai.io and see what LocalAI + FLUX-1 can do!
🐛 Diffusers and hipblas Fixes
Great news for AMD users! If you’ve encountered issues with the Diffusers backend or hipblas, those bugs have been resolved. We’ve transitioned to `uv` for managing Python dependencies, ensuring a smoother experience. For more details, check out Issue #1592.
🏎️ Strict Mode for API Compliance
To stay up to date with OpenAI’s latest changes, LocalAI now also supports Strict Mode (https://openai.com/index/introducing-structured-outputs-in-the-api/). This new feature ensures compatibility with the most recent API updates, enforcing stricter JSON outputs using BNF grammar rules.
To activate it, simply set `strict: true` in your API calls, even if it’s disabled in your configuration.
Key Notes:
- Setting `strict: true` enables grammar enforcement, even if disabled in your config.
- If `format_type` is set to `json_schema`, BNF grammars will be automatically generated from the schema.
🛑 Disable Gallery
Need to streamline your setup? You can now disable the gallery endpoint using `LOCALAI_DISABLE_GALLERY_ENDPOINT`. For more options, check out the full list of commands with `--help`.
🌞 P2P and Federation Enhancements
Several enhancements have been made to improve your experience with P2P and federated clusters:
- Load Balancing by Default: This feature is now enabled by default (disable it with `LOCALAI_RANDOM_WORKER` if needed).
- Target Specific Workers: Directly target workers in federated mode using `LOCALAI_TARGET_WORKER`.
💪 Run Multiple P2P Clusters in the Same Network
You can now run multiple clusters within the same network by specifying a network ID via the CLI. This allows you to logically separate clusters while using the same shared token. Just set `LOCALAI_P2P_NETWORK_ID` to a UUID that matches across instances.
Please note, while this offers segmentation, it’s not fully secure—anyone with the network token can view available services within the network.
🧪 Deprecation Notice: `gpt4all.cpp` and `petals` Backends
As we continue to evolve, we are officially deprecating the `gpt4all.cpp` and `petals` backends. The newer `llama.cpp` offers a superior set of features and better performance, making it the preferred choice moving forward.
From this release onward, `gpt4all` models in `ggml` format are no longer compatible. Additionally, the `petals` backend has been deprecated as well. LocalAI’s new P2P capabilities now offer a comprehensive replacement for these features.
What's Changed
Breaking Changes 🛠
Bug fixes 🐛
- fix(ui): do not show duplicate entries if not installed by gallery by @mudler in #3107
- fix: be consistent in downloading files, check for scanner errors by @mudler in #3108
- fix: ensure correct version of torch is always installed based on BUI… by @cryptk in #2890
- fix(python): move accelerate and GPU-specific libs to build-type by @mudler in #3194
- fix(apple): disable BUILD_TYPE metal on fallback by @mudler in #3199
- fix(vall-e-x): pin hipblas deps by @mudler in #3201
- fix(diffusers): use nightly rocm for hipblas builds by @mudler in #3202
- fix(explorer): reset counter when network is active by @mudler in #3213
- fix(p2p): allocate tunnels only when needed by @mudler in #3259
- fix(gallery): be consistent and disable UI routes as well by @mudler in #3262
- fix(parler-tts): bump and require after build type deps by @mudler in #3272
- fix: add llvm to extra images by @mudler in #3321
- fix(p2p): re-use p2p host when running federated mode by @mudler in #3341
- fix(ci): pin to llvmlite 0.43 by @mudler in #3342
- fix(p2p): avoid starting the node twice by @mudler in #3349
- fix(chat): re-generated uuid, created, and text on each request by @mudler in #3359
Exciting New Features 🎉
- feat(guesser): add gemma2 by @sozercan in #3118
- feat(venv): shared env by @mudler in #3195
- feat(openai): add `json_schema` format type and strict mode by @mudler in #3193
- feat(p2p): allow to run multiple clusters in the same p2p network by @mudler in #3128
- feat(p2p): add network explorer and community pools by @mudler in #3125
- feat(explorer): relax token deletion with error threshold by @mudler in #3211
- feat(diffusers): support flux models by @mudler in #3129
- feat(explorer): make possible to run sync in a separate process by @mudler in #3224
- feat(federated): allow to pickup a specific worker, improve loadbalancing by @mudler in #3243
- feat: Initial Version of vscode DevContainer by @dave-gray101 in #3217
- feat(explorer): visual improvements by @mudler in #3247
- feat(gallery): lazy load images by @mudler in #3246
- chore(explorer): add join instructions by @mudler in #3255
- chore: allow to disable gallery endpoints, improve p2p connection handling by @mudler in #3256
- chore(ux): add animated header with anime.js in p2p sections by @mudler in #3271
- chore(p2p): make commands easier to copy-paste by @mudler in #3273
- chore(ux): allow to create and drag dots in the animation by @mudler in #3287
- feat(federation): do not allocate local services for load balancing by @mudler in #3337
- feat(p2p): allow to set intervals b...
v2.19.4
What's Changed
🧠 Models
- chore(model-gallery): ⬆️ update checksum by @localai-bot in #3040
- chore(model-gallery): ⬆️ update checksum by @localai-bot in #3043
- models(gallery): add magnum-32b-v1 by @mudler in #3044
- models(gallery): add lumimaid-v0.2-70b-i1 by @mudler in #3045
- models(gallery): add sekhmet_aleph-l3.1-8b-v0.1-i1 by @mudler in #3046
- models(gallery): add l3.1-8b-llamoutcast-i1 by @mudler in #3047
- models(gallery): add l3.1-8b-celeste-v1.5 by @mudler in #3080
- models(gallery): add llama-guard-3-8b by @mudler in #3082
- models(gallery): add meta-llama-3-instruct-8.9b-brainstorm-5x-form-11 by @mudler in #3083
- models(gallery): add sunfall-simpo by @mudler in #3088
- models(gallery): add genius-llama3.1-i1 by @mudler in #3089
- models(gallery): add seeker-9b by @mudler in #3090
- models(gallery): add llama3.1-chinese-chat by @mudler in #3091
- models(gallery): add gemmasutra-pro-27b-v1 by @mudler in #3092
- models(gallery): add leetwizard by @mudler in #3093
- models(gallery): add tarnished-9b-i1 by @mudler in #3096
- models(gallery): add meta-llama-3-instruct-12.2b-brainstorm-20x-form-8 by @mudler in #3097
- models(gallery): add loki-base-i1 by @mudler in #3098
- models(gallery): add tifa by @mudler in #3099
👒 Dependencies
- chore(deps): Bump langchain from 0.2.10 to 0.2.11 in /examples/langchain/langchainpy-localai-example by @dependabot in #3053
- chore(deps): Bump openai from 1.37.0 to 1.37.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #3051
- chore(deps): Bump setuptools from 70.3.0 to 72.1.0 in /backend/python/autogptq by @dependabot in #3048
- chore(deps): Bump setuptools from 70.3.0 to 72.1.0 in /backend/python/vllm by @dependabot in #3061
- chore(deps): Bump chromadb from 0.5.4 to 0.5.5 in /examples/langchain-chroma by @dependabot in #3060
- chore(deps): Bump setuptools from 70.3.0 to 72.1.0 in /backend/python/parler-tts by @dependabot in #3062
- chore(deps): Bump setuptools from 70.3.0 to 72.1.0 in /backend/python/rerankers by @dependabot in #3067
- chore(deps): Bump setuptools from 69.5.1 to 72.1.0 in /backend/python/transformers-musicgen by @dependabot in #3066
- chore(deps): Bump setuptools from 70.3.0 to 72.1.0 in /backend/python/coqui by @dependabot in #3068
- chore(deps): Bump setuptools from 70.3.0 to 72.1.0 in /backend/python/vall-e-x by @dependabot in #3069
- chore(deps): Bump setuptools from 70.3.0 to 72.1.0 in /backend/python/petals by @dependabot in #3070
- chore(deps): Bump setuptools from 69.5.1 to 72.1.0 in /backend/python/transformers by @dependabot in #3071
- chore(deps): Bump streamlit from 1.36.0 to 1.37.0 in /examples/streamlit-bot by @dependabot in #3072
Other Changes
- docs: ⬆️ update docs version mudler/LocalAI by @localai-bot in #3039
- fix: install.sh bash specific equality check by @dave-gray101 in #3038
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #3075
- Revert "chore(deps): Bump setuptools from 69.5.1 to 72.1.0 in /backend/python/transformers-musicgen" by @mudler in #3077
- Revert "chore(deps): Bump setuptools from 69.5.1 to 72.1.0 in /backend/python/transformers" by @mudler in #3078
- Revert "chore(deps): Bump setuptools from 70.3.0 to 72.1.0 in /backend/python/vllm" by @mudler in #3079
- fix(llama-cpp): do not compress with UPX by @mudler in #3084
- fix(ci): update openvoice checkpoints URLs by @mudler in #3085
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #3086
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #3102
Full Changelog: v2.19.3...v2.19.4
v2.19.3
What's Changed
Bug fixes 🐛
- fix(gallery): do not attempt to delete duplicate files by @mudler in #3031
- fix(gallery): do clear out errors once displayed by @mudler in #3033
Exciting New Features 🎉
🧠 Models
- models(gallery): add llama3.1-claude by @mudler in #3005
- models(gallery): add darkidol llama3.1 by @mudler in #3008
- models(gallery): add gemmoy by @mudler in #3009
- chore: add function calling template for llama 3.1 models by @mudler in #3010
- chore: models(gallery): ⬆️ update checksum by @localai-bot in #3013
- models(gallery): add mistral-nemo by @mudler in #3019
- models(gallery): add llama3.1-8b-fireplace2 by @mudler in #3018
- models(gallery): add lumimaid-v0.2-12b by @mudler in #3020
- models(gallery): add darkidol-llama-3.1-8b-instruct-1.1-uncensored-iq… by @mudler in #3021
- models(gallery): add meta-llama-3.1-8b-instruct-abliterated by @mudler in #3022
- models(gallery): add llama-3.1-70b-japanese-instruct-2407 by @mudler in #3023
- models(gallery): add llama-3.1-8b-instruct-fei-v1-uncensored by @mudler in #3024
- models(gallery): add openbuddy-llama3.1-8b-v22.1-131k by @mudler in #3025
- models(gallery): add lumimaid-8b by @mudler in #3026
- models(gallery): add llama3 with enforced functioncall with grammars by @mudler in #3027
- chore(model-gallery): ⬆️ update checksum by @localai-bot in #3036
👒 Dependencies
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #3003
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #3012
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #3016
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #3030
- chore: ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #3029
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #3034
Other Changes
- docs: ⬆️ update docs version mudler/LocalAI by @localai-bot in #3002
- refactor: break down json grammar parser in different files by @mudler in #3004
- fix: PR title tag for checksum checker script workflow by @dave-gray101 in #3014
Full Changelog: v2.19.2...v2.19.3
v2.19.2
This release is a patch release to fix well-known issues from 2.19.x
What's Changed
Bug fixes 🐛
- fix: pin setuptools 69.5.1 by @fakezeta in #2949
- fix(cuda): downgrade to 12.0 to increase compatibility range by @mudler in #2994
- fix(llama.cpp): do not set anymore lora_base by @mudler in #2999
Exciting New Features 🎉
- ci(Makefile): reduce binary size by compressing by @mudler in #2947
- feat(p2p): warn the user to start with --p2p by @mudler in #2993
🧠 Models
- models(gallery): add tulu 8b and 70b by @mudler in #2931
- models(gallery): add suzume-orpo by @mudler in #2932
- models(gallery): add archangel_sft_pythia2-8b by @mudler in #2933
- models(gallery): add celestev1.2 by @mudler in #2937
- models(gallery): add calme-2.3-phi3-4b by @mudler in #2939
- models(gallery): add calme-2.8-qwen2-7b by @mudler in #2940
- models(gallery): add StellarDong-72b by @mudler in #2941
- models(gallery): add calme-2.4-llama3-70b by @mudler in #2942
- models(gallery): add llama3.1 70b and 8b by @mudler in #3000
📖 Documentation and examples
- docs: add federation by @mudler in #2929
- docs: ⬆️ update docs version mudler/LocalAI by @localai-bot in #2935
👒 Dependencies
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2936
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2943
- chore(deps): Bump grpcio from 1.64.1 to 1.65.1 in /backend/python/openvoice by @dependabot in #2956
- chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/sentencetransformers by @dependabot in #2955
- chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/bark by @dependabot in #2951
- chore(deps): Bump docs/themes/hugo-theme-relearn from `1b2e139` to `7aec99b` by @dependabot in #2952
- chore(deps): Bump langchain from 0.2.8 to 0.2.10 in /examples/langchain/langchainpy-localai-example by @dependabot in #2959
- chore(deps): Bump numpy from 1.26.4 to 2.0.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #2958
- chore(deps): Bump sqlalchemy from 2.0.30 to 2.0.31 in /examples/langchain/langchainpy-localai-example by @dependabot in #2957
- chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/vllm by @dependabot in #2964
- chore(deps): Bump llama-index from 0.10.55 to 0.10.56 in /examples/chainlit by @dependabot in #2966
- chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/common/template by @dependabot in #2963
- chore(deps): Bump weaviate-client from 4.6.5 to 4.6.7 in /examples/chainlit by @dependabot in #2965
- chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/transformers by @dependabot in #2970
- chore(deps): Bump openai from 1.35.13 to 1.37.0 in /examples/functions by @dependabot in #2973
- chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/diffusers by @dependabot in #2969
- chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/exllama2 by @dependabot in #2971
- chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/rerankers by @dependabot in #2974
- chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/coqui by @dependabot in #2980
- chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/parler-tts by @dependabot in #2982
- chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/vall-e-x by @dependabot in #2981
- chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/transformers-musicgen by @dependabot in #2990
- chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/autogptq by @dependabot in #2984
- chore(deps): Bump llama-index from 0.10.55 to 0.10.56 in /examples/langchain-chroma by @dependabot in #2986
- chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/mamba by @dependabot in #2989
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2992
- chore(deps): Bump langchain-community from 0.2.7 to 0.2.9 in /examples/langchain/langchainpy-localai-example by @dependabot in #2960
- chore(deps): Bump openai from 1.35.13 to 1.37.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #2961
- chore(deps): Bump langchain from 0.2.8 to 0.2.10 in /examples/functions by @dependabot in #2975
- chore(deps): Bump openai from 1.35.13 to 1.37.0 in /examples/langchain-chroma by @dependabot in #2988
- chore(deps): Bump langchain from 0.2.8 to 0.2.10 in /examples/langchain-chroma by @dependabot in #2987
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2995
Other Changes
Full Changelog: v2.19.1...v2.19.2
v2.19.1
LocalAI 2.19.1 is out! 📣
TLDR; Summary spotlight
- 🖧 Federated Instances via P2P: LocalAI now supports federated instances with P2P, offering both load-balanced and non-load-balanced options.
- 🎛️ P2P Dashboard: A new dashboard to guide and assist in setting up P2P instances with auto-discovery using shared tokens.
- 🔊 TTS Integration: Text-to-Speech (TTS) is now included in the binary releases.
- 🛠️ Enhanced Installer: The installer script now supports setting up federated instances.
- 📥 Model Pulling: Models can now be pulled directly via URL.
- 🖼️ WebUI Enhancements: Visual improvements and cleanups to the WebUI and model lists.
- 🧠 llama-cpp Backend: The llama-cpp (grpc) backend now supports embeddings (https://localai.io/features/embeddings/#llamacpp-embeddings)
- ⚙️ Tool Support: Small enhancements to tools with disabled grammars.
🖧 LocalAI Federation and AI swarms
LocalAI is revolutionizing the future of distributed AI workloads by making it simpler and more accessible. No more complex setups, Docker or Kubernetes configurations – LocalAI allows you to create your own AI cluster with minimal friction. By auto-discovering and sharing work or weights of the LLM model across your existing devices, LocalAI aims to scale both horizontally and vertically with ease.
How it works?
Starting LocalAI with `--p2p` generates a shared token for connecting multiple instances, and that's all you need to create AI clusters, eliminating the need for intricate network setups. Simply navigate to the "Swarm" section in the WebUI and follow the on-screen instructions.
For fully shared instances, start LocalAI with `--p2p --federated` and follow the Swarm section's guidance. This feature, while still experimental, offers a tech-preview-quality experience.
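A minimal sketch of the workflow described above (the flags are from these notes; the binary name and the way the token is supplied to other instances are assumptions):

```bash
# Hedged sketch: start a first instance; it prints a shared token on startup
local-ai run --p2p

# On other machines, join the same swarm as a fully shared, federated instance
# (how the token is passed, e.g. which environment variable, may differ)
TOKEN=<shared-token> local-ai run --p2p --federated
```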
Federated LocalAI
Launch multiple LocalAI instances and cluster them together to share requests across the cluster. The "Swarm" tab in the WebUI provides one-liner instructions on connecting various LocalAI instances using a shared token. Instances will auto-discover each other, even across different networks.
Check out a demonstration video: Watch now
LocalAI P2P Workers
Distribute weights across nodes by starting multiple LocalAI workers, currently available only on the llama.cpp backend, with plans to expand to other backends soon.
Check out a demonstration video: Watch now
What's Changed
Bug fixes 🐛
- fix: make sure the GNUMake jobserver is passed to cmake for the llama.cpp build by @cryptk in #2697
- Using exec when starting a backend instead of spawning a new process by @a17t in #2720
- fix(cuda): downgrade default version from 12.5 to 12.4 by @mudler in #2707
- fix: Lora loading by @vaaale in #2893
- fix: short-circuit when nodes aren't detected by @mudler in #2909
- fix: do not list txt files as potential models by @mudler in #2910
🖧 P2P area
- feat(p2p): Federation and AI swarms by @mudler in #2723
- feat(p2p): allow to disable DHT and use only LAN by @mudler in #2751
Exciting New Features 🎉
- Allows to remove a backend from the list by @mauromorales in #2721
- ci(Makefile): adds tts in binary releases by @mudler in #2695
- feat: HF `/scan` endpoint by @dave-gray101 in #2566
- feat(model-list): be consistent, skip known files from listing by @mudler in #2760
- feat(models): pull models from urls by @mudler in #2750
- feat(webui): show also models without a config in the welcome page by @mudler in #2772
- feat(install.sh): support federated install by @mudler in #2752
- feat(llama.cpp): support embeddings endpoints by @mudler in #2871
- feat(functions): parse broken JSON when we parse the raw results, use dynamic rules for grammar keys by @mudler in #2912
- feat(federation): add load balanced option by @mudler in #2915
🧠 Models
- models(gallery): ⬆️ update checksum by @localai-bot in #2701
- models(gallery): add l3-8b-everything-cot by @mudler in #2705
- models(gallery): add hercules-5.0-qwen2-7b by @mudler in #2708
- models(gallery): add llama3-8b-darkidol-2.2-uncensored-1048k-iq-imatrix by @mudler in #2710
- models(gallery): add llama-3-llamilitary by @mudler in #2711
- models(gallery): add tess-v2.5-gemma-2-27b-alpha by @mudler in #2712
- models(gallery): add arcee-agent by @mudler in #2713
- models(gallery): add gemma2-daybreak by @mudler in #2714
- models(gallery): add L3-Stheno-Maid-Blackroot-Grand-HORROR-16B-GGUF by @mudler in #2715
- models(gallery): add qwen2-7b-instruct-v0.8 by @mudler in #2717
- models(gallery): add internlm2_5-7b-chat-1m by @mudler in #2719
- models(gallery): add gemma-2-9b-it-sppo-iter3 by @mudler in #2722
- models(gallery): add llama-3_8b_unaligned_alpha by @mudler in #2727
- models(gallery): add l3-8b-lunaris-v1 by @mudler in #2729
- models(gallery): add llama-3_8b_unaligned_alpha_rp_soup-i1 by @mudler in #2734
- models(gallery): add hathor_respawn-l3-8b-v0.8 by @mudler in #2738
- models(gallery): add llama3-8b-instruct-replete-adapted by @mudler in #2739
- models(gallery): add llama-3-perky-pat-instruct-8b by @mudler in #2740
- models(gallery): add l3-uncen-merger-omelette-rp-v0.2-8b by @mudler in #2741
- models(gallery): add nymph_8b-i1 by @mudler in #2742
- models(gallery): add smegmma-9b-v1 by @mudler in #2743
- models(gallery): add hathor_tahsin-l3-8b-v0.85 by @mudler in #2762
- models(gallery): add replete-coder-instruct-8b-merged by @mudler in #2782
- models(gallery): add arliai-llama-3-8b-formax-v1.0 by @mudler in #2783
- models(gallery): add smegmma-deluxe-9b-v1 by @mudler in #2784
- models(gallery): add l3-ms-astoria-8b by @mudler in #2785
- models(gallery): add halomaidrp-v1.33-15b-l3-i1 by @mudler in #2786
- models(gallery): add llama-3-patronus-lynx-70b-instruct by @mudler in #2788
- models(gallery): add llamax3 by @mudler in #2849
- models(gallery): add arliai-llama-3-8b-dolfin-v0.5 by @mudler in #2852
- models(gallery): add tiger-gemma-9b-v1-i1 by @mudler in #2853
- feat: models(gallery): add deepseek-v2-lite by @mudler in #2658
- models(gallery): ⬆️ update checksum by @localai-bot in #2860
- models(gallery): add phi-3.1-mini-4k-instruct by @mudler in #2863
- models(gallery): ⬆️ update checksum by @localai-bot in #2887
- models(gallery): add ezo model series (llama3, gemma) by @mudler in #2891
- models(gallery): add l3-8b-niitama-v1 by @mudler in #2895
- models(gallery): add mathstral-7b-v0.1-imat by @mudler in #2901
- models(gallery): add MythicalMaid/EtherealMaid 15b by @mudler in #2902
- models(gallery): add flammenai/Mahou-1.3d-mistral-7B by @mudler in #2903
- models(gallery): add big-tiger-gemma-27b-v1 by @mudler in #2918
- models(gallery): add phillama-3.8b-v0.1 by @mudler in #2920
- models(gallery): add qwen2-wukong-7b by @mudler in #2921
- models(gallery): add einstein-v4-7b by @mudler in #2922
- models(gallery): add gemma-2b-translation-v0.150 by @mudler in #2923
- models(gallery)...