Releases: mudler/LocalAI
v2.22.1
What's Changed
Bug fixes 🐛
- fix(vllm): images and videos are base64 by default by @mudler in #3867
- fix(dependencies): pin pytorch version by @mudler in #3872
- fix(dependencies): move deps that brings pytorch by @mudler in #3873
- fix(vllm): do not set videos if we don't have any by @mudler in #3885
Exciting New Features 🎉
- feat(templates): extract text from multimodal requests by @mudler in #3866
- feat(templates): add sprig to multimodal templates by @mudler in #3868
🧠 Models
- models(gallery): add llama-3_8b_unaligned_beta by @mudler in #3818
- models(gallery): add llama3.1-flammades-70b by @mudler in #3819
- models(gallery): add llama3.1-gutenberg-doppel-70b by @mudler in #3820
- models(gallery): add llama-3.1-8b-arliai-formax-v1.0-iq-arm-imatrix by @mudler in #3821
- models(gallery): add supernova-medius by @mudler in #3822
- models(gallery): add hermes-3-llama-3.1-70b-lorablated by @mudler in #3823
- models(gallery): add hermes-3-llama-3.1-8b-lorablated by @mudler in #3824
- models(gallery): add eva-qwen2.5-14b-v0.1-i1 by @mudler in #3825
- models(gallery): add cursorcore-qw2.5-7b-i1 by @mudler in #3826
- models(gallery): add cursorcore-qw2.5-1.5b-lc-i1 by @mudler in #3827
- models(gallery): add cursorcore-ds-6.7b-i1 by @mudler in #3828
- models(gallery): add cursorcore-yi-9b by @mudler in #3829
- models(gallery): add edgerunner-command-nested-i1 by @mudler in #3830
- models(gallery): add llama-3.2-chibi-3b by @mudler in #3843
- models(gallery): add llama-3.2-3b-reasoning-time by @mudler in #3844
- models(gallery): add ml-ms-etheris-123b by @mudler in #3845
- models(gallery): add doctoraifinetune-3.1-8b-i1 by @mudler in #3846
- models(gallery): add astral-fusion-neural-happy-l3.1-8b by @mudler in #3848
- models(gallery): add tsunami-0.5x-7b-instruct-i1 by @mudler in #3849
- models(gallery): add mahou-1.5-llama3.1-70b-i1 by @mudler in #3850
- models(gallery): add llama-3.1-nemotron-70b-instruct-hf by @mudler in #3854
- models(gallery): add qevacot-7b-v2 by @mudler in #3855
- models(gallery): add l3.1-etherealrainbow-v1.0-rc1-8b by @mudler in #3856
- models(gallery): add phi-3.5-mini-titanfusion-0.2 by @mudler in #3857
- models(gallery): add mn-lulanum-12b-fix-i1 by @mudler in #3859
- models(gallery): add apollo2-9b by @mudler in #3860
- models(gallery): add theia-llama-3.1-8b-v1 by @mudler in #3861
- models(gallery): add tor-8b by @mudler in #3862
- models(gallery): add darkens-8b by @mudler in #3863
- models(gallery): add baldur-8b by @mudler in #3864
- models(gallery): add meissa-qwen2.5-7b-instruct by @mudler in #3865
- models(gallery): add phi-3 vision by @mudler in #3890
👒 Dependencies
- chore(deps): Bump docs/themes/hugo-theme-relearn from `d5a0ee0` to `e1a1f01` by @dependabot in #3798
- chore(deps): Bump mxschmitt/action-tmate from 3.18 to 3.19 by @dependabot in #3799
- chore(deps): Bump sentence-transformers from 3.1.1 to 3.2.0 in /backend/python/sentencetransformers by @dependabot in #3801
- chore(deps): Bump langchain from 0.3.2 to 0.3.3 in /examples/langchain/langchainpy-localai-example by @dependabot in #3803
- chore(deps): Bump llama-index from 0.11.16 to 0.11.17 in /examples/langchain-chroma by @dependabot in #3804
- chore(deps): Bump python from 3.12-bullseye to 3.13-bullseye in /examples/langchain by @dependabot in #3805
- chore(deps): Bump openai from 1.51.1 to 1.51.2 in /examples/functions by @dependabot in #3806
- chore(deps): Bump llama-index from 0.11.16 to 0.11.17 in /examples/chainlit by @dependabot in #3807
- chore(deps): Bump langchain from 0.3.1 to 0.3.3 in /examples/langchain-chroma by @dependabot in #3809
- chore(deps): Bump openai from 1.51.1 to 1.51.2 in /examples/langchain/langchainpy-localai-example by @dependabot in #3808
- chore(deps): Bump yarl from 1.13.1 to 1.15.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #3816
- chore(deps): Bump chromadb from 0.5.11 to 0.5.13 in /examples/langchain-chroma by @dependabot in #3811
- chore(deps): Bump langchain from 0.3.2 to 0.3.3 in /examples/functions by @dependabot in #3802
- chore(deps): Bump debugpy from 1.8.6 to 1.8.7 in /examples/langchain/langchainpy-localai-example by @dependabot in #3814
- chore(deps): Bump aiohttp from 3.10.9 to 3.10.10 in /examples/langchain/langchainpy-localai-example by @dependabot in #3812
- chore(deps): Bump openai from 1.51.1 to 1.51.2 in /examples/langchain-chroma by @dependabot in #3810
- chore(deps): Bump charset-normalizer from 3.3.2 to 3.4.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3834
- chore(deps): Bump langchain-community from 0.3.1 to 0.3.2 in /examples/langchain/langchainpy-localai-example by @dependabot in #3831
- chore(deps): Bump yarl from 1.15.1 to 1.15.2 in /examples/langchain/langchainpy-localai-example by @dependabot in #3832
- chore(deps): Bump numpy from 2.1.1 to 2.1.2 in /examples/langchain/langchainpy-localai-example by @dependabot in #3833
- chore(deps): Bump docs/themes/hugo-theme-relearn from `e1a1f01` to `007cc20` by @dependabot in #3835
- chore(deps): Bump gradio from 3.48.0 to 5.0.0 in /backend/python/openvoice in the pip group by @dependabot in #3880
- chore(deps): bump llama-cpp to cda0e4b648dde8fac162b3430b14a99597d3d74f by @mudler in #3884
Other Changes
- docs: ⬆️ update docs version mudler/LocalAI by @localai-bot in #3796
- chore: dependabot ignore generated grpc go package by @dave-gray101 in #3795
- chore: ⬆️ Update ggerganov/llama.cpp to `edc265661cd707327297b6ec4d83423c43cb50a5` by @localai-bot in #3797
- chore: ⬆️ Update ggerganov/llama.cpp to `d4c19c0f5cdb1e512573e8c86c79e8d0238c73c4` by @localai-bot in #3817
- chore: ⬆️ Update ggerganov/llama.cpp to `a89f75e1b7b90cb2d4d4c52ca53ef9e9b466aa45` by @localai-bot in #3837
- chore: ⬆️ Update ggerganov/whisper.cpp to `06a1da9daff94c1bf1b1d38950628264fe443f76` by @localai-bot in #3836
- Update integrations.md with LLPhant by @f-lombardo in #3838
- fix(llama.cpp): consider also native builds by @mudler in #3839
- chore: ⬆️ Update ggerganov/whisper.cpp to `b6049060dd2341b7816d2bce7dc7451c1665828e` by @localai-bot in #3842
- chore: ⬆️ Update ggerganov/llama.cpp to `755a9b2bf00fbae988e03a47e852b66eaddd113a` by @localai-bot in #3841
- chore(deps): bump grpcio to 1.67.0 by @mudler in #3851
- chore: ⬆️ Update ggerganov/llama.cpp to `9e041024481f6b249ab8918e18b9477f873b5a5e` by @localai-bot in #3853
- chore: ⬆️ Update ggerganov/whisper.cpp to `d3f7137cc9befa6d74dc4085de2b664b97b7c8bb` by @localai-bot in #3852
- fix(mamba): pin torch version by @mudler in #3871
- chore: ⬆️ Update ggerganov/llama.cpp to `99bd4ac28c32cd17c0e337ff5601393b033dc5fc` by @localai-bot in #3869
- chore: ⬆️ Update ggerganov/whisper.cpp to `a5abfe6a90495f7bf19fe70d016ecc255e97359c` by @localai-bot in #3870
- chore(deps): pin packaging by @mudler ...
v2.22.0
LocalAI v2.22.0 is out 🥳
💡 Highlights
- Image-to-Text and Video-to-Text Support: The vLLM backend now supports both image-to-text and video-to-text processing.
- Enhanced Multimodal Support: Template placeholders are now available, offering more flexibility in multimodal applications.
- Model Management Made Easy: List all your loaded models directly via the `/system` endpoint (see the example after this list).
- Various bugfixes and improvements: Fixed issues with dangling processes to ensure proper resource management and resolved channel closure issues in the base gRPC server.
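For example, you can query the new endpoint with a plain GET request (a minimal sketch: the endpoint path is from these notes, while the port assumes LocalAI's default setup and the exact response shape may vary between versions):

```bash
# List system information, including the currently loaded models
# (default port assumed; inspect the JSON response for the model list)
curl http://localhost:8080/system
```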
🖼️ Multimodal vLLM
To use multimodal models with vLLM, simply specify the model in the YAML config file. Note, however, that models can differ in whether they support multiple images or only a single image, and in how they internally process image placeholders.
Some models and libraries express image, video, or audio placeholders differently: for example, the llama.cpp backend expects images within an `[img-ID]` tag, while other backends/models (e.g. vLLM) use a different notation (`<|image_|>`).
To override the defaults, it is now possible to set the following in the model configuration:
```yaml
template:
  video: "<|video_{{.ID}}|> {{.Text}}"
  image: "<|image_{{.ID}}|> {{.Text}}"
  audio: "<|audio_{{.ID}}|> {{.Text}}"
```
📹 Video and Audio understanding
Some libraries might support both video and audio. Currently only vLLM supports video understanding, which can be used through the API by "extending" the OpenAI API with `audio` and `video` content types alongside images:
```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "What'\''s in this video?"
          },
          {
            "type": "video_url",
            "video_url": {
              "url": "https://video-image-url"
            }
          }
        ]
      }
    ],
    "max_tokens": 300
  }'
```
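By analogy, an audio request would presumably use an `audio_url` content part. The sketch below is a guess at the shape: the `audio_url` field name simply mirrors the `video_url` pattern above and is an assumption, not something these notes confirm:

```bash
# Hedged sketch: "audio_url" mirrors the "video_url" pattern above (assumption)
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "What is said in this audio clip?" },
          { "type": "audio_url", "audio_url": { "url": "https://example.com/clip.wav" } }
        ]
      }
    ],
    "max_tokens": 300
  }'
```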
🧑🏭 Work in progress
- Realtime API is a work in progress, tracked in #3714. Give it a thumbs up if you want to see it supported in LocalAI!
What's Changed
Bug fixes 🐛
- chore: simplify model loading by @mudler in #3715
- fix(initializer): correctly reap dangling processes by @mudler in #3717
- fix(base-grpc): close channel in base grpc server by @mudler in #3734
- fix(vllm): bump cmake - vllm requires it by @mudler in #3744
- fix(llama-cpp): consistently select fallback by @mudler in #3789
- fix(welcome): do not list model twice if we have a config by @mudler in #3790
- fix: listmodelservice / welcome endpoint use LOOSE_ONLY by @dave-gray101 in #3791
Exciting New Features 🎉
- feat(api): list loaded models in `/system` by @mudler in #3661
- feat: Add Get Token Metrics to GRPC server by @siddimore in #3687
- refactor: ListModels Filtering Upgrade by @dave-gray101 in #2773
- feat: track internally started models by ID by @mudler in #3693
- feat: tokenization endpoint by @shraddhazpy in #3710
- feat(multimodal): allow to template placeholders by @mudler in #3728
- feat(vllm): add support for image-to-text and video-to-text by @mudler in #3729
- feat(shutdown): allow force shutdown of backends by @mudler in #3733
- feat(transformers): Use downloaded model for Transformers backend if it already exists. by @joshbtn in #3777
- fix: roll out bluemonday Sanitize more widely by @dave-gray101 in #3794
🧠 Models
- models(gallery): add llama-3.2 3B and 1B by @mudler in #3671
- chore(model-gallery): ⬆️ update checksum by @localai-bot in #3675
- models(gallery): add magnusintellectus-12b-v1-i1 by @mudler in #3678
- models(gallery): add bigqwen2.5-52b-instruct by @mudler in #3679
- feat(api): add correlationID to Track Chat requests by @siddimore in #3668
- models(gallery): add replete-llm-v2.5-qwen-14b by @mudler in #3688
- models(gallery): add replete-llm-v2.5-qwen-7b by @mudler in #3689
- models(gallery): add calme-2.2-qwen2.5-72b-i1 by @mudler in #3691
- models(gallery): add salamandra-7b-instruct by @mudler in #3726
- models(gallery): add mn-backyardai-party-12b-v1-iq-arm-imatrix by @mudler in #3740
- models(gallery): add t.e-8.1-iq-imatrix-request by @mudler in #3741
- models(gallery): add violet_twilight-v0.2-iq-imatrix by @mudler in #3742
- models(gallery): add gemma-2-9b-it-abliterated by @mudler in #3743
- models(gallery): add moe-girl-1ba-7bt-i1 by @mudler in #3766
- models(gallery): add archfunctions models by @mudler in #3767
- models(gallery): add versatillama-llama-3.2-3b-instruct-abliterated by @mudler in #3771
- models(gallery): add llama3.2-3b-enigma by @mudler in #3772
- models(gallery): add llama3.2-3b-esper2 by @mudler in #3773
- models(gallery): add llama-3.1-swallow-70b-v0.1-i1 by @mudler in #3774
- models(gallery): add rombos-llm-v2.5.1-qwen-3b by @mudler in #3778
- models(gallery): add qwen2.5-7b-ins-v3 by @mudler in #3779
- models(gallery): add dans-personalityengine-v1.0.0-8b by @mudler in #3780
- models(gallery): add llama-3.2-3b-agent007 by @mudler in #3781
- models(gallery): add nihappy-l3.1-8b-v0.09 by @mudler in #3782
- models(gallery): add llama-3.2-3b-agent007-coder by @mudler in #3783
- models(gallery): add fireball-meta-llama-3.2-8b-instruct-agent-003-128k-code-dpo by @mudler in #3784
- models(gallery): add gemma-2-ataraxy-v3i-9b by @mudler in #3785
📖 Documentation and examples
👒 Dependencies
- chore: ⬆️ Update ggerganov/llama.cpp to `ea9c32be71b91b42ecc538bd902e93cbb5fb36cb` by @localai-bot in #3667
- chore: ⬆️ Update ggerganov/whisper.cpp to `69339af2d104802f3f201fd419163defba52890e` by @localai-bot in #3666
- chore: ⬆️ Update ggerganov/llama.cpp to `95bc82fbc0df6d48cf66c857a4dda3d044f45ca2` by @localai-bot in #3674
- chore: ⬆️ Update ggerganov/llama.cpp to `b5de3b74a595cbfefab7eeb5a567425c6a9690cf` by @localai-bot in #3681
- chore: ⬆️ Update ggerganov/whisper.cpp to `8feb375fbdf0277ad36958c218c6bf48fa0ba75a` by @localai-bot in #3680
- chore: ⬆️ Update ggerganov/llama.cpp to `c919d5db39c8a7fcb64737f008e4b105ee0acd20` by @localai-bot in #3686
- chore(deps): bump grpcio to 1.66.2 by @mudler in #3690
- chore(deps): Bump openai from 1.47.1 to 1.50.2 in /examples/langchain-chroma by @dependabot in #3697
- chore(deps): Bump chromadb from 0.5.7 to 0.5.11 in /examples/langchain-chroma by @dependabot in #3696
- chore(deps): Bump langchain from 0.3.0 to 0.3.1 in /examples/langchain-chroma by @dependabot in #3694
- chore: ⬆️ Update ggerganov/llama.cpp to `6f1d9d71f4c568778a7637ff6582e6f6ba5fb9d3` by @localai-bot in #3708
- chore(deps): Bump securego/gosec from 2.21.0 to 2.21.4 by @dependabot in #3698
- chore(deps): Bump openai from 1.47.1 to 1.50.2 in /examples/functions by @dependabot in #3699
- chore(deps): Bump langchain from 0.3.0 to 0.3.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #3704
- chore(deps): Bump greenlet from 3.1.0 to 3.1.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #3703
- chore(deps): Bump langchain from 0.3.0 to 0.3.1 in /examples/functions by @dependabot in #3700
- chore(deps): Bump langchain-community from 0.2.16 to 0.3.1 in /examples/langchain/langchainpy-localai-example by ...
v2.21.1
What's Changed
Bug fixes 🐛
👒 Dependencies
- chore(deps): Bump sentence-transformers from 3.1.0 to 3.1.1 in /backend/python/sentencetransformers by @dependabot in #3651
- chore(deps): Bump pydantic from 2.8.2 to 2.9.2 in /examples/langchain/langchainpy-localai-example by @dependabot in #3648
- chore(deps): Bump openai from 1.45.1 to 1.47.1 in /examples/functions by @dependabot in #3645
- chore: ⬆️ Update ggerganov/llama.cpp to `70392f1f81470607ba3afef04aa56c9f65587664` by @localai-bot in #3659
- chore(deps): Bump llama-index from 0.11.7 to 0.11.12 in /examples/langchain-chroma by @dependabot in #3639
- chore(deps): Bump openai from 1.45.1 to 1.47.1 in /examples/langchain-chroma by @dependabot in #3641
- chore(deps): Bump llama-index from 0.11.9 to 0.11.12 in /examples/chainlit by @dependabot in #3642
- chore: ⬆️ Update ggerganov/whisper.cpp to `0d2e2aed80109e8696791083bde3b58e190b7812` by @localai-bot in #3658
- chore(deps): Bump chromadb from 0.5.5 to 0.5.7 in /examples/langchain-chroma by @dependabot in #3640
Other Changes
- docs: ⬆️ update docs version mudler/LocalAI by @localai-bot in #3657
Full Changelog: v2.21.0...v2.21.1
v2.21.0
💡 Highlights!
LocalAI v2.21 release is out!
- Deprecation of the `exllama` backend
- AIO images now have `gpt-4o` instead of `gpt-4-vision-preview` for the Vision API
- vLLM backend now supports embeddings (see the example after this list)
- New endpoint to list system information (`/system`)
- `trust_remote_code` is now respected by the `sentencetransformers` backend
- Auto warm-up and load models on start
- `coqui` backend switched to the community-maintained fork
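Since embeddings are served through the OpenAI-compatible endpoint, a request against a vLLM-backed embedding model would presumably look like the sketch below (hedged: the model name is a placeholder and the request shape follows the OpenAI convention LocalAI implements):

```bash
# Hedged sketch: "my-vllm-embedding-model" is a placeholder model name
curl http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-vllm-embedding-model",
    "input": "A sentence to embed."
  }'
```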
What's Changed
Breaking Changes 🛠
- chore(exllama): drop exllama backend by @mudler in #3536
- chore(aio): rename gpt-4-vision-preview to gpt-4o by @mudler in #3597
Exciting New Features 🎉
- feat: elevenlabs `sound-generation` api by @dave-gray101 in #3355
- feat(vllm): add support for embeddings by @mudler in #3440
- feat: add endpoint to list system informations by @mudler in #3449
- feat: extract output with regexes from LLMs by @mudler in #3491
- feat: allow setting trust_remote_code for sentencetransformers backend by @Nyralei in #3552
- feat(api): allow to pass videos to backends by @mudler in #3601
- feat(api): allow to pass audios to backends by @mudler in #3603
- feat: auto load into memory on startup by @sozercan in #3627
- feat(coqui): switch to maintained community fork by @mudler in #3625
Bug fixes 🐛
- fix(p2p): correctly allow to pass extra args to llama.cpp by @mudler in #3368
- fix(model-loading): keep track of open GRPC Clients by @mudler in #3377
- fix(tts): check error before inspecting result by @mudler in #3415
- fix(shutdown): do not shutdown immediately busy backends by @mudler in #3543
- fix(parler-tts): fix install with sycl by @mudler in #3624
- fix(ci): fixup checksum scanning pipeline by @mudler in #3631
- fix(hipblas): do not push all variants to hipblas builds by @mudler in #3630
🧠 Models
- chore(model-gallery): add more quants for popular models by @mudler in #3365
- models(gallery): add phi-3.5 by @mudler in #3376
- models(gallery): add calme-2.1-phi3.5-4b-i1 by @mudler in #3383
- models(gallery): add magnum-v3-34b by @mudler in #3384
- models(gallery): add phi-3.5-vision by @mudler in #3421
- Revert "models(gallery): add phi-3.5-vision" by @mudler in #3422
- chore(model-gallery): ⬆️ update checksum by @localai-bot in #3425
- feat: Added Piper voice it-paola-medium by @fakezeta in #3434
- chore(model-gallery): ⬆️ update checksum by @localai-bot in #3442
- models(gallery): add hubble-4b-v1 by @mudler in #3444
- chore(model-gallery): ⬆️ update checksum by @localai-bot in #3446
- models(gallery): add yi-coder (and variants) by @mudler in #3482
- chore(model-gallery): ⬆️ update checksum by @localai-bot in #3486
- models(gallery): add reflection-llama-3.1-70b by @mudler in #3487
- models(gallery): add athena-codegemma-2-2b-it by @mudler in #3490
- models(gallery): add azure_dusk-v0.2-iq-imatrix by @mudler in #3538
- models(gallery): add mn-12b-lyra-v4-iq-imatrix by @mudler in #3539
- models(gallery): add datagemma models by @mudler in #3540
- models(gallery): add l3.1-8b-niitama-v1.1-iq-imatrix by @mudler in #3550
- models(gallery): add llama-3.1-8b-stheno-v3.4-iq-imatrix by @mudler in #3551
- fix: `gallery/index.yaml` comment spacing by @dave-gray101 in #3585
- models(gallery): add qwen2.5-14b-instruct by @mudler in #3607
- models(gallery): add qwen2.5-math-7b-instruct by @mudler in #3609
- models(gallery): add qwen2.5-14b_uncencored by @mudler in #3610
- models(gallery): add qwen2.5-coder-7b-instruct by @mudler in #3611
- models(gallery): add qwen2.5-math-72b-instruct by @mudler in #3612
- models(gallery): add qwen2.5-0.5b-instruct, qwen2.5-1.5b-instruct by @mudler in #3613
- models(gallery): add qwen2.5 32B, 72B, 32B Instruct by @mudler in #3614
- models(gallery): add llama-3.1-supernova-lite-reflection-v1.0-i1 by @mudler in #3615
- models(gallery): add llama-3.1-supernova-lite by @mudler in #3616
- models(gallery): add llama3.1-8b-shiningvaliant2 by @mudler in #3617
- models(gallery): add buddy2 by @mudler in #3618
- models(gallery): add llama-3.1-8b-arliai-rpmax-v1.1 by @mudler in #3619
- Fix NeuralDaredevil URL by @nyx4ris in #3621
- models(gallery): add nightygurps-14b-v1.1 by @mudler in #3633
- models(gallery): add gemma-2-9b-arliai-rpmax-v1.1 by @mudler in #3634
- models(gallery): add gemma-2-2b-arliai-rpmax-v1.1 by @mudler in #3635
- models(gallery): add acolyte-22b-i1 by @mudler in #3636
📖 Documentation and examples
- docs: ⬆️ update docs version mudler/LocalAI by @localai-bot in #3366
- chore(docs): add Vulkan images links by @mudler in #3620
👒 Dependencies
- chore: ⬆️ Update ggerganov/llama.cpp to `3ba780e2a8f0ffe13f571b27f0bbf2ca5a199efc` by @localai-bot in #3361
- chore(deps): Bump openai from 1.41.1 to 1.42.0 in /examples/functions by @dependabot in #3390
- chore(deps): Bump docs/themes/hugo-theme-relearn from `82a5e98` to `3a0ae52` by @dependabot in #3391
- chore(deps): Bump idna from 3.7 to 3.8 in /examples/langchain/langchainpy-localai-example by @dependabot in #3399
- chore(deps): Bump llama-index from 0.10.65 to 0.11.1 in /examples/chainlit by @dependabot in #3404
- chore(deps): Bump llama-index from 0.10.67.post1 to 0.11.1 in /examples/langchain-chroma by @dependabot in #3406
- chore(deps): Bump marshmallow from 3.21.3 to 3.22.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3400
- chore(deps): Bump openai from 1.40.5 to 1.42.0 in /examples/langchain-chroma by @dependabot in #3405
- chore(deps): Bump openai from 1.41.1 to 1.42.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3401
- chore(deps): update edgevpn to v0.28 by @mudler in #3412
- chore(deps): Bump langchain from 0.2.14 to 0.2.15 in /examples/functions by @dependabot in #3453
- chore(deps): Bump certifi from 2024.7.4 to 2024.8.30 in /examples/langchain/langchainpy-localai-example by @dependabot in #3457
- chore(deps): Bump yarl from 1.9.4 to 1.9.7 in /examples/langchain/langchainpy-localai-example by @dependabot in #3459
- chore(deps): Bump langchain-community from 0.2.12 to 0.2.15 in /examples/langchain/langchainpy-localai-example by @dependabot in #3461
- chore(deps): Bump llama-index from 0.11.1 to 0.11.4 in /examples/chainlit by @dependabot in #3462
- chore(deps): Bump llama-index from 0.11.1 to 0.11.4 in /examples/langchain-chroma by @dependabot in #3467
- chore(deps): Bump docs/themes/hugo-theme-relearn from `3a0ae52` to `550a6ee` by @dependabot in #3472
- chore(deps): Bump openai from 1.42.0 to 1.43.0 in /examples/functions by @dependabot in #3452
- chore(deps): Bump langchain from 0.2.14 to 0.2.15 in /examples/langchain/langchainpy-localai-example by @dependabot in #3460
- chore(deps): Bump openai from 1.42.0 to 1.43.0 in /examples/langchain-chroma by @dependabot in #3468
- chore(deps): Bump langchain from 0.2.14 to 0.2.15 in /examples/langchain-chroma by ...
v2.20.1
It's that time again—I’m excited (and honestly, a bit proud) to announce the release of LocalAI v2.20! This one’s a biggie, with some of the most requested features and enhancements, all designed to make your self-hosted AI journey even smoother and more powerful.
TL;DR
- 🌍 Explorer & Community: Explore global community pools at explorer.localai.io
- 👀 Demo instance available: Test out LocalAI at demo.localai.io
- 🤗 Integration: Hugging Face Local apps now include LocalAI
- 🐛 Bug Fixes: Diffusers and hipblas issues resolved
- 🎨 New Feature: FLUX-1 image generation support
- 🏎️ Strict Mode: Stay compliant with OpenAI’s latest API changes
- 💪 Multiple P2P Clusters: Run multiple clusters within the same network
- 🧪 Deprecation Notice: `gpt4all.cpp` and `petals` backends deprecated
🌍 Explorer and Global Community Pools
Now you can share your LocalAI instance with the global community or explore available instances by visiting explorer.localai.io. This decentralized network powers our demo instance, creating a truly collaborative AI experience.
How It Works
Using the Explorer, you can easily share or connect to clusters. For detailed instructions on creating new clusters or connecting to existing ones, check out our documentation.
👀 Demo Instance Now Available
Curious about what LocalAI can do? Dive right in with our live demo at demo.localai.io! Thanks to our generous sponsors, this instance is publicly available and configured via peer-to-peer (P2P) networks. If you'd like to connect, follow the instructions here.
🤗 Hugging Face Integration
I am excited to announce that LocalAI is now integrated within Hugging Face’s local apps! This means you can select LocalAI directly within Hugging Face to build and deploy models with the power and flexibility of our platform. Experience seamless integration with a single click!
This integration was made possible through this PR.
🎨 FLUX-1 Image Generation Support
FLUX-1 lands in LocalAI! With this update, LocalAI can now generate stunning images using FLUX-1, even in federated mode. Whether you're experimenting with new designs or creating production-quality visuals, FLUX-1 has you covered.
Try it out at demo.localai.io and see what LocalAI + FLUX-1 can do!
🐛 Diffusers and hipblas Fixes
Great news for AMD users! If you’ve encountered issues with the Diffusers backend or hipblas, those bugs have been resolved. We’ve transitioned to `uv` for managing Python dependencies, ensuring a smoother experience. For more details, check out Issue #1592.
🏎️ Strict Mode for API Compliance
To stay up to date with OpenAI’s latest changes, LocalAI now also supports Strict Mode (https://openai.com/index/introducing-structured-outputs-in-the-api/). This new feature ensures compatibility with the most recent API updates, enforcing stricter JSON outputs using BNF grammar rules.
To activate it, simply set `strict: true` in your API calls, even if it’s disabled in your configuration.
Key Notes:
- Setting `strict: true` enables grammar enforcement, even if disabled in your config.
- If `format_type` is set to `json_schema`, BNF grammars will be automatically generated from the schema.
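As an illustration, a schema-enforced request might look like the sketch below. This follows the OpenAI structured-outputs request shape; the exact field names LocalAI accepts, and where `strict` is placed, are assumptions based on that convention:

```bash
# Hedged sketch following the OpenAI structured-outputs request shape;
# field placement (notably "strict") is an assumption.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Give me a city and its country as JSON."}],
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "city",
        "strict": true,
        "schema": {
          "type": "object",
          "properties": {
            "city": {"type": "string"},
            "country": {"type": "string"}
          },
          "required": ["city", "country"]
        }
      }
    }
  }'
```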
🛑 Disable Gallery
Need to streamline your setup? You can now disable the gallery endpoint using `LOCALAI_DISABLE_GALLERY_ENDPOINT`. For more options, check out the full list of commands with `--help`.
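For instance (a minimal sketch: the environment variable comes from these notes, while the binary name and value format are assumptions):

```bash
# Hedged sketch: start LocalAI with the gallery endpoint disabled
LOCALAI_DISABLE_GALLERY_ENDPOINT=true local-ai
```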
🌞 P2P and Federation Enhancements
Several enhancements have been made to improve your experience with P2P and federated clusters:
- Load Balancing by Default: This feature is now enabled by default (disable it with `LOCALAI_RANDOM_WORKER` if needed).
- Target Specific Workers: Directly target workers in federated mode using `LOCALAI_TARGET_WORKER`.
💪 Run Multiple P2P Clusters in the Same Network
You can now run multiple clusters within the same network by specifying a network ID via the CLI. This allows you to logically separate clusters while using the same shared token. Just set `LOCALAI_P2P_NETWORK_ID` to a UUID that matches across instances.
Please note, while this offers segmentation, it’s not fully secure—anyone with the network token can view available services within the network.
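For example (a hedged sketch: `LOCALAI_P2P_NETWORK_ID` is from these notes, while the binary invocation and the example UUIDs are assumptions):

```bash
# Hedged sketch: two logically separate clusters on the same physical network.
# Instances sharing the same network ID (and token) discover each other.
LOCALAI_P2P_NETWORK_ID=6f9b5f2b-1c0a-4f6e-9e0d-1234567890ab local-ai run --p2p   # cluster A
LOCALAI_P2P_NETWORK_ID=0d7a1c9e-5b3f-4a2d-8c6e-0987654321ba local-ai run --p2p   # cluster B
```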
🧪 Deprecation Notice: `gpt4all.cpp` and `petals` Backends
As we continue to evolve, we are officially deprecating the `gpt4all.cpp` and `petals` backends. The newer `llama.cpp` offers a superior set of features and better performance, making it the preferred choice moving forward.
From this release onward, `gpt4all` models in `ggml` format are no longer compatible. Additionally, the `petals` backend has been deprecated as well. LocalAI’s new P2P capabilities now offer a comprehensive replacement for these features.
What's Changed
Breaking Changes 🛠
Bug fixes 🐛
- fix(ui): do not show duplicate entries if not installed by gallery by @mudler in #3107
- fix: be consistent in downloading files, check for scanner errors by @mudler in #3108
- fix: ensure correct version of torch is always installed based on BUI… by @cryptk in #2890
- fix(python): move accelerate and GPU-specific libs to build-type by @mudler in #3194
- fix(apple): disable BUILD_TYPE metal on fallback by @mudler in #3199
- fix(vall-e-x): pin hipblas deps by @mudler in #3201
- fix(diffusers): use nightly rocm for hipblas builds by @mudler in #3202
- fix(explorer): reset counter when network is active by @mudler in #3213
- fix(p2p): allocate tunnels only when needed by @mudler in #3259
- fix(gallery): be consistent and disable UI routes as well by @mudler in #3262
- fix(parler-tts): bump and require after build type deps by @mudler in #3272
- fix: add llvm to extra images by @mudler in #3321
- fix(p2p): re-use p2p host when running federated mode by @mudler in #3341
- fix(ci): pin to llvmlite 0.43 by @mudler in #3342
- fix(p2p): avoid starting the node twice by @mudler in #3349
- fix(chat): re-generated uuid, created, and text on each request by @mudler in #3359
Exciting New Features 🎉
- feat(guesser): add gemma2 by @sozercan in #3118
- feat(venv): shared env by @mudler in #3195
- feat(openai): add `json_schema` format type and strict mode by @mudler in #3193
- feat(p2p): allow to run multiple clusters in the same p2p network by @mudler in #3128
- feat(p2p): add network explorer and community pools by @mudler in #3125
- feat(explorer): relax token deletion with error threshold by @mudler in #3211
- feat(diffusers): support flux models by @mudler in #3129
- feat(explorer): make possible to run sync in a separate process by @mudler in #3224
- feat(federated): allow to pickup a specific worker, improve loadbalancing by @mudler in #3243
- feat: Initial Version of vscode DevContainer by @dave-gray101 in #3217
- feat(explorer): visual improvements by @mudler in #3247
- feat(gallery): lazy load images by @mudler in #3246
- chore(explorer): add join instructions by @mudler in #3255
- chore: allow to disable gallery endpoints, improve p2p connection handling by @mudler in #3256
- chore(ux): add animated header with anime.js in p2p sections by @mudler in #3271
- chore(p2p): make commands easier to copy-paste by @mudler in #3273
- chore(ux): ...
v2.20.0
TL;DR
- 🌍 Explorer & Community: Explore global community pools at explorer.localai.io
- 👀 Demo instance available: Test out LocalAI at demo.localai.io
- 🤗 Integration: Hugging Face Local apps now include LocalAI
- 🐛 Bug Fixes: Diffusers and hipblas issues resolved
- 🎨 New Feature: FLUX-1 image generation support
- 🏎️ Strict Mode: Stay compliant with OpenAI’s latest API changes
- 💪 Multiple P2P Clusters: Run multiple clusters within the same network
- 🧪 Deprecation Notice: `gpt4all.cpp` and `petals` backends deprecated
🌍 Explorer and Global Community Pools
Now you can share your LocalAI instance with the global community or explore available instances by visiting explorer.localai.io. This decentralized network powers our demo instance, creating a truly collaborative AI experience.
How It Works
Using the Explorer, you can easily share or connect to clusters. For detailed instructions on creating new clusters or connecting to existing ones, check out our documentation.
👀 Demo Instance Now Available
Curious about what LocalAI can do? Dive right in with our live demo at demo.localai.io! Thanks to our generous sponsors, this instance is publicly available and configured via peer-to-peer (P2P) networks. If you'd like to connect, follow the instructions here.
🤗 Hugging Face Integration
I am excited to announce that LocalAI is now integrated within Hugging Face’s local apps! This means you can select LocalAI directly within Hugging Face to build and deploy models with the power and flexibility of our platform. Experience seamless integration with a single click!
This integration was made possible through this PR.
🎨 FLUX-1 Image Generation Support
FLUX-1 lands in LocalAI! With this update, LocalAI can now generate stunning images using FLUX-1, even in federated mode. Whether you're experimenting with new designs or creating production-quality visuals, FLUX-1 has you covered.
Try it out at demo.localai.io and see what LocalAI + FLUX-1 can do!
🐛 Diffusers and hipblas Fixes
Great news for AMD users! If you’ve encountered issues with the Diffusers backend or hipblas, those bugs have been resolved. We’ve transitioned to `uv` for managing Python dependencies, ensuring a smoother experience. For more details, check out Issue #1592.
🏎️ Strict Mode for API Compliance
To stay up to date with OpenAI’s latest changes, LocalAI now also supports Strict Mode (https://openai.com/index/introducing-structured-outputs-in-the-api/). This new feature ensures compatibility with the most recent API updates, enforcing stricter JSON outputs using BNF grammar rules.
To activate it, simply set `strict: true` in your API calls, even if it’s disabled in your configuration.
Key Notes:
- Setting `strict: true` enables grammar enforcement, even if disabled in your config.
- If `format_type` is set to `json_schema`, BNF grammars will be automatically generated from the schema.
🛑 Disable Gallery
Need to streamline your setup? You can now disable the gallery endpoint using `LOCALAI_DISABLE_GALLERY_ENDPOINT`. For more options, check out the full list of commands with `--help`.
🌞 P2P and Federation Enhancements
Several enhancements have been made to improve your experience with P2P and federated clusters:
- Load Balancing by Default: This feature is now enabled by default (disable it with `LOCALAI_RANDOM_WORKER` if needed).
- Target Specific Workers: Directly target workers in federated mode using `LOCALAI_TARGET_WORKER`.
💪 Run Multiple P2P Clusters in the Same Network
You can now run multiple clusters within the same network by specifying a network ID via the CLI. This allows you to logically separate clusters while using the same shared token. Just set `LOCALAI_P2P_NETWORK_ID` to a UUID that matches across instances.
Please note, while this offers segmentation, it’s not fully secure—anyone with the network token can view available services within the network.
🧪 Deprecation Notice: `gpt4all.cpp` and `petals` Backends
As we continue to evolve, we are officially deprecating the `gpt4all.cpp` and `petals` backends. The newer `llama.cpp` offers a superior set of features and better performance, making it the preferred choice moving forward.
From this release onward, `gpt4all` models in `ggml` format are no longer compatible. Additionally, the `petals` backend has been deprecated as well. LocalAI’s new P2P capabilities now offer a comprehensive replacement for these features.
What's Changed
Breaking Changes 🛠
Bug fixes 🐛
- fix(ui): do not show duplicate entries if not installed by gallery by @mudler in #3107
- fix: be consistent in downloading files, check for scanner errors by @mudler in #3108
- fix: ensure correct version of torch is always installed based on BUI… by @cryptk in #2890
- fix(python): move accelerate and GPU-specific libs to build-type by @mudler in #3194
- fix(apple): disable BUILD_TYPE metal on fallback by @mudler in #3199
- fix(vall-e-x): pin hipblas deps by @mudler in #3201
- fix(diffusers): use nightly rocm for hipblas builds by @mudler in #3202
- fix(explorer): reset counter when network is active by @mudler in #3213
- fix(p2p): allocate tunnels only when needed by @mudler in #3259
- fix(gallery): be consistent and disable UI routes as well by @mudler in #3262
- fix(parler-tts): bump and require after build type deps by @mudler in #3272
- fix: add llvm to extra images by @mudler in #3321
- fix(p2p): re-use p2p host when running federated mode by @mudler in #3341
- fix(ci): pin to llvmlite 0.43 by @mudler in #3342
- fix(p2p): avoid starting the node twice by @mudler in #3349
- fix(chat): re-generated uuid, created, and text on each request by @mudler in #3359
Exciting New Features 🎉
- feat(guesser): add gemma2 by @sozercan in #3118
- feat(venv): shared env by @mudler in #3195
- feat(openai): add `json_schema` format type and strict mode by @mudler in #3193
- feat(p2p): allow to run multiple clusters in the same p2p network by @mudler in #3128
- feat(p2p): add network explorer and community pools by @mudler in #3125
- feat(explorer): relax token deletion with error threshold by @mudler in #3211
- feat(diffusers): support flux models by @mudler in #3129
- feat(explorer): make possible to run sync in a separate process by @mudler in #3224
- feat(federated): allow to pickup a specific worker, improve loadbalancing by @mudler in #3243
- feat: Initial Version of vscode DevContainer by @dave-gray101 in #3217
- feat(explorer): visual improvements by @mudler in #3247
- feat(gallery): lazy load images by @mudler in #3246
- chore(explorer): add join instructions by @mudler in #3255
- chore: allow to disable gallery endpoints, improve p2p connection handling by @mudler in #3256
- chore(ux): add animated header with anime.js in p2p sections by @mudler in #3271
- chore(p2p): make commands easier to copy-paste by @mudler in #3273
- chore(ux): allow to create and drag dots in the animation by @mudler in #3287
- feat(federation): do not allocate local services for load balancing by @mudler in #3337
- feat(p2p): allow to set intervals b...
v2.19.4
What's Changed
🧠 Models
- chore(model-gallery): ⬆️ update checksum by @localai-bot in #3040
- chore(model-gallery): ⬆️ update checksum by @localai-bot in #3043
- models(gallery): add magnum-32b-v1 by @mudler in #3044
- models(gallery): add lumimaid-v0.2-70b-i1 by @mudler in #3045
- models(gallery): add sekhmet_aleph-l3.1-8b-v0.1-i1 by @mudler in #3046
- models(gallery): add l3.1-8b-llamoutcast-i1 by @mudler in #3047
- models(gallery): add l3.1-8b-celeste-v1.5 by @mudler in #3080
- models(gallery): add llama-guard-3-8b by @mudler in #3082
- models(gallery): add meta-llama-3-instruct-8.9b-brainstorm-5x-form-11 by @mudler in #3083
- models(gallery): add sunfall-simpo by @mudler in #3088
- models(gallery): add genius-llama3.1-i1 by @mudler in #3089
- models(gallery): add seeker-9b by @mudler in #3090
- models(gallery): add llama3.1-chinese-chat by @mudler in #3091
- models(gallery): add gemmasutra-pro-27b-v1 by @mudler in #3092
- models(gallery): add leetwizard by @mudler in #3093
- models(gallery): add tarnished-9b-i1 by @mudler in #3096
- models(gallery): add meta-llama-3-instruct-12.2b-brainstorm-20x-form-8 by @mudler in #3097
- models(gallery): add loki-base-i1 by @mudler in #3098
- models(gallery): add tifa by @mudler in #3099
👒 Dependencies
- chore(deps): Bump langchain from 0.2.10 to 0.2.11 in /examples/langchain/langchainpy-localai-example by @dependabot in #3053
- chore(deps): Bump openai from 1.37.0 to 1.37.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #3051
- chore(deps): Bump setuptools from 70.3.0 to 72.1.0 in /backend/python/autogptq by @dependabot in #3048
- chore(deps): Bump setuptools from 70.3.0 to 72.1.0 in /backend/python/vllm by @dependabot in #3061
- chore(deps): Bump chromadb from 0.5.4 to 0.5.5 in /examples/langchain-chroma by @dependabot in #3060
- chore(deps): Bump setuptools from 70.3.0 to 72.1.0 in /backend/python/parler-tts by @dependabot in #3062
- chore(deps): Bump setuptools from 70.3.0 to 72.1.0 in /backend/python/rerankers by @dependabot in #3067
- chore(deps): Bump setuptools from 69.5.1 to 72.1.0 in /backend/python/transformers-musicgen by @dependabot in #3066
- chore(deps): Bump setuptools from 70.3.0 to 72.1.0 in /backend/python/coqui by @dependabot in #3068
- chore(deps): Bump setuptools from 70.3.0 to 72.1.0 in /backend/python/vall-e-x by @dependabot in #3069
- chore(deps): Bump setuptools from 70.3.0 to 72.1.0 in /backend/python/petals by @dependabot in #3070
- chore(deps): Bump setuptools from 69.5.1 to 72.1.0 in /backend/python/transformers by @dependabot in #3071
- chore(deps): Bump streamlit from 1.36.0 to 1.37.0 in /examples/streamlit-bot by @dependabot in #3072
Other Changes
- docs: ⬆️ update docs version mudler/LocalAI by @localai-bot in #3039
- fix: install.sh bash specific equality check by @dave-gray101 in #3038
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #3075
- Revert "chore(deps): Bump setuptools from 69.5.1 to 72.1.0 in /backend/python/transformers-musicgen" by @mudler in #3077
- Revert "chore(deps): Bump setuptools from 69.5.1 to 72.1.0 in /backend/python/transformers" by @mudler in #3078
- Revert "chore(deps): Bump setuptools from 70.3.0 to 72.1.0 in /backend/python/vllm" by @mudler in #3079
- fix(llama-cpp): do not compress with UPX by @mudler in #3084
- fix(ci): update openvoice checkpoints URLs by @mudler in #3085
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #3086
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #3102
Full Changelog: v2.19.3...v2.19.4
v2.19.3
What's Changed
Bug fixes 🐛
- fix(gallery): do not attempt to delete duplicate files by @mudler in #3031
- fix(gallery): do clear out errors once displayed by @mudler in #3033
Exciting New Features 🎉
🧠 Models
- models(gallery): add llama3.1-claude by @mudler in #3005
- models(gallery): add darkidol llama3.1 by @mudler in #3008
- models(gallery): add gemmoy by @mudler in #3009
- chore: add function calling template for llama 3.1 models by @mudler in #3010
- chore: models(gallery): ⬆️ update checksum by @localai-bot in #3013
- models(gallery): add mistral-nemo by @mudler in #3019
- models(gallery): add llama3.1-8b-fireplace2 by @mudler in #3018
- models(gallery): add lumimaid-v0.2-12b by @mudler in #3020
- models(gallery): add darkidol-llama-3.1-8b-instruct-1.1-uncensored-iq… by @mudler in #3021
- models(gallery): add meta-llama-3.1-8b-instruct-abliterated by @mudler in #3022
- models(gallery): add llama-3.1-70b-japanese-instruct-2407 by @mudler in #3023
- models(gallery): add llama-3.1-8b-instruct-fei-v1-uncensored by @mudler in #3024
- models(gallery): add openbuddy-llama3.1-8b-v22.1-131k by @mudler in #3025
- models(gallery): add lumimaid-8b by @mudler in #3026
- models(gallery): add llama3 with enforced functioncall with grammars by @mudler in #3027
- chore(model-gallery): ⬆️ update checksum by @localai-bot in #3036
👒 Dependencies
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #3003
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #3012
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #3016
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #3030
- chore: ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #3029
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #3034
Other Changes
- docs: ⬆️ update docs version mudler/LocalAI by @localai-bot in #3002
- refactor: break down json grammar parser in different files by @mudler in #3004
- fix: PR title tag for checksum checker script workflow by @dave-gray101 in #3014
Full Changelog: v2.19.2...v2.19.3
v2.19.2
This release is a patch release to fix well-known issues from 2.19.x
What's Changed
Bug fixes 🐛
- fix: pin setuptools 69.5.1 by @fakezeta in #2949
- fix(cuda): downgrade to 12.0 to increase compatibility range by @mudler in #2994
- fix(llama.cpp): do not set anymore lora_base by @mudler in #2999
Exciting New Features 🎉
- ci(Makefile): reduce binary size by compressing by @mudler in #2947
- feat(p2p): warn the user to start with --p2p by @mudler in #2993
🧠 Models
- models(gallery): add tulu 8b and 70b by @mudler in #2931
- models(gallery): add suzume-orpo by @mudler in #2932
- models(gallery): add archangel_sft_pythia2-8b by @mudler in #2933
- models(gallery): add celestev1.2 by @mudler in #2937
- models(gallery): add calme-2.3-phi3-4b by @mudler in #2939
- models(gallery): add calme-2.8-qwen2-7b by @mudler in #2940
- models(gallery): add StellarDong-72b by @mudler in #2941
- models(gallery): add calme-2.4-llama3-70b by @mudler in #2942
- models(gallery): add llama3.1 70b and 8b by @mudler in #3000
📖 Documentation and examples
- docs: add federation by @mudler in #2929
- docs: ⬆️ update docs version mudler/LocalAI by @localai-bot in #2935
👒 Dependencies
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2936
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2943
- chore(deps): Bump grpcio from 1.64.1 to 1.65.1 in /backend/python/openvoice by @dependabot in #2956
- chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/sentencetransformers by @dependabot in #2955
- chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/bark by @dependabot in #2951
- chore(deps): Bump docs/themes/hugo-theme-relearn from `1b2e139` to `7aec99b` by @dependabot in #2952
- chore(deps): Bump langchain from 0.2.8 to 0.2.10 in /examples/langchain/langchainpy-localai-example by @dependabot in #2959
- chore(deps): Bump numpy from 1.26.4 to 2.0.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #2958
- chore(deps): Bump sqlalchemy from 2.0.30 to 2.0.31 in /examples/langchain/langchainpy-localai-example by @dependabot in #2957
- chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/vllm by @dependabot in #2964
- chore(deps): Bump llama-index from 0.10.55 to 0.10.56 in /examples/chainlit by @dependabot in #2966
- chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/common/template by @dependabot in #2963
- chore(deps): Bump weaviate-client from 4.6.5 to 4.6.7 in /examples/chainlit by @dependabot in #2965
- chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/transformers by @dependabot in #2970
- chore(deps): Bump openai from 1.35.13 to 1.37.0 in /examples/functions by @dependabot in #2973
- chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/diffusers by @dependabot in #2969
- chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/exllama2 by @dependabot in #2971
- chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/rerankers by @dependabot in #2974
- chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/coqui by @dependabot in #2980
- chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/parler-tts by @dependabot in #2982
- chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/vall-e-x by @dependabot in #2981
- chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/transformers-musicgen by @dependabot in #2990
- chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/autogptq by @dependabot in #2984
- chore(deps): Bump llama-index from 0.10.55 to 0.10.56 in /examples/langchain-chroma by @dependabot in #2986
- chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/mamba by @dependabot in #2989
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2992
- chore(deps): Bump langchain-community from 0.2.7 to 0.2.9 in /examples/langchain/langchainpy-localai-example by @dependabot in #2960
- chore(deps): Bump openai from 1.35.13 to 1.37.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #2961
- chore(deps): Bump langchain from 0.2.8 to 0.2.10 in /examples/functions by @dependabot in #2975
- chore(deps): Bump openai from 1.35.13 to 1.37.0 in /examples/langchain-chroma by @dependabot in #2988
- chore(deps): Bump langchain from 0.2.8 to 0.2.10 in /examples/langchain-chroma by @dependabot in #2987
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2995
Other Changes
Full Changelog: v2.19.1...v2.19.2
v2.19.1
LocalAI 2.19.1 is out! 📣
TLDR; Summary spotlight
- 🖧 Federated Instances via P2P: LocalAI now supports federated instances with P2P, offering both load-balanced and non-load-balanced options.
- 🎛️ P2P Dashboard: A new dashboard to guide and assist in setting up P2P instances with auto-discovery using shared tokens.
- 🔊 TTS Integration: Text-to-Speech (TTS) is now included in the binary releases.
- 🛠️ Enhanced Installer: The installer script now supports setting up federated instances.
- 📥 Model Pulling: Models can now be pulled directly via URL.
- 🖼️ WebUI Enhancements: Visual improvements and cleanups to the WebUI and model lists.
- 🧠 llama-cpp Backend: The llama-cpp (grpc) backend now supports embeddings (https://localai.io/features/embeddings/#llamacpp-embeddings)
- ⚙️ Tool Support: Small enhancements to tools with disabled grammars.
🖧 LocalAI Federation and AI swarms
LocalAI is revolutionizing the future of distributed AI workloads by making it simpler and more accessible. No more complex setups, Docker or Kubernetes configurations – LocalAI allows you to create your own AI cluster with minimal friction. By auto-discovering and sharing work or weights of the LLM model across your existing devices, LocalAI aims to scale both horizontally and vertically with ease.
How it works?
Starting LocalAI with `--p2p` generates a shared token for connecting multiple instances, and that's all you need to create AI clusters, eliminating the need for intricate network setups. Simply navigate to the "Swarm" section in the WebUI and follow the on-screen instructions.
For fully shared instances, start LocalAI with `--p2p --federated` and follow the Swarm section's guidance. This feature, while still experimental, offers a tech-preview-quality experience.
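A minimal sketch of the workflow described above (the flags are from these notes; the binary name and the way the token is supplied to other instances are assumptions):

```bash
# Hedged sketch: start a first instance; it prints a shared token on startup
local-ai run --p2p

# On other machines, join the same swarm as a fully shared, federated instance
# (how the token is passed, e.g. which environment variable, may differ)
TOKEN=<shared-token> local-ai run --p2p --federated
```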
Federated LocalAI
Launch multiple LocalAI instances and cluster them together to share requests across the cluster. The "Swarm" tab in the WebUI provides one-liner instructions on connecting various LocalAI instances using a shared token. Instances will auto-discover each other, even across different networks.
Check out a demonstration video: Watch now
LocalAI P2P Workers
Distribute weights across nodes by starting multiple LocalAI workers, currently available only on the llama.cpp backend, with plans to expand to other backends soon.
Check out a demonstration video: Watch now
What's Changed
Bug fixes 🐛
- fix: make sure the GNUMake jobserver is passed to cmake for the llama.cpp build by @cryptk in #2697
- Using exec when starting a backend instead of spawning a new process by @a17t in #2720
- fix(cuda): downgrade default version from 12.5 to 12.4 by @mudler in #2707
- fix: Lora loading by @vaaale in #2893
- fix: short-circuit when nodes aren't detected by @mudler in #2909
- fix: do not list txt files as potential models by @mudler in #2910
🖧 P2P area
- feat(p2p): Federation and AI swarms by @mudler in #2723
- feat(p2p): allow to disable DHT and use only LAN by @mudler in #2751
Exciting New Features 🎉
- Allows to remove a backend from the list by @mauromorales in #2721
- ci(Makefile): adds tts in binary releases by @mudler in #2695
- feat: HF `/scan` endpoint by @dave-gray101 in #2566
- feat(model-list): be consistent, skip known files from listing by @mudler in #2760
- feat(models): pull models from urls by @mudler in #2750
- feat(webui): show also models without a config in the welcome page by @mudler in #2772
- feat(install.sh): support federated install by @mudler in #2752
- feat(llama.cpp): support embeddings endpoints by @mudler in #2871
- feat(functions): parse broken JSON when we parse the raw results, use dynamic rules for grammar keys by @mudler in #2912
- feat(federation): add load balanced option by @mudler in #2915
🧠 Models
- models(gallery): ⬆️ update checksum by @localai-bot in #2701
- models(gallery): add l3-8b-everything-cot by @mudler in #2705
- models(gallery): add hercules-5.0-qwen2-7b by @mudler in #2708
- models(gallery): add llama3-8b-darkidol-2.2-uncensored-1048k-iq-imatrix by @mudler in #2710
- models(gallery): add llama-3-llamilitary by @mudler in #2711
- models(gallery): add tess-v2.5-gemma-2-27b-alpha by @mudler in #2712
- models(gallery): add arcee-agent by @mudler in #2713
- models(gallery): add gemma2-daybreak by @mudler in #2714
- models(gallery): add L3-Stheno-Maid-Blackroot-Grand-HORROR-16B-GGUF by @mudler in #2715
- models(gallery): add qwen2-7b-instruct-v0.8 by @mudler in #2717
- models(gallery): add internlm2_5-7b-chat-1m by @mudler in #2719
- models(gallery): add gemma-2-9b-it-sppo-iter3 by @mudler in #2722
- models(gallery): add llama-3_8b_unaligned_alpha by @mudler in #2727
- models(gallery): add l3-8b-lunaris-v1 by @mudler in #2729
- models(gallery): add llama-3_8b_unaligned_alpha_rp_soup-i1 by @mudler in #2734
- models(gallery): add hathor_respawn-l3-8b-v0.8 by @mudler in #2738
- models(gallery): add llama3-8b-instruct-replete-adapted by @mudler in #2739
- models(gallery): add llama-3-perky-pat-instruct-8b by @mudler in #2740
- models(gallery): add l3-uncen-merger-omelette-rp-v0.2-8b by @mudler in #2741
- models(gallery): add nymph_8b-i1 by @mudler in #2742
- models(gallery): add smegmma-9b-v1 by @mudler in #2743
- models(gallery): add hathor_tahsin-l3-8b-v0.85 by @mudler in #2762
- models(gallery): add replete-coder-instruct-8b-merged by @mudler in #2782
- models(gallery): add arliai-llama-3-8b-formax-v1.0 by @mudler in #2783
- models(gallery): add smegmma-deluxe-9b-v1 by @mudler in #2784
- models(gallery): add l3-ms-astoria-8b by @mudler in #2785
- models(gallery): add halomaidrp-v1.33-15b-l3-i1 by @mudler in #2786
- models(gallery): add llama-3-patronus-lynx-70b-instruct by @mudler in #2788
- models(gallery): add llamax3 by @mudler in #2849
- models(gallery): add arliai-llama-3-8b-dolfin-v0.5 by @mudler in #2852
- models(gallery): add tiger-gemma-9b-v1-i1 by @mudler in #2853
- feat: models(gallery): add deepseek-v2-lite by @mudler in #2658
- models(gallery): ⬆️ update checksum by @localai-bot in #2860
- models(gallery): add phi-3.1-mini-4k-instruct by @mudler in #2863
- models(gallery): ⬆️ update checksum by @localai-bot in #2887
- models(gallery): add ezo model series (llama3, gemma) by @mudler in #2891
- models(gallery): add l3-8b-niitama-v1 by @mudler in #2895
- models(gallery): add mathstral-7b-v0.1-imat by @mudler in #2901
- models(gallery): add MythicalMaid/EtherealMaid 15b by @mudler in #2902
- models(gallery): add flammenai/Mahou-1.3d-mistral-7B by @mudler in #2903
- models(gallery): add big-tiger-gemma-27b-v1 by @mudler in #2918
- models(gallery): add phillama-3.8b-v0.1 by @mudler in #2920
- models(gallery): add qwen2-wukong-7b by @mudler in #2921
- models(gallery): add einstein-v4-7b by @mudler in #2922
- models(gallery): add gemma-2b-translation-v0.150 by @mudler in #2923
- models(gallery)...