I believe this is a way to improve. I'll try to join the Continue Discord for questions.
I'm not able to find an open issue that requests the same enhancement.
Problem
Chat becomes very slow when using the llama.cpp server as the conversation grows, because `cache_prompt: true` is not set when calling llama.cpp's /completion API. As a result, llama.cpp reprocesses the entire message history on every prompt instead of reusing it from the cache.
Solution
Please add the `cache_prompt: true` property to requests made to the /completion API,
or expose it as a configuration property in config.json.
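For illustration, a minimal sketch of what the requested change might look like when building the /completion request body. This is an assumption about how the request could be constructed, not Continue's actual implementation; `buildCompletionBody` and its parameters are hypothetical, while `cache_prompt` and `n_predict` are fields documented by llama.cpp's server:

```typescript
// Hypothetical sketch: build a llama.cpp /completion request body
// with prompt caching enabled. `cache_prompt: true` tells the server
// to reuse the KV cache for the unchanged prompt prefix, so only the
// newly appended messages are reprocessed.
interface CompletionRequest {
  prompt: string;
  n_predict: number;
  cache_prompt: boolean;
}

function buildCompletionBody(prompt: string): string {
  const body: CompletionRequest = {
    prompt,
    n_predict: 256,      // example generation limit
    cache_prompt: true,  // the property this issue asks to add
  };
  return JSON.stringify(body);
}

// Usage (assumes a llama.cpp server on localhost:8080):
// fetch("http://localhost:8080/completion", {
//   method: "POST",
//   headers: { "Content-Type": "application/json" },
//   body: buildCompletionBody(fullChatHistoryAsPrompt),
// });
```

Alternatively, the flag could be read from a (hypothetical) option in config.json rather than hard-coded, so users can opt out if needed.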
Thanks