Error: llama runner process has terminated: GGML_ASSERT(src1t == GGML_TYPE_F32) failed #1043
Comments
I would file an issue with the https://github.com/ollama/ollama folks. It's not clear to me that this is an issue with MLX.
@awni Could it be because the GGUF exported by mlx_lm is F16, and the command I used to create the model (ollama create example -f Modelfile) is wrong or requires a certain setting? The documentation says: "Export the fused model to GGUF. Note GGUF support is limited to Mistral, Mixtral, and Llama style models in fp16 precision." Reference: https://github.com/ml-explore/mlx-examples/blob/main/llms/mlx_lm/LORA.md
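One way to check which tensor types the exported file actually contains is to read the GGUF header directly. The following is a minimal sketch, assuming a GGUF version 2 or later file (little-endian, 64-bit counts); the function name `gguf_tensor_types` is made up for illustration, and only the F32 and F16 type ids are named. It reports what is stored in the file and does not by itself identify the cause of the assertion.

```python
import struct
import sys
from collections import Counter

# GGUF metadata value type ids mapped to struct formats (scalar types only).
_SCALAR_FMT = {0: "<B", 1: "<b", 2: "<H", 3: "<h", 4: "<I", 5: "<i",
               6: "<f", 7: "<?", 10: "<Q", 11: "<q", 12: "<d"}
_STRING, _ARRAY = 8, 9
# Only two ggml tensor type ids are named here; others are reported by number.
_GGML_TYPES = {0: "F32", 1: "F16"}


def _read(f, fmt):
    """Read one little-endian value described by a struct format."""
    size = struct.calcsize(fmt)
    data = f.read(size)
    if len(data) != size:
        raise ValueError("unexpected end of file")
    return struct.unpack(fmt, data)[0]


def _read_string(f):
    """GGUF strings are a uint64 length followed by UTF-8 bytes."""
    length = _read(f, "<Q")
    return f.read(length).decode("utf-8", errors="replace")


def _read_value(f, vtype):
    """Read (and thereby skip past) one metadata value."""
    if vtype == _STRING:
        return _read_string(f)
    if vtype == _ARRAY:
        item_type = _read(f, "<I")
        count = _read(f, "<Q")
        return [_read_value(f, item_type) for _ in range(count)]
    return _read(f, _SCALAR_FMT[vtype])


def gguf_tensor_types(f):
    """Return a dict counting tensors per ggml type in an open GGUF file."""
    if f.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    version = _read(f, "<I")
    if version < 2:
        raise ValueError("this sketch assumes GGUF version 2 or later")
    n_tensors = _read(f, "<Q")
    n_kv = _read(f, "<Q")
    for _ in range(n_kv):            # skip the metadata key/value section
        _read_string(f)
        _read_value(f, _read(f, "<I"))
    counts = Counter()
    for _ in range(n_tensors):       # tensor info: name, dims, type, offset
        _read_string(f)
        n_dims = _read(f, "<I")
        f.read(8 * n_dims)
        ttype = _read(f, "<I")
        f.read(8)
        counts[_GGML_TYPES.get(ttype, f"type {ttype}")] += 1
    return dict(counts)


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as fh:
        print(gguf_tensor_types(fh))
```

Saved under a hypothetical name such as inspect_gguf.py, it could be run against ./fused_model/ggml-model-f16.gguf and against the file produced by llama.cpp to compare the two.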
I have the same problem. Creating a GGUF with llama.cpp after the fuse does work when running it in Ollama:
python convert_hf_to_gguf.py <path_to>/fused_model --outfile output_file.gguf
Then, in the Ollama Modelfile, put (with the parameters and template):
@lhwong @hansvdam
Fuse the model without gguf
Model-File
And import it:
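The workaround described in the comments above (fuse without GGUF export, convert with llama.cpp, then import into Ollama) can be sketched as a small Python driver. This is a sketch under assumptions: the function names are made up for illustration, the fused model is assumed to land in ./fused_model, the path to convert_hf_to_gguf.py is a placeholder, and the Modelfile contains only a FROM line because the parameters and template mentioned in the thread are not shown. By default it only prints the commands.

```python
import shlex
import subprocess


def build_commands(model, adapter_path, fused_dir, convert_script,
                   gguf_out, ollama_name, modelfile):
    """Return the three commands of the workaround as argument lists."""
    return [
        # 1. Fuse the LoRA adapters into the base model, without --export-gguf.
        ["mlx_lm.fuse", "--model", model, "--adapter-path", adapter_path],
        # 2. Convert the fused model to GGUF with llama.cpp's script.
        ["python", convert_script, fused_dir, "--outfile", gguf_out],
        # 3. Import the GGUF file into Ollama.
        ["ollama", "create", ollama_name, "-f", modelfile],
    ]


def modelfile_text(gguf_path):
    """Minimal Modelfile; parameters and template are intentionally omitted."""
    return f"FROM {gguf_path}\n"


def run(commands, dry_run=True):
    """Print each command and execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    commands = build_commands(
        model="meta-llama/Llama-3.2-1B",
        adapter_path="./adapters",
        fused_dir="./fused_model",                 # assumed output location
        convert_script="convert_hf_to_gguf.py",    # placeholder path
        gguf_out="output_file.gguf",
        ollama_name="example",
        modelfile="Modelfile",
    )
    print(modelfile_text("./output_file.gguf"), end="")
    run(commands, dry_run=True)
```

With dry_run=False, the Modelfile text would need to be written to disk before the last command runs.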
I got the following error when running a model imported from a GGUF file that was generated from a model fine-tuned with LoRA.
Error: llama runner process has terminated: GGML_ASSERT(src1t == GGML_TYPE_F32) failed
The following commands were used:
mlx_lm.lora --train --model meta-llama/Llama-3.2-1B --data ~/Projects/AI/data --iters 1000
mlx_lm.generate --model meta-llama/Llama-3.2-1B --adapter-path ./adapters --prompt "What is biomolecule?"
mlx_lm.fuse --model meta-llama/Llama-3.2-1B --adapter-path ./adapters --export-gguf
Create Modelfile
FROM ./fused_model/ggml-model-f16.gguf
ollama create example -f Modelfile
ollama run example
Error: llama runner process has terminated: GGML_ASSERT(src1t == GGML_TYPE_F32) failed
/Users/runner/work/ollama/ollama/llm/llama.cpp/ggml/src/ggml-metal.m:1080: GGML_ASSERT(src1t == GGML_TYPE_F32) failed
/Users/runner/work/ollama/ollama/llm/llama.cpp/ggml/src/ggml-metal.m:1080: GGML_ASSERT(src1t == GGML_TYPE_F32) failed