Hello 👋🏻 Just started with llama.cpp and have a couple of questions. I downloaded a model using llama-cli -hf, following the instructions on the unsloth site. It runs and I can interact with it. But now I want to run it with llama-server so I can use it over HTTP requests. I tried a couple of options for the model name, but none of them worked:
llama-server -m unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF:Q4_K_XL
llama-server -m unsloth_Qwen3-Coder-30B-A3B-Instruct-GGUF_Qwen3-Coder-30B-A3B-Instruct-Q4_K_M.gguf # model name from the .cache/llama.cpp folder
Getting this error:
build: 983 (f12b193) with cc (GCC) 15.1.1 20250729 for x86_64-pc-linux-gnu
system info: n_threads = 12, n_threads_batch = 12, total_threads = 24
system_info: n_threads = 12 (n_threads_batch = 12) / 24 | CPU : LLAMAFILE = 1 | OPENMP = 1 | REPACK = 1 |
main: binding port with default address family
main: HTTP server is listening, hostname: 127.0.0.1, port: 8080, http threads: 23
main: loading model
srv load_model: loading model 'models/7B/ggml-model-f16.gguf'
gguf_init_from_file: failed to open GGUF file 'models/7B/ggml-model-f16.gguf'
llama_model_load: error loading model: llama_model_loader: failed to load model from models/7B/ggml-model-f16.gguf
llama_model_load_from_file_impl: failed to load model
common_init_from_params: failed to load model 'models/7B/ggml-model-f16.gguf'
srv load_model: failed to load model, 'models/7B/ggml-model-f16.gguf'
srv operator(): operator(): cleaning up before exit...
main: exiting due to model loading error
Contents of ~/.cache/llama.cpp/:
❯ ls ~/.cache/llama.cpp/
manifest=unsloth_Qwen3-Coder-30B-A3B-Instruct-GGUF=latest.json
manifest=unsloth_Qwen3-Coder-30B-A3B-Instruct-GGUF=Q4_K_XL.json
unsloth_Qwen3-Coder-30B-A3B-Instruct-GGUF_Qwen3-Coder-30B-A3B-Instruct-Q4_K_M.gguf
unsloth_Qwen3-Coder-30B-A3B-Instruct-GGUF_Qwen3-Coder-30B-A3B-Instruct-Q4_K_M.gguf.json
unsloth_Qwen3-Coder-30B-A3B-Instruct-GGUF_Qwen3-Coder-30B-A3B-Instruct-UD-Q4_K_XL.gguf
unsloth_Qwen3-Coder-30B-A3B-Instruct-GGUF_Qwen3-Coder-30B-A3B-Instruct-UD-Q4_K_XL.gguf.json
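Based on that listing, my best guess is that -m wants the full filesystem path to one of those .gguf files, something like the line below, but I haven't confirmed that this is the intended workflow:
llama-server -m ~/.cache/llama.cpp/unsloth_Qwen3-Coder-30B-A3B-Instruct-GGUF_Qwen3-Coder-30B-A3B-Instruct-UD-Q4_K_XL.gguf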
How do I find the proper model name/path for the llama-server -m command?
Can I change the default model folder?
Can I list model names with llama-cli so I can reuse them for llama-server -m?
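For completeness, two more guesses I haven't been able to verify: llama-server may accept the same -hf repo:quant syntax as llama-cli, and the cache directory may be configurable through the LLAMA_CACHE environment variable. If so, something like this should work (the /path/to/models directory is just a placeholder):
llama-server -hf unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF:Q4_K_XL  # assumes -hf works with llama-server like it does with llama-cli
LLAMA_CACHE=/path/to/models llama-server -hf unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF:Q4_K_XL  # assumes LLAMA_CACHE overrides the default cache directory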