Hi,
After upgrading to 1.78 today, I can't load Mixtral-based 8x7B models anymore.
Other models, such as 30B/70B Llama-type models, still work.
I get the same error whether I use Vulkan or CLBlast, and with different models that also have different quantizations (one q8_0, the other q6_m).
The error reads:
llama_model_load: error loading model: missing tensor 'blk.0.ffn_down_exps.weight'
llama_load_model_from_file: failed to load model
Traceback (most recent call last):
File "koboldcpp.py", line 4720, in <module>
main(parser.parse_args(),start_server=True)
File "koboldcpp.py", line 4344, in main
loadok = load_model(modelname)
File "koboldcpp.py", line 900, in load_model
ret = handle.load_model(inputs)
OSError: exception: access violation reading 0x00000000000018A4
[17628] Failed to execute script 'koboldcpp' due to unhandled exception!
Previous versions of KoboldCPP worked with those same models without a problem.
After reverting, I can confirm 1.77 works.
Both are "cu12" versions (I still use CUDA for smaller models).
System has 64 GB RAM, 16 GB VRAM (3080 Ti laptop), Windows 11.
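In case it helps narrow this down, here's a small sketch I'd use to inspect what tensor names the model file actually contains. It assumes the gguf Python package is installed, and the file path is just a placeholder; the idea is to list the expert FFN tensors and see whether the file has the merged "blk.*.ffn_down_exps.weight" tensor the loader is asking for, or only older-style per-expert tensors.

# Sketch: list expert FFN tensor names in a GGUF file (requires: pip install gguf)
from gguf import GGUFReader

reader = GGUFReader("mixtral-8x7b.Q8_0.gguf")  # placeholder path
for tensor in reader.tensors:
    # Only print the expert FFN tensors so the naming scheme is easy to compare
    if "ffn_down" in tensor.name or "ffn_up" in tensor.name or "ffn_gate" in tensor.name:
        print(tensor.name, list(tensor.shape))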
Thanks in advance,
Asherathe