Llama 4 Support? #6831
Do you plan to support Llama 4? Please!
I have 16GB of VRAM and a 13900K (32 threads) with 128GB of RAM - would that be enough to run quantized Llama 4 Scout in a CPU+GPU split?

Replies: 2 comments
They did add it, so if you update to the latest build you'll be able to run Llama 4 Scout. I'm currently running a Q4_K_M (8K context) with 40GB of VRAM and 64GB of RAM. It'll be slower, but you should be able to load it with what you've got. Here's my split to give an idea of the sizes:

llama_context: KV self size = 816.00 MiB, K (q8_0): 408.00 MiB, V (q8_0): 408.00 MiB
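For anyone trying to reproduce a split like that, here's a minimal sketch of the kind of llama.cpp invocation involved. The GGUF filename and the `-ngl` value are placeholders to tune for your hardware, not what was actually run; `-ctk`/`-ctv q8_0` is what produces the q8_0 K/V cache shown in the log line above:

```sh
# Hypothetical invocation; the model filename and -ngl count are placeholders.
#   -c 8192          -> 8K context, matching the post above
#   -ngl 24          -> layers offloaded to the GPU; raise it until VRAM fills
#   -ctk/-ctv q8_0   -> quantize the K and V cache to q8_0, as in the log line
./llama-server -m Llama-4-Scout-Q4_K_M.gguf -c 8192 -ngl 24 -ctk q8_0 -ctv q8_0
```

With a CPU+GPU split the main knob is `-ngl`: whatever layers aren't offloaded stay in system RAM, which is why the setup above needs both the 40GB of VRAM and a chunk of the 64GB of RAM.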
I downloaded the FP16 model from here: and it won't even start to load. Are you using the development branch or something?
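Since the reply above says support was only just added, a stale checkout is the likely culprit. A minimal sketch of updating a source build to the latest master, assuming a local git checkout built with CMake (adjust build flags to your setup):

```sh
# Llama 4 support landed only recently, so an older checkout won't load the model.
git pull
cmake -B build
cmake --build build --config Release
```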