Llama 4 Support? #6831
Do you plan to support Llama 4? Please!
I have 16GB of VRAM and a 13900K (32 threads) with 128GB of RAM - would that be enough to run quantized Llama 4 Scout in a CPU+GPU split?

Replies: 2 comments
They did add it, so if you update to the latest build you'll be able to run Llama 4 Scout. I'm currently running a Q4_K_M (8K context) with 40GB of VRAM and 64GB of RAM. It'll be slower, but you should be able to load it with what you've got. Here's my split to give an idea of the sizes:

llama_context: KV self size = 816.00 MiB, K (q8_0): 408.00 MiB, V (q8_0): 408.00 MiB
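For anyone trying to reproduce a split like that, here's a minimal sketch of the kind of llama.cpp invocation involved. The GGUF filename and the `-ngl` value are placeholders to tune for your hardware, not what was actually run; `-ctk`/`-ctv q8_0` is what produces the q8_0 K/V cache shown in the log line above:

```sh
# Hypothetical invocation; the model filename and -ngl count are placeholders.
#   -c 8192          -> 8K context, matching the post above
#   -ngl 24          -> layers offloaded to the GPU; raise it until VRAM fills
#   -ctk/-ctv q8_0   -> quantize the K and V cache to q8_0, as in the log line
./llama-server -m Llama-4-Scout-Q4_K_M.gguf -c 8192 -ngl 24 -ctk q8_0 -ctv q8_0
```

With a CPU+GPU split the main knob is `-ngl`: whatever layers aren't offloaded stay in system RAM, which is why the setup above needs both the 40GB of VRAM and a chunk of the 64GB of RAM.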
I downloaded the FP16 model from here: and it won't even start to load. Are you using the development branch or something?
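Since the reply above says support was only just added, a stale checkout is the likely culprit. A minimal sketch of updating a source build to the latest master, assuming a local git checkout built with CMake (adjust build flags to your setup):

```sh
# Llama 4 support landed only recently, so an older checkout won't load the model.
git pull
cmake -B build
cmake --build build --config Release
```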