
Community contribution: Adding GGUF support for more architectures #33260

Open
@SunMarc

Description

Feature request

Recently, we have added the ability to load gguf files within transformers.

The goal is to let users further train/fine-tune their gguf models.

The workflow is the following:

  1. Load the gguf file in transformers: we dequantize the weights to fp32, then load them as regular PyTorch weights.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF"
filename = "tinyllama-1.1b-chat-v1.0.Q6_K.gguf"

tokenizer = AutoTokenizer.from_pretrained(model_id, gguf_file=filename)
model = AutoModelForCausalLM.from_pretrained(model_id, gguf_file=filename)
  2. Train/fine-tune the model (a minimal sketch is shown after step 3's code below).

  3. Convert the model back to gguf for use in the ggml ecosystem, using the convert_hf_to_gguf script or the gguf-my-repo space if you pushed your model to the Hub:

tokenizer.save_pretrained('directory')
model.save_pretrained('directory')

!python ${path_to_llama_cpp}/convert-hf-to-gguf.py ${directory}
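
Circling back to step 2: here is a minimal fine-tuning sketch that continues from the tokenizer and model loaded above. The dataset, hyperparameters, and the pad-token workaround are illustrative assumptions, not part of the original workflow.

# Step 2 sketch: fine-tune the dequantized model on a small illustrative dataset.
from datasets import load_dataset
from transformers import DataCollatorForLanguageModeling, Trainer, TrainingArguments

# Illustrative dataset choice; drop empty lines so every example has tokens.
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")
dataset = dataset.filter(lambda example: len(example["text"].strip()) > 0)

# Llama-style tokenizers often ship without a pad token.
tokenizer.pad_token = tokenizer.eos_token

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="directory", num_train_epochs=1, per_device_train_batch_size=1),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()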

Let's try to add GGUF support for more architectures! Currently supported architectures are:

  • Llama
  • Mistral
  • Qwen2

It would be great to add support for more architectures, such as

... and many more (feel free to suggest more architectures! The model needs to be integrated in transformers).

Adding this feature would require following the same protocol as in this PR:

  1. Update GGUF_TENSOR_MAPPING and GGUF_CONFIG_MAPPING to map the tensor/config names of the gguf file to the ones used in transformers (see the sketch after this list).
  2. Create a GGUFXXXConverter(XXXConverter) class to convert the gguf tokenizer into a transformers one.
  3. Write tests.
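
As a rough illustration of steps 1 and 2, here is a sketch of the additions that would go into transformers' ggml integration module (src/transformers/integrations/ggml.py). The architecture name "phi3", the gguf-side keys, and the GGUFPhi3Converter name are assumptions for illustration only; the real keys and base classes have to be taken from the gguf file itself and from the existing Llama/Qwen2 entries in that module, whose names (GGUF_TENSOR_MAPPING, GGUF_CONFIG_MAPPING, GGUFTokenizerSkeleton, LlamaConverter) are assumed to be in scope there.

# Step 1 sketch: extend the mapping dicts so gguf tensor/config names resolve to
# transformers names. "phi3" and the keys below are illustrative assumptions.
GGUF_TENSOR_MAPPING["phi3"] = {
    "token_embd": "model.embed_tokens",
    "blk": "model.layers",
    "attn_norm": "input_layernorm",
    "ffn_up": "mlp.up_proj",
    "ffn_down": "mlp.down_proj",
    "output_norm": "model.norm",
    "output.weight": "lm_head.weight",
}

GGUF_CONFIG_MAPPING["phi3"] = {
    "context_length": "max_position_embeddings",
    "block_count": "num_hidden_layers",
    "embedding_length": "hidden_size",
    "attention.head_count": "num_attention_heads",
    "attention.layer_norm_rms_epsilon": "rms_norm_eps",
    "vocab_size": "vocab_size",
}

# Step 2 sketch: a converter that rebuilds a fast tokenizer from the tokenizer
# metadata stored inside the gguf file, following the pattern of the existing
# Llama converter. A real implementation usually also overrides the vocab/
# merges/tokenizer methods depending on the tokenizer type.
class GGUFPhi3Converter(LlamaConverter):
    def __init__(self, tokenizer_dict):
        self.proto = GGUFTokenizerSkeleton(tokenizer_dict)
        self.original_tokenizer = self.proto

The new converter also needs to be registered in the module's gguf-to-converter mapping next to the Llama/Qwen2 ones so AutoTokenizer can pick it up, and step 3 comes down to adding the new architecture to the existing GGUF tests (for example, a generation check on a small quantized checkpoint).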

If you are interested in taking up the challenge, comment below with the name of the architecture you want to integrate and open a PR!

Once you open a PR, feel free to ping @SunMarc @LysandreJik @ArthurZucker for a review!

Motivation

Support for more gguf models

Your contribution

Reviewing PRs and possibly adding support for more models

Metadata
    Labels

    Feature request · Good Second Issue
