@francoishernandez commented on Jan 30, 2025

NLLB support has been unclear since the switch from OpenNMT to Eole.
This PR facilitates the conversion of official HF NLLB models (e.g. https://huggingface.co/facebook/nllb-200-distilled-1.3B).
This should also facilitate re-enabling support of pre-trained seq2seq models such as BART or T5.

Note: this PR also re-enables phi-3x model conversion, which was broken by Pixtral-related changes in #153.

Some notes for future work

The current structure of convert_HF needs to be reviewed to better support the encoder/decoder duality.
#156 was a first major step in making convert_HF more modular, and #153 introduced support for encoder keys, but now we need to meld all of this into a more robust logic.
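To illustrate the encoder/decoder duality mentioned above, one way to handle both key families uniformly is a prefix-remapping table. This is a hypothetical sketch only: the prefixes and target names below are illustrative, not the actual mappings used by Eole's convert_HF.

```python
# Hypothetical prefix map for an encoder-decoder checkpoint (e.g. NLLB-style):
# HF state-dict prefixes on the left, illustrative Eole-side prefixes on the right.
PREFIX_MAP = {
    "model.encoder.": "encoder.",
    "model.decoder.": "decoder.",
    "model.shared.": "embeddings.",
}

def remap_key(hf_key: str) -> str:
    """Rewrite an HF state-dict key to its target name; pass through unmapped keys."""
    for hf_prefix, target_prefix in PREFIX_MAP.items():
        if hf_key.startswith(hf_prefix):
            return target_prefix + hf_key[len(hf_prefix):]
    return hf_key
```

A table-driven approach like this keeps decoder-only and encoder-decoder conversions in one code path: adding a new architecture means extending the map rather than branching the converter.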
Also, we should probably define a better "HF settings deduction waterfall" in build_config_dict, but I'm not sure there is any centralized repository of all the possible values, as they are defined per model.
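The "deduction waterfall" idea could look something like the sketch below: try a list of per-architecture key aliases against the HF config and fall back to a default. The function name and alias lists are assumptions for illustration, not Eole's or Transformers' API.

```python
# Hypothetical sketch of an "HF settings deduction waterfall": resolve_setting
# and the candidate key lists are illustrative names, not real Eole code.

def resolve_setting(hf_config: dict, candidates: list[str], default=None):
    """Return the value of the first candidate key present in the HF config,
    falling back to a default when none match."""
    for key in candidates:
        if key in hf_config:
            return hf_config[key]
    return default

# HF configs name the same hyperparameter differently per architecture,
# e.g. the hidden size may appear as "hidden_size", "d_model", or "n_embd".
nllb_like = {"d_model": 1024, "encoder_layers": 24, "decoder_layers": 24}
hidden_size = resolve_setting(nllb_like, ["hidden_size", "d_model", "n_embd"])
```

Since there is no centralized repository of all the possible config keys, the alias lists would have to be curated per model family as conversions are added.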

@francoishernandez changed the title from [WIP] Enable HF nllb conversion to Enable HF nllb conversion on Jan 31, 2025
@francoishernandez merged commit 0b48259 into main on Mar 10, 2025
4 checks passed