
Inference Client chat completion parameter logit_bias not working #2720

@joetaylor94

Description


Describe the bug

According to the docs, the logit_bias parameter for the chat_completion function expects a "JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100". The type annotation, however, says it should be an Optional[List[float]].

Indeed, if I try to pass in a dictionary, e.g.

completion = client.chat_completion(
    model="meta-llama/Llama-3.3-70B-Instruct",
    messages=messages,
    max_tokens=100,
    logit_bias={100: 4}
)

I get an HTTPError: 422 Client Error: Unprocessable Entity for url. I can pass in a list of floats without error, but I have no idea how a flat list is supposed to encode logit biases without a token-to-bias mapping.
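As a sanity check, the documented mapping shape can be sent straight to the serverless API, bypassing the client's type annotation. A minimal sketch, assuming the standard OpenAI-compatible /v1/chat/completions route and that JSON object keys are stringified token IDs:

import requests

# Assumption: the serverless Inference API exposes an OpenAI-compatible
# chat completions route for this model.
url = (
    "https://api-inference.huggingface.co/models/"
    "meta-llama/Llama-3.3-70B-Instruct/v1/chat/completions"
)
headers = {"Authorization": "Bearer hf_xxx"}
payload = {
    "messages": [{"role": "user", "content": "The capital of France is"}],
    "max_tokens": 20,
    # Documented shape: token ID -> bias; JSON object keys must be strings
    "logit_bias": {"100": 4},
}

response = requests.post(url, headers=headers, json=payload)
print(response.status_code, response.text)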

Reproduction

from huggingface_hub import InferenceClient

client = InferenceClient(api_key="hf_xxx")

messages = [
    {
        "role": "user",
        "content": "The capital of France is"
    }
]

completion = client.chat_completion(
    model="meta-llama/Llama-3.3-70B-Instruct",
    messages=messages,
    max_tokens=20,
    logit_bias={100: 4}
)

print(completion.choices[0].message)
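For reference, the token IDs that the documented mapping expects as keys can be looked up with the model's tokenizer. A minimal sketch, assuming transformers is installed and the (gated) Llama tokenizer is accessible:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.3-70B-Instruct")

# Encode without special tokens to get the IDs of the target text itself;
# these IDs would be the keys of the logit_bias mapping.
token_ids = tokenizer.encode(" Paris", add_special_tokens=False)
print(token_ids)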

Logs

No response

System info

N/A
