Conversation

Contributor

@KeganHollern commented Dec 7, 2024

Summary

  1. Fixed a bug with systemRoleSupported for the OpenAI endpoint.
  2. Now using max_completion_tokens instead of max_tokens when querying the chat_completions endpoint.

The bugfix is necessary because the o1-preview and o1-mini models do not support the "system" message role.
The token configuration change is necessary because o1-preview and o1-mini no longer support the now-deprecated max_tokens field.

Considerations

I added a new endpoint configuration field called useCompletionTokens and defaulted it to true.
This changes the default chat_completions call to use max_completion_tokens instead of max_tokens.
There may be unintended consequences for users, as this could increase their overall token usage per query (since the default would change).
It may be prudent to start with this field set to false and later switch it to true as part of a planned deprecation of max_tokens altogether.
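The deprecation path described above could be handled with a small conditional when building the request body. A minimal sketch in TypeScript, assuming a hypothetical `EndpointConfig` shape and `tokenLimitField` helper (names are illustrative, not the actual chat-ui code):

```typescript
// Hypothetical sketch: pick the token-limit field for the request body
// based on a `useCompletionTokens` endpoint flag.
interface EndpointConfig {
  useCompletionTokens: boolean;
}

function tokenLimitField(
  config: EndpointConfig,
  maxNewTokens: number
): Record<string, number> {
  // o1-preview and o1-mini reject the deprecated `max_tokens` field
  // and expect `max_completion_tokens` instead.
  return config.useCompletionTokens
    ? { max_completion_tokens: maxNewTokens }
    : { max_tokens: maxNewTokens };
}
```

With the flag defaulting to false, existing configurations would keep sending max_tokens unchanged.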

Another consideration is that my change for system messages leaves a "role": "user", "content": "" entry. We may want to drop the system message entirely when its content is blank and system messages are not supported.
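The blank-message concern could be addressed at the same point where system messages are downgraded. A minimal TypeScript sketch, assuming a hypothetical `prepareMessages` helper (illustrative, not the actual chat-ui implementation):

```typescript
// Hypothetical sketch: when the endpoint does not support the "system"
// role, drop blank system messages and downgrade the rest to "user".
interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

function prepareMessages(
  messages: Message[],
  systemRoleSupported: boolean
): Message[] {
  if (systemRoleSupported) return messages;
  return messages
    // Drop blank system messages instead of emitting an empty user turn.
    .filter((m) => m.role !== "system" || m.content.trim() !== "")
    // Downgrade remaining system messages to user messages.
    .map((m): Message => (m.role === "system" ? { ...m, role: "user" } : m));
}
```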

Reference

For those wanting to introduce o1 reasoning models based on this PR (future readers), here is an example implementation of o1-mini:

{
  "name": "o1-mini",
  "displayName": "o1 mini test",
  "multimodal": false,
  "description": "ChatGPT o1-mini",
  "systemRoleSupported": false,
  "parameters": {
    "max_new_tokens": 2048
  },
  "endpoints": [{
    "type": "openai"
  }]
}

@KeganHollern marked this pull request as ready for review December 7, 2024 02:59
@evalstate
Contributor

This is cool - I had a crack at this a while ago (#1540 (comment)), at which point streaming wasn't supported, so I had to do some extra messing around (I capture token usage too).

@nsarrazin added the enhancement (New feature or request), back (This issue is related to the Svelte backend or the DB), and models (This issue is related to model performance/reliability) labels Dec 9, 2024
@nsarrazin
Contributor

Hi! Thanks for the contrib. I made a couple changes:

  • Added an example in the doc
  • Made the parameter false by default (as you said, probably better not to change the default behaviour for now)
  • Fixed the linting

Overall this looks great, tried it locally and it seems to work well. Will merge once the checks are done. 🚀

@nsarrazin merged commit 9542b2c into huggingface:main Dec 9, 2024
4 checks passed
maksym-work pushed a commit to siilats/chat-ui that referenced this pull request Jul 2, 2025
* fix(openai): systemRoleSupported model configuration for openai endpoints

* feat(openai): max_completion_tokens now used over max_tokens for chat_completions endpoint.

* fix: lint

* feat(docs): add o1 example

* fix: make parameter default to false and fix type checks

---------

Co-authored-by: Nathan Sarrazin <[email protected]>
Matsenas pushed a commit to Matsenas/chat-ui that referenced this pull request Jul 4, 2025
* fix(openai): systemRoleSupported model configuration for openai endpoints

* feat(openai): max_completion_tokens now used over max_tokens for chat_completions endpoint.

* fix: lint

* feat(docs): add o1 example

* fix: make parameter default to false and fix type checks

---------

Co-authored-by: Nathan Sarrazin <[email protected]>
gary149 pushed a commit to gary149/chat-ui that referenced this pull request Aug 29, 2025
* fix(openai): systemRoleSupported model configuration for openai endpoints

* feat(openai): max_completion_tokens now used over max_tokens for chat_completions endpoint.

* fix: lint

* feat(docs): add o1 example

* fix: make parameter default to false and fix type checks

---------

Co-authored-by: Nathan Sarrazin <[email protected]>