Skip to content

Conversation

Originalimoc
Copy link

@Originalimoc Originalimoc commented Aug 1, 2025

Adds logic to prepend '<think>' to the first streamed chunk and all final generations if the chat template ends with 'think'. Adjusts token and offset accounting to remain consistent when the tag is injected.

Is your pull request related to a problem? Please describe.
Close issue #361, now you can happily force thinking on all supported models.

Adds logic to prepend '<think>' to the first streamed chunk and all final generations if the chat template ends with 'think'. Adjusts token and offset accounting to remain consistent when the tag is injected.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant