[feat]: support logit_bias #5354
Pull Request Overview
A concise description of the purpose of the PR, followed by summarized bullets of changes
- Add support for `logit_bias` by integrating a new logits processor into sampling parameters
- Remove the old validator blocking `logit_bias`
- Define `LogitBiasLogitsProcessor` to apply per-token biases at generation time
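The core idea behind a per-token bias processor can be sketched without any framework. The snippet below is only an illustration, not the PR's `LogitBiasLogitsProcessor`: the function name `apply_logit_bias` is hypothetical, and it works on a plain Python list rather than the tensors a real processor would presumably receive.

```python
from typing import Dict, List


def apply_logit_bias(logits: List[float], logit_bias: Dict[int, float]) -> List[float]:
    """Return a copy of `logits` with each bias added to its token's score.

    Hypothetical helper for illustration; not part of TensorRT-LLM.
    """
    vocab_size = len(logits)
    biased = list(logits)  # copy so the caller's list is left untouched
    for token_id, bias in logit_bias.items():
        if not 0 <= token_id < vocab_size:
            raise ValueError(
                f"token id {token_id} is outside the vocabulary (size {vocab_size})"
            )
        biased[token_id] += bias
    return biased


# Toy vocabulary of four tokens: boost token 1, strongly suppress token 3.
logits = [0.5, 1.0, -0.25, 2.0]
biased = apply_logit_bias(logits, {1: 5.0, 3: -100.0})
print(biased)  # [0.5, 6.0, -0.25, -98.0]
```

Rejecting out-of-range token IDs up front is a design choice made here for clarity; whether the PR validates IDs at this point is not stated in the overview.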
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
File | Description |
---|---|
tensorrt_llm/serve/openai_protocol.py | Integrated logit_bias into both completion and chat sampling |
tensorrt_llm/sampling_params.py | Added LogitBiasLogitsProcessor class and updated imports to include Dict |
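In the OpenAI API, `logit_bias` arrives as a JSON object whose keys are token IDs serialized as strings, with bias values documented as lying between -100 and 100. A protocol layer therefore typically has to convert those keys to integers before handing them to a processor. The sketch below shows one way that conversion might look; `parse_logit_bias` is a hypothetical name, and enforcing the -100 to 100 range is an assumption, since this thread does not say what the PR checks.

```python
from typing import Dict, Optional


def parse_logit_bias(raw: Optional[Dict[str, float]]) -> Optional[Dict[int, float]]:
    """Convert OpenAI-style string keys to integer token IDs.

    Hypothetical helper for illustration; the range check is an assumption.
    """
    if raw is None:
        return None
    parsed: Dict[int, float] = {}
    for key, bias in raw.items():
        try:
            token_id = int(key)
        except ValueError as exc:
            raise ValueError(
                f"logit_bias key {key!r} is not an integer token id"
            ) from exc
        if token_id < 0:
            raise ValueError(f"logit_bias token id {token_id} must be non-negative")
        if not -100.0 <= bias <= 100.0:
            raise ValueError(
                f"logit_bias value {bias} for token {token_id} is outside [-100, 100]"
            )
        parsed[token_id] = float(bias)
    return parsed


parsed = parse_logit_bias({"1734": 5, "42": -100})
print(parsed)  # {1734: 5.0, 42: -100.0}
```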
Thanks for the contribution! I left a few comments about `LogitBiasLogitsProcessor`'s implementation. We also need to add tests; vLLM has tests for chat and for completion that could serve as a reference. Ping me if you need any help.
@xq25478 Can you please sign off your commits following the steps here? And I think it is acceptable to put the logit processor implementation in
Add @netanel-haber as Nave suggested.
Test code has been added.
/bot run
PR_Github #11549 [ run ] triggered by Bot
PR_Github #11549 [ run ] completed with state
@xq25478 would you mind rebasing this? (It looks like it can't be auto-rebased with main.) Thanks!
/bot run
PR_Github #12710 [ run ] completed with state
/bot run
PR_Github #12761 [ run ] triggered by Bot
PR_Github #12761 [ run ] completed with state
/bot run
PR_Github #12817 [ run ] triggered by Bot
PR_Github #12817 [ run ] completed with state
/bot run
PR_Github #12834 [ run ] triggered by Bot
PR_Github #12834 [ run ] completed with state
@xq25478
/bot run
PR_Github #12898 [ run ] triggered by Bot
PR_Github #12898 [ run ] completed with state
done
/bot run
PR_Github #12961 [ run ] triggered by Bot
PR_Github #12961 [ run ] completed with state
/bot skip --comment "Previous CI passed"
PR_Github #12980 [ skip ] triggered by Bot
PR_Github #12980 [ skip ] completed with state
Signed-off-by: xq25478 <[email protected]> Signed-off-by: Venky Ganesh <[email protected]> Signed-off-by: hexiao.xq <[email protected]> Co-authored-by: Venky Ganesh <[email protected]> Co-authored-by: hexiao.xq <[email protected]> Co-authored-by: Pengyun Lin <[email protected]> Signed-off-by: Shreyas Misra <[email protected]>
Signed-off-by: xq25478 <[email protected]> Signed-off-by: Venky Ganesh <[email protected]> Signed-off-by: hexiao.xq <[email protected]> Co-authored-by: Venky Ganesh <[email protected]> Co-authored-by: hexiao.xq <[email protected]> Co-authored-by: Pengyun Lin <[email protected]> Signed-off-by: Ransiki Zhang <[email protected]>
Signed-off-by: xq25478 <[email protected]> Signed-off-by: Venky Ganesh <[email protected]> Signed-off-by: hexiao.xq <[email protected]> Co-authored-by: Venky Ganesh <[email protected]> Co-authored-by: hexiao.xq <[email protected]> Co-authored-by: Pengyun Lin <[email protected]> Signed-off-by: Lanyu Liao <[email protected]>
feat(openai protocol):support logitbias
Summary by CodeRabbit
- `logit_bias` parameter in both chat and completion APIs, allowing users to influence token generation by biasing specific tokens.
- `logit_bias` in chat completions.
- `logit_bias` inputs in chat and completion APIs.
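To make "influence token generation by biasing specific tokens" concrete, the following sketch shows how adding a bias to logits shifts the resulting softmax probabilities. It uses a toy three-token vocabulary and a hand-written `softmax`; none of these names come from the PR, and real sampling also involves temperature, top-k/top-p, and other settings that are omitted here.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Three equally likely tokens before any bias is applied.
logits = [1.0, 1.0, 1.0]
before = softmax(logits)

# Bias of +2 on token 0 and -100 on token 2 (the extremes of the OpenAI range
# act as a near-ban or near-forced selection).
after = softmax([logits[0] + 2.0, logits[1], logits[2] - 100.0])

print("before:", before)
print("after: ", after)
```

A moderate positive bias raises a token's share of the probability mass without making it certain, while a bias of -100 drives the token's probability to effectively zero.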