refactor: Turn GPUModelRunner.inputs_embeds to a CpuGpuBuffer #24345
Signed-off-by: Andrew Sansom <[email protected]>
Code Review
This pull request refactors GPUModelRunner.inputs_embeds to use the CpuGpuBuffer class. This is achieved by extending CpuGpuBuffer to optionally disable the creation of a NumPy view, which is necessary for bfloat16 tensors. The changes are consistent and correctly adapt the usage of inputs_embeds throughout GPUModelRunner. My main feedback is on improving the robustness of the CpuGpuBuffer class to prevent potential runtime errors.
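To illustrate the idea described above, here is a minimal sketch of a paired CPU/device buffer whose NumPy view can be switched off. It is not the actual CpuGpuBuffer from this PR: the class name PairedBuffer, the with_numpy parameter, and the explicit bfloat16 guard are all hypothetical names chosen for illustration. The sketch relies only on the fact that torch.Tensor.numpy() does not support bfloat16, so a buffer holding bfloat16 embeddings has to skip the view.

```python
import torch


class PairedBuffer:
    """Hypothetical stand-in for a CPU/device buffer pair (not vLLM's class)."""

    def __init__(self, *size, dtype, device, pin_memory=False, with_numpy=True):
        # Staging tensor on the host and its counterpart on the target device.
        self.cpu = torch.zeros(*size, dtype=dtype, device="cpu",
                               pin_memory=pin_memory)
        self.gpu = torch.zeros_like(self.cpu, device=device)
        self.np = None
        if with_numpy:
            # Assumed guard: fail early with a clear message instead of
            # letting Tensor.numpy() raise on an unsupported dtype.
            if dtype == torch.bfloat16:
                raise ValueError(
                    "bfloat16 has no NumPy equivalent; pass with_numpy=False")
            # The NumPy array shares memory with the CPU tensor.
            self.np = self.cpu.numpy()

    def copy_to_gpu(self, n=None):
        # Copy the whole buffer, or only the first n rows, to the device.
        if n is None:
            return self.gpu.copy_(self.cpu, non_blocking=True)
        return self.gpu[:n].copy_(self.cpu[:n], non_blocking=True)


# bfloat16 embeddings: the NumPy view must be disabled.
embeds = PairedBuffer(4, 8, dtype=torch.bfloat16, device="cpu",
                      with_numpy=False)
embeds.cpu[:2].fill_(1.0)
embeds.copy_to_gpu(2)

# Integer buffer: the NumPy view is available and aliases the CPU tensor.
ids = PairedBuffer(4, dtype=torch.int32, device="cpu")
ids.np[0] = 7
```

The example uses device="cpu" so that it runs without a GPU; on a CUDA machine the same code would take device="cuda" and typically pin_memory=True. Raising early on an unsupported dtype is one possible way to address the robustness concern raised in the review, not necessarily the approach the PR adopted.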
…roject#24345) Signed-off-by: Andrew Sansom <[email protected]> Signed-off-by: JasonZhu1313 <[email protected]>
…roject#24345) Signed-off-by: Andrew Sansom <[email protected]>
…roject#24345) Signed-off-by: Andrew Sansom <[email protected]> Signed-off-by: LopezCastroRoberto <[email protected]>
…roject#24345) Signed-off-by: Andrew Sansom <[email protected]> Signed-off-by: rogeryoungh <[email protected]>
…roject#24345) Signed-off-by: Andrew Sansom <[email protected]> Signed-off-by: bruceszchen <[email protected]>
Purpose
As requested by @DarkLight1337 in #24278 (comment), this PR refactors GPUModelRunner.inputs_embeds to use CpuGpuBuffer to reduce the total diff in #24278.
Test Plan
All relevant unit tests pass locally for me. CI is still pending, but it should be fine. This change works in the full context of #24278.
Test Result
Pending CI.