Delete deprecated stuff #38838
Changes from all commits
```diff
@@ -15,7 +15,7 @@
 import torch
 
-from ..cache_utils import DynamicCache, HybridCache, StaticCache
+from ..cache_utils import DynamicCache, EncoderDecoderCache, HybridCache, StaticCache
 from ..generation.configuration_utils import GenerationConfig
 from ..masking_utils import (
     ALL_MASK_ATTENTION_FUNCTIONS,
```

```diff
@@ -548,14 +548,15 @@ def __init__(self, model, max_static_cache_length, batch_size):
         self.lm_head = model.lm_head
         self.config = model.config
 
-        # Initialize static cache
+        # Initialize static cache for decoder and DynamicCache for encoder
         self.static_cache = StaticCache(
             config=self.config,
             max_batch_size=batch_size,
             max_cache_len=max_static_cache_length,
             device="cpu",
             dtype=torch.float32,
         )
+        self.cache = EncoderDecoderCache(self.static_cache, DynamicCache())
 
         # Register cache buffers to make them exportable
         for i in range(len(self.static_cache.key_cache)):
```

Review thread on the `self.cache = EncoderDecoderCache(...)` line (some comments were truncated in the page capture; gaps are marked with […]):

> **Reviewer:** This export recipe only uses the decoder part, so it should not need this change, no?
>
> **Author:** The model is encoder-decoder and needs a cache for both. Previously, we would wrap the static cache in […]. Export doesn't really care about the encoder cache, indeed; it's needed for the generation code to work.
>
> **Reviewer:** So the […]
>
> **Author:** Yes, because it does cross attention and still would be looking for […]

```diff
@@ -567,7 +568,7 @@ def forward(self, decoder_input_ids, encoder_hidden_states, cache_position):
         outputs = self.decoder(
             input_ids=decoder_input_ids,
             encoder_hidden_states=encoder_hidden_states,
-            past_key_values=self.static_cache,
+            past_key_values=self.cache,
             use_cache=True,
             cache_position=cache_position,
         )
```
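The change above stops passing a bare `StaticCache` and instead wraps it in an `EncoderDecoderCache` that pairs the decoder's fixed-size self-attention cache with a growable cross-attention cache. A minimal stand-in sketch of that layout, using hypothetical classes rather than the real `transformers.cache_utils` implementations (which operate on torch tensors):

```python
# Hypothetical stand-in classes mirroring the cache layout in the diff.
# The real StaticCache / DynamicCache / EncoderDecoderCache live in
# transformers.cache_utils and store torch tensors per layer.

class StaticCacheSketch:
    """Decoder self-attention KV: preallocated slots, written in place."""
    def __init__(self, max_cache_len):
        # Fixed-size buffer: shapes never change, which is what makes
        # the decoder traceable by torch.export.
        self.keys = [None] * max_cache_len

    def update(self, position, kv):
        self.keys[position] = kv  # overwrite the slot for this step
        return self.keys

class DynamicCacheSketch:
    """Encoder cross-attention KV: appended once, then reused each step."""
    def __init__(self):
        self.keys = []

    def update(self, kv):
        self.keys.append(kv)  # grows only on first use
        return self.keys

class EncoderDecoderCacheSketch:
    """Pairs the two caches so generation finds both under one object."""
    def __init__(self, self_attention_cache, cross_attention_cache):
        self.self_attention_cache = self_attention_cache
        self.cross_attention_cache = cross_attention_cache

# Mirrors `EncoderDecoderCache(self.static_cache, DynamicCache())` above.
cache = EncoderDecoderCacheSketch(StaticCacheSketch(max_cache_len=4),
                                  DynamicCacheSketch())
cache.cross_attention_cache.update("encoder_kv")  # once, from encoder output
cache.self_attention_cache.update(0, "step0_kv")  # per decoded token
cache.self_attention_cache.update(1, "step1_kv")
```

This illustrates the point made in the review thread: export only traces the static (decoder) half, but generation still performs cross attention against the encoder output, so the wrapper is needed for the cross-attention cache to be found.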