BLIPs clean-up #35560
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Down to have this still, but up to you since it's been a while. Sorry for my late review.
+1, we should have this so we can remove deprecated code. I will resolve the conflicts and see if anything else needs to be done.
run-slow: blip_2, instructblip, instructblipvideo
This comment contains run-slow, running the specified jobs: models: ['models/blip_2', 'models/instructblip', 'models/instructblipvideo']
# Add imports via `define_import_structure` after #35167, as we removed the explicit imports in `__init__.py`
from transformers.utils.import_utils import define_import_structure

reversed_structure = {}
new_imported_modules_from_import_structure = define_import_structure("src/transformers/__init__.py")
for mapping in new_imported_modules_from_import_structure.values():
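For readers following along, here is one way the truncated loop might continue. This completion is purely illustrative: it assumes the values of the returned import structure map module paths to the sets of names they export, which is not shown in the diff above.

```python
from transformers.utils.import_utils import define_import_structure

# Illustrative completion (assumption, not the PR's actual code): build a reverse
# lookup from exported object name to the module that provides it.
reversed_structure = {}
new_imported_modules_from_import_structure = define_import_structure("src/transformers/__init__.py")
for mapping in new_imported_modules_from_import_structure.values():
    for module_path, exported_names in mapping.items():
        for name in exported_names:
            reversed_structure[name] = module_path
```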
I don't know why this started failing only a few months after the `__init__` refactor; probably no one updated tiny_model_summary.json after that.
Without this change the script fails to fetch the tests that are modified in tiny_model_summary.json.
cc @ydshieh !
Yeah, the tiny model creation script is failing (on GitHub), and it has been some time since anyone (i.e. me) updated the relevant stuff in the codebase.
Will try to work on that.
run-slow: blip_2, instructblip, instructblipvideo
This comment contains run-slow, running the specified jobs: models: ['models/blip_2', 'models/instructblip', 'models/instructblipvideo']
run-slow: blip_2, instructblip, instructblipvideo
This comment contains run-slow, running the specified jobs: models: ['models/blip_2', 'models/instructblip', 'models/instructblipvideo']
# Build a boolean mask marking the image-placeholder positions in the sequence.
if input_ids is None:
    # No token ids available: recover the placeholder positions by comparing each
    # position's embedding with the embedding of the image token id.
    special_image_mask = inputs_embeds == self.get_input_embeddings()(
        torch.tensor(self.config.image_token_id, dtype=torch.long, device=inputs_embeds.device)
    )
    special_image_mask = special_image_mask.all(-1)
else:
    special_image_mask = input_ids == self.config.image_token_id

# Expand the mask to the hidden dimension, then scatter the projected image features
# (language_model_inputs) into the placeholder positions of the text embeddings.
special_image_mask = special_image_mask.unsqueeze(-1).expand_as(inputs_embeds).to(language_model_inputs.device)
language_model_inputs = language_model_inputs.to(inputs_embeds.device, inputs_embeds.dtype)
inputs_embeds = inputs_embeds.to(language_model_inputs.device).masked_scatter(
    special_image_mask, language_model_inputs
)
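For reference, a tiny self-contained example of the masked_scatter pattern used above, with toy shapes rather than the model's real dimensions:

```python
import torch

# Toy demonstration of masked_scatter: positions where the mask is True are filled,
# in order, with the elements of the source tensor (here, fake "image" features).
inputs_embeds = torch.zeros(1, 4, 2)                 # (batch, seq_len, hidden)
language_model_inputs = torch.ones(1, 2, 2)          # features for two image positions
special_image_mask = torch.tensor([[True, True, False, False]])
special_image_mask = special_image_mask.unsqueeze(-1).expand_as(inputs_embeds)

inputs_embeds = inputs_embeds.masked_scatter(special_image_mask, language_model_inputs)
print(inputs_embeds)
# Positions 0 and 1 now hold the "image" features; positions 2 and 3 keep the text embeddings.
```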
should we have a small function for this?
Yeah, we can isolate the special_image_mask computation step in a tiny function; it is a recurring pattern in all VLMs after the last big PR I merged for supporting inputs embeds.
Taking it as a note for a subsequent PR.
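A minimal sketch of what such a helper could look like; the name `get_image_token_mask` and its signature are assumptions for illustration, not part of this PR:

```python
import torch

def get_image_token_mask(input_ids, inputs_embeds, embed_tokens, image_token_id):
    """Hypothetical helper isolating the recurring mask-computation step above."""
    if input_ids is None:
        # Recover placeholder positions by comparing embeddings with the image-token embedding.
        image_token_embed = embed_tokens(
            torch.tensor(image_token_id, dtype=torch.long, device=inputs_embeds.device)
        )
        mask = (inputs_embeds == image_token_embed).all(-1)
    else:
        mask = input_ids == image_token_id
    # Expand to the hidden dimension so the mask can be used with masked_scatter.
    return mask.unsqueeze(-1).expand_as(inputs_embeds)
```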
[For maintainers] Suggested jobs to run (before merge)
run-slow: blip_2, instructblip, instructblipvideo
LGTM otherwise, but happy if we isolate this in a function along with this PR!
Let's merge it then; I will make a PR to isolate it.
@@ -626,7 +626,7 @@
     "model_classes": [
         "Blip2ForConditionalGeneration"
     ],
-    "sha": "35e1ef43da3554af62eb29a7b3dbbef3f3bef48e"
+    "sha": "d0de11fd1f8ca481231c07ee0934924be96cb281"
oops ....
What does this PR do?
Continuation of #34502, which I separated out because BLIP wasn't ready to be merged in the v4.48 release. The main reason is the `Blip2Retrieval` model, whose hub configs weren't updated.

Main changes:
- Remove the deprecated if/else checks and leave only the new logic with expanded inputs.
- Update the `Blip2Retrieval` model class to accept expanded inputs (open to feedback). Currently it expects non-expanded inputs to do ITM, so we just crop the inputs, removing the extra image placeholders (a rough sketch of the idea is below).
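To illustrate the cropping idea, here is a sketch under the assumption that image placeholders share a single `image_token_id` and that every row carries the same number of them; it is not the PR's actual implementation:

```python
import torch

def crop_extra_image_placeholders(input_ids: torch.Tensor, image_token_id: int) -> torch.Tensor:
    """Collapse a run of expanded image placeholders back to a single one per row."""
    cropped_rows = []
    for row in input_ids:
        is_image = row == image_token_id
        keep = ~is_image
        image_positions = torch.nonzero(is_image, as_tuple=True)[0]
        if len(image_positions) > 0:
            keep[image_positions[0]] = True  # keep only the first placeholder
        cropped_rows.append(row[keep])
    return torch.stack(cropped_rows)  # equal row lengths assumed

# Example: four expanded placeholders (999) collapse back to one.
ids = torch.tensor([[101, 5, 999, 999, 999, 999, 2023, 102]])
print(crop_extra_image_placeholders(ids, image_token_id=999))
# tensor([[ 101,    5,  999, 2023,  102]])
```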