Allow arbitrary-sized images by dynamic masking: upstream changes from Swin-Transformer-Object-Detection / SOLQ #13

@vadimkantorov

Hi!

To combine the Swin Transformer backbone with the Deformable DETR detector, SOLQ made some changes to swin_transformer.py so that the padding mask is computed dynamically and input images can be of arbitrary size (I think this is supported for relative positional encoding only).

Your colleagues made similar edits in https://github.com/SwinTransformer/Swin-Transformer-Object-Detection/blob/master/mmdet/models/backbones/swin_transformer.py
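
To illustrate what I mean by dynamic masking, here is a rough pure-Python sketch of the idea. It is not the actual SOLQ or Swin-Transformer-Object-Detection code. The helper names (`padded_size`, `region_ids`, `window_masks`) and the values `window_size=7`, `shift_size=3` are made up for this example, and a real backbone would do the same thing with tensor ops at forward time instead of nested lists. The point is that the padded size and the shifted-window mask are derived from the actual feature-map size of each input rather than from a fixed image size set at construction:

```python
import math


def padded_size(h, w, window_size):
    """Round a feature-map size up to the next multiple of window_size."""
    hp = math.ceil(h / window_size) * window_size
    wp = math.ceil(w / window_size) * window_size
    return hp, wp


def region_ids(hp, wp, window_size, shift_size):
    """Label every position of the padded map with a region id in 0..8.

    After the cyclic shift, positions with different ids come from
    different parts of the image and must not attend to each other.
    """
    def band(i, n):
        if i < n - window_size:
            return 0
        if i < n - shift_size:
            return 1
        return 2

    return [[3 * band(y, hp) + band(x, wp) for x in range(wp)]
            for y in range(hp)]


def window_masks(ids, window_size):
    """For each window, build a boolean matrix: True where attention is allowed."""
    hp, wp = len(ids), len(ids[0])
    masks = []
    for y0 in range(0, hp, window_size):
        for x0 in range(0, wp, window_size):
            flat = [ids[y][x]
                    for y in range(y0, y0 + window_size)
                    for x in range(x0, x0 + window_size)]
            masks.append([[a == b for b in flat] for a in flat])
    return masks


# A 10x13 feature map is not divisible by 7, so it is padded first,
# and the mask is then built for the padded size.
hp, wp = padded_size(10, 13, window_size=7)
ids = region_ids(hp, wp, window_size=7, shift_size=3)
masks = window_masks(ids, window_size=7)
```

Because nothing here depends on a preset input resolution, the same code path works for any image size, which is what makes the backbone usable inside a detector.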

If this interests you, maybe you could import those edits from SOLQ / Swin-Transformer-Object-Detection or implement similar ones. This would make it simpler to experiment with SimMIM checkpoints / backbone code in an object detection context and to make sure that checkpoints load correctly.
