ViTDet object detection

### 🚀 The feature

[ViTDet](https://arxiv.org/abs/2203.16527) achieves strong object detection results on COCO. Since ViT is already implemented in torchvision, adding ViTDet seems relatively straightforward.

### Motivation, pitch

The best-performing object detection network in torchvision is currently Faster R-CNN with a ResNet-50 FPN backbone ([46.7 mAP](https://pytorch.org/vision/stable/models/generated/torchvision.models.detection.fasterrcnn_resnet50_fpn_v2.html#torchvision.models.detection.FasterRCNN_ResNet50_FPN_V2_Weights)). [ViTDet](https://arxiv.org/pdf/2203.16527.pdf) reports an mAP of 51.6 with a ViT-B backbone, 55.6 with ViT-L, and 56.7 with ViT-H. Similarly strong results are reported for instance segmentation.
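
To make the gap concrete, here is a small sketch that computes the absolute mAP gain of each ViTDet variant over the current torchvision baseline. It only uses the figures quoted above; the names `BASELINE_MAP`, `VITDET_MAP`, and `map_gains` are made up for illustration. The numbers come from different sources and training recipes, so the gains should be read as indicative rather than as a controlled comparison.

```python
# mAP figures as quoted in this issue (COCO); nothing is re-measured here.
BASELINE_MAP = 46.7  # torchvision Faster R-CNN, ResNet-50 FPN v2 weights
VITDET_MAP = {"ViT-B": 51.6, "ViT-L": 55.6, "ViT-H": 56.7}


def map_gains(baseline, candidates):
    """Return the absolute mAP gain of each candidate over the baseline.

    Results are rounded to one decimal place to hide floating-point noise.
    """
    return {name: round(score - baseline, 1) for name, score in candidates.items()}


gains = map_gains(BASELINE_MAP, VITDET_MAP)
for name, gain in gains.items():
    print(f"{name}: +{gain:.1f} mAP over the baseline")
```

Even the smallest variant, ViT-B, is roughly 5 mAP points above the baseline on these quoted numbers.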

### Alternatives

[Detectron2](https://github.com/facebookresearch/detectron2/tree/main/projects/ViTDet) already implements ViTDet. The maintainers could decide that torchvision will not provide its own implementation and instead redirect users who want ViTDet to Detectron2.

### Additional context

Implementing ViTDet would open the door to related models such as [EVA-02](https://arxiv.org/abs/2303.11331), which reports even better results than ViTDet.

I previously implemented [RetinaNet](https://github.com/pytorch/vision/pull/1697) for torchvision (later merged in https://github.com/pytorch/vision/pull/2784). I might be interested in implementing ViTDet, but I would first like to know whether there is interest from the maintainers.


ViTDet object detection #7630
