Ops to convert masks to boxes #3960

Closed
@oke-aditya

🚀 Feature

A simple torchvision.ops function to convert segmentation masks to bounding boxes.

Motivation

This has a few use cases:

  1. It makes it easier to use semantic segmentation datasets for object detection, since the pipeline becomes simpler.
    Bounding boxes are represented as xyxy by convention in torchvision.ops,
    so the masks should probably be converted to xyxy format.

  2. It makes it easier to compare the performance of a segmentation model against a detection model.
    Say the detection model performs well on a segmentation dataset. Then it would be better to go ahead with the detection model, since it is faster in real-time use cases, rather than train a segmentation model.

New Pipeline

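from torch.utils.data import Dataset
from torchvision.ops import masks_to_boxes, box_convert

class SegmentationToDetectionDataset(Dataset):
    def __init__(self, masks):
        # masks: sequence of (N, H, W) mask tensors, one entry per image (assumed layout)
        self.masks = masks

    def __len__(self):
        return len(self.masks)

    def __getitem__(self, idx):
        segmentation_masks = self.masks[idx]
        boxes_xyxy = masks_to_boxes(segmentation_masks)

        # Convert to COCO format (xywh) if needed.
        boxes_xywh = box_convert(boxes_xyxy, in_fmt="xyxy", out_fmt="xywh")
        return boxes_xywh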

Pitch

Port the masks_to_boxes function from mDeTR.

masks_to_boxes was also used in DeTR.
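
To make the pitch concrete, here is a rough sketch of what such an op could compute. It is not the code from either repository: the name `masks_to_boxes_sketch`, the per-mask loop, and the choice to return an all-zero box for an empty mask are assumptions made for illustration. Box corners are taken as the minimum and maximum indices of the nonzero pixels, so the maximum coordinates are inclusive.

```python
import torch


def masks_to_boxes_sketch(masks: torch.Tensor) -> torch.Tensor:
    """Return (N, 4) xyxy boxes for (N, H, W) masks; nonzero pixels count as foreground."""
    if masks.numel() == 0:
        return torch.zeros((0, 4), dtype=torch.float32, device=masks.device)

    boxes = torch.zeros((masks.shape[0], 4), dtype=torch.float32, device=masks.device)
    for i, mask in enumerate(masks):
        # Row (y) and column (x) indices of every foreground pixel in this mask.
        ys, xs = torch.where(mask != 0)
        if ys.numel() == 0:
            continue  # empty mask: keep the all-zero box (a choice made in this sketch)
        boxes[i] = torch.stack([xs.min(), ys.min(), xs.max(), ys.max()]).float()
    return boxes


masks = torch.zeros((2, 5, 6))
masks[0, 1:3, 2:5] = 1  # rows 1..2, columns 2..4
masks[1, 4, 0] = 1      # a single pixel at row 4, column 0
boxes = masks_to_boxes_sketch(masks)
print(boxes.tolist())  # [[2.0, 1.0, 4.0, 2.0], [0.0, 4.0, 0.0, 4.0]]
```

A vectorized version would avoid the Python loop, but the loop keeps the intent easy to read.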

Alternatives

The above function assumes a floating-point tensor of masks with shape (N, H, W), i.e. (num_masks, height, width).
IIRC, we used a boolean tensor in draw_segmentation_masks (after Nicolas refactored it). So perhaps we should use a boolean tensor here as well? That said, I see no particular reason for this util to be valid only for instance segmentation.
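
The dtype question matters because a float mask leaves "foreground" open to interpretation, while a boolean mask does not. A small illustration follows; the 0.5 threshold is a hypothetical choice and not something proposed in this issue.

```python
import torch

# A soft (float) mask of shape (N, H, W) = (1, 2, 2).
float_masks = torch.tensor([[[0.0, 0.2],
                             [0.9, 1.0]]])

# Interpretation 1: any nonzero value is foreground.
nonzero_fg = float_masks != 0

# Interpretation 2: threshold first (0.5 is an assumed, illustrative value).
bool_masks = float_masks > 0.5

print(int(nonzero_fg.sum()))  # 3 foreground pixels
print(int(bool_masks.sum()))  # 2 foreground pixels

# With a boolean mask, the occupied rows and columns follow directly.
rows = bool_masks.any(dim=2)  # (N, H): rows containing foreground
cols = bool_masks.any(dim=1)  # (N, W): columns containing foreground
print(rows.tolist())  # [[False, True]]
print(cols.tolist())  # [[True, True]]
```

Accepting only boolean masks would push the thresholding decision onto the caller, which keeps the op itself unambiguous.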

Additional context

I can port this. We will probably need a few tests to make sure it works correctly,
especially a test for float16 overflow.
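
As a sketch of what a float16 test might guard against: float16 tops out at 65504 and cannot represent every integer above 2048, so both a large fill value and wide-image coordinates can go wrong if the arithmetic stays in float16. The sentinel value and the test name below are assumptions for illustration, not taken from any existing implementation.

```python
import torch

fp16_max = torch.finfo(torch.float16).max
print(fp16_max)  # 65504.0

# A hypothetical "very large" fill value would not fit in float16.
sentinel = 1e8
sentinel_overflows = sentinel > fp16_max
print(sentinel_overflows)  # True

# Pixel coordinates above 2048 are not all exactly representable in float16.
coord = torch.tensor(2049.0).to(torch.float16)
print(coord.item())  # 2048.0


def test_float16_mask_keeps_exact_coordinates():
    # Hypothetical test: derive coordinates from integer indices, not float16 math.
    masks = torch.zeros((1, 4, 3000), dtype=torch.float16)
    masks[0, 1, 2049] = 1
    ys, xs = torch.where(masks[0].bool())
    assert xs.max().item() == 2049
    assert ys.max().item() == 1


test_float16_mask_keeps_exact_coordinates()
```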

cc @datumbox @NicolasHug
