Add fill parameter to utils.draw_bounding_boxes. #3280

Closed
@oke-aditya

🚀 Feature

A fill parameter allows drawing semi-transparent boxes. This is particularly useful for the Mask R-CNN model.
It would complete the utils for object detection and instance segmentation (at least with rectangular boxes).

Motivation

In instance segmentation models, we care about masks, not just the bounding boxes. A fill parameter lets us shade the box region in a semi-transparent way. The parameter is optional, so it does not affect performance when it is not used.

Pitch

Add a parameter fill as follows:

fill: Optional[List[Union[str, Tuple[int, int, int]]]] = None,
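To make "semi-transparent" concrete, here is a minimal sketch of the standard "over" blend for a single pixel. It is only an illustration of the arithmetic, not Pillow's internal implementation (its rounding may differ), and the helper name `blend_over` is hypothetical.

```python
from typing import Tuple

RGB = Tuple[int, int, int]


def blend_over(background: RGB, fill: RGB, alpha: int) -> RGB:
    """Blend `fill` over `background`.

    `alpha` ranges from 0 (fill invisible) to 255 (fill fully opaque).
    Hypothetical helper for illustration only.
    """
    if not 0 <= alpha <= 255:
        raise ValueError("alpha must be in [0, 255]")
    # Weighted average per channel, using integer arithmetic.
    return tuple(
        (f * alpha + b * (255 - alpha)) // 255
        for f, b in zip(fill, background)
    )


# A red fill at roughly 40% opacity over a white pixel.
blended = blend_over((255, 255, 255), (255, 0, 0), 100)
print(blended)  # (255, 155, 155)
```

A low alpha keeps the image underneath the box visible, which is the effect the fill parameter is meant to provide.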

Here is the complete running code, with a few edits:

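from typing import List, Optional, Tuple, Union

import numpy as np
import torch
from PIL import Image, ImageDraw, ImageFont


@torch.no_grad()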
def draw_bounding_boxes(
    image: torch.Tensor,
    boxes: torch.Tensor,
    labels: Optional[List[str]] = None,
    colors: Optional[List[Union[str, Tuple[int, int, int]]]] = None,
    fill: Optional[List[Union[str, Tuple[int, int, int]]]] = None,
    width: int = 1,
    font: Optional[str] = None,
    font_size: int = 10
) -> torch.Tensor:

    """
    Draws bounding boxes on given image.
    The values of the input image should be uint8 between 0 and 255.
    Args:
        image (Tensor): Tensor of shape (C x H x W)
        boxes (Tensor): Tensor of size (N, 4) containing bounding boxes in (xmin, ymin, xmax, ymax) format. Note that
            the boxes are absolute coordinates with respect to the image. In other words: `0 <= xmin < xmax < W` and
            `0 <= ymin < ymax < H`.
        labels (List[str]): List containing the labels of bounding boxes.
        colors (List[Union[str, Tuple[int, int, int]]]): List containing the colors of bounding boxes. The colors can
            be represented as `str` or `Tuple[int, int, int]`.
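        fill (List[Union[str, Tuple[int, int, int]]]): List containing the fill colors of bounding boxes. The colors
            use the same representation as `colors`. If None, the boxes are not filled.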
        width (int): Width of bounding box.
        font (str): A filename containing a TrueType font. If the file is not found in this filename, the loader may
            also search in other directories, such as the `fonts/` directory on Windows or `/Library/Fonts/`,
            `/System/Library/Fonts/` and `~/Library/Fonts/` on macOS.
        font_size (int): The requested font size in points.
    """

    if not isinstance(image, torch.Tensor):
        raise TypeError(f"Tensor expected, got {type(image)}")
    elif image.dtype != torch.uint8:
        raise ValueError(f"Tensor uint8 expected, got {image.dtype}")
    elif image.dim() != 3:
        raise ValueError("Pass individual images, not batches")

    ndarr = image.permute(1, 2, 0).numpy()
    img_to_draw = Image.fromarray(ndarr)

    img_boxes = boxes.to(torch.int64).tolist()

    draw = ImageDraw.Draw(img_to_draw, "RGBA")

    txt_font = ImageFont.load_default() if font is None else ImageFont.truetype(font=font, size=font_size)

    for i, bbox in enumerate(img_boxes):
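        color = None if colors is None else colors[i]
        # Pick the i-th fill so each box gets its own color, matching the List type of `fill`.
        fill_color = None if fill is None else fill[i]
        draw.rectangle(bbox, width=width, outline=color, fill=fill_color)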

        if labels is not None:
            draw.text((bbox[0], bbox[1]), labels[i], fill=color, font=txt_font)

    return torch.from_numpy(np.array(img_to_draw)).permute(2, 0, 1)

This makes Mask R-CNN output clearer, and people can experiment with the fill parameter, for example a confidence-based fill or a different fill color per class.
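The two ideas above could be sketched as small helpers that build one fill color per box. The names `confidence_fill` and `class_fill` are hypothetical and not part of torchvision. The sketch also assumes RGBA 4-tuples, where the fourth value is an alpha in [0, 255]; the proposed type annotation only mentions 3-tuples, so this is an assumption that relies on the image being drawn in "RGBA" mode as in the code above.

```python
from typing import Dict, List, Sequence, Tuple

RGB = Tuple[int, int, int]
RGBA = Tuple[int, int, int, int]


def confidence_fill(
    scores: Sequence[float],
    base_color: RGB = (255, 0, 0),
    max_alpha: int = 128,
) -> List[RGBA]:
    """Return one RGBA fill per box; higher scores give a more opaque fill."""
    fills = []
    for score in scores:
        clamped = min(max(score, 0.0), 1.0)  # keep scores inside [0, 1]
        fills.append((*base_color, int(clamped * max_alpha)))
    return fills


def class_fill(
    labels: Sequence[str],
    palette: Dict[str, RGB],
    alpha: int = 64,
    default: RGB = (128, 128, 128),
) -> List[RGBA]:
    """Return one RGBA fill per box, looked up by class label."""
    return [(*palette.get(label, default), alpha) for label in labels]


print(confidence_fill([0.0, 0.5, 1.0]))
# [(255, 0, 0, 0), (255, 0, 0, 64), (255, 0, 0, 128)]
print(class_fill(["cat", "dog"], {"cat": (255, 0, 0)}))
# [(255, 0, 0, 64), (128, 128, 128, 64)]
```

Either list could then be passed as the fill argument, assuming the function applies one entry per box.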

Additional context

I can send a PR for this 😅 I'm attaching the outputs of the code above.

[Image: draw_boxes_util2]

(Sorry, PyTorch logo 🙏)

[Image: draw_boxes_util]
