RoIHeads.postprocess_detections boxes slicing error occurs when removing predictions with the background label #9110

Open
@FeiFanMoKe

🐛 Describe the bug

Bug Report: Incorrect Box Slicing in Faster R-CNN's postprocess_detections

Minimal Reproduction Code

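import torch
import torchvision

# pretrained=True is kept from the report; newer torchvision releases prefer the weights= argument
detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
detector.eval()  # inference mode; in training mode the detector also requires targets
data = torch.zeros((1, 3, 1080, 1920), dtype=torch.float32)
with torch.no_grad():
    detections = detector(data)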

Description

The bug occurs in roi_heads.py (line 701) in the postprocess_detections function of RoIHeads when processing Faster R-CNN outputs. The current implementation incorrectly handles box dimension slicing when removing background class predictions.

Problem Location

The problematic code segment:

for boxes, scores, image_shape in zip(pred_boxes_list, pred_scores_list, image_shapes):
    ...
    # remove predictions with the background label
    boxes = boxes[:, 1:]  # Incorrect slicing
    scores = scores[:, 1:]
    labels = labels[:, 1:]
    ...

Root Cause

  1. The boxes tensor has shape [N, num_classes * 4], with 4 coordinate values per class.
  2. The slice boxes[:, 1:] therefore operates on the flattened class-times-coordinate dimension and drops only the first column (a single coordinate) instead of the 4 background-class columns.
  3. scores and labels have shape [N, num_classes], so their [:, 1:] slices drop exactly the background class, which leaves boxes misaligned with scores and labels.
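
The difference between the two slices can be illustrated on a toy tensor. This is only a sketch: it assumes boxes really arrives flattened as [N, num_classes * 4], which it does not verify against torchvision, and the sizes and values are made up so that each entry encodes class * 10 + coordinate index.

```python
import torch

num_boxes, num_classes = 2, 3  # toy sizes; class 0 plays the background role

# Assumed flattened layout [N, num_classes * 4]; entry value = class * 10 + coordinate index
per_class = torch.arange(num_classes).view(1, num_classes, 1) * 10 + torch.arange(4).view(1, 1, 4)
flat_boxes = per_class.expand(num_boxes, -1, -1).reshape(num_boxes, num_classes * 4)

# Slicing the flat tensor drops one column (one coordinate), not one class
flat_sliced = flat_boxes[:, 1:]

# Reshaping first exposes the class dimension, so the slice drops all 4 background coordinates
reshaped_sliced = flat_boxes.reshape(-1, num_classes, 4)[:, 1:, :]

print(flat_sliced.shape)      # torch.Size([2, 11])
print(reshaped_sliced.shape)  # torch.Size([2, 2, 4])
print(reshaped_sliced[0])     # coordinates of classes 1 and 2 for the first box
```

Under that assumption, the flat slice leaves 11 columns, which no longer divide evenly into groups of 4, while the reshaped slice keeps one complete box per remaining class.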

Expected Behavior

The boxes tensor should first be reshaped to [N, num_classes, 4] before slicing to properly separate class and coordinate dimensions.

Proposed Fix

for boxes, scores, image_shape in zip(pred_boxes_list, pred_scores_list, image_shapes):
    ...
    # remove predictions with the background label
    boxes = boxes.reshape(-1, num_classes, 4)  # Proper dimension separation
    boxes = boxes[:, 1:, :]  # Correct class dimension slicing
    scores = scores[:, 1:]
    labels = labels[:, 1:]
    ...
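
A more defensive variant of the same idea might check the layout before slicing, so that the background class is dropped correctly whether boxes arrives as [N, num_classes * 4] or already as [N, num_classes, 4]. The helper name drop_background_boxes is hypothetical and not part of torchvision; this is a sketch of the approach, not a tested patch.

```python
import torch


def drop_background_boxes(boxes: torch.Tensor, num_classes: int) -> torch.Tensor:
    """Hypothetical helper: return boxes for classes 1..num_classes-1 as [N, num_classes - 1, 4]."""
    if boxes.dim() == 2:
        # Flattened layout [N, num_classes * 4]: expose the class dimension first
        if boxes.shape[1] != num_classes * 4:
            raise ValueError(f"unexpected boxes shape {tuple(boxes.shape)}")
        boxes = boxes.reshape(boxes.shape[0], num_classes, 4)
    if boxes.dim() != 3 or tuple(boxes.shape[1:]) != (num_classes, 4):
        raise ValueError(f"unexpected boxes shape {tuple(boxes.shape)}")
    # Index 0 on the class dimension is the background class
    return boxes[:, 1:, :]


# Usage: both layouts give the same result
flat = torch.arange(24, dtype=torch.float32).reshape(2, 12)
from_flat = drop_background_boxes(flat, num_classes=3)
from_3d = drop_background_boxes(flat.reshape(2, 3, 4), num_classes=3)
print(from_flat.shape)  # torch.Size([2, 2, 4])
```

Raising on an unexpected shape turns a silent misalignment into an explicit error, at the cost of one extra check per image.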

Impact

The current implementation leads to:

  1. Misaligned boxes and their corresponding scores/labels
  2. Potentially incorrect final detection results
  3. Silent failure without explicit errors

Versions

branch: 6473b77
