@mertalev
Contributor

@mertalev mertalev commented Nov 3, 2025

This is a small PR to make box sorting more efficient. It uses NumPy operations and ends up being about 9x faster as a result. I compared results using == for a few images with dozens of boxes and confirmed the outputs were the same.
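A comparison like the one described above could be sketched as follows. This is only an illustration, not the author's actual check: same_boxes is a hypothetical helper, and the small arrays stand in for the outputs of the old and new sorted_boxes.

```python
import numpy as np


def same_boxes(reference: np.ndarray, candidate: np.ndarray) -> bool:
    """Return True when both arrays have the same shape and identical values."""
    # np.array_equal returns False on a shape mismatch, where a bare `==`
    # could broadcast or fail instead.
    return bool(np.array_equal(reference, candidate))


# Hypothetical outputs: two boxes, each with four (x, y) corner points.
old_result = np.array([[[0, 0]] * 4, [[5, 0]] * 4], dtype=np.float32)
new_result = old_result.copy()
swapped = old_result[::-1]

print(same_boxes(old_result, new_result))  # True
print(same_boxes(old_result, swapped))     # False
```

A check on a few images only covers the box layouts those images happen to contain, so a comparison over many randomized inputs would give stronger evidence.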

@SWHL SWHL requested a review from Copilot November 3, 2025 11:30
Contributor

Copilot AI left a comment

Pull Request Overview

This PR refactors the sorted_boxes function in the text detection module to use a more efficient NumPy-based implementation instead of nested loops with manual swapping.

  • Replaces O(n²) bubble sort algorithm with NumPy vectorized operations for better performance
  • Adds early return check for empty input arrays
  • Uses composite sort key approach to group boxes by line ID and then by x-coordinate
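The composite sort key idea from the last bullet can be illustrated with a minimal sketch. The data and the constant LINE_SEPARATION_FACTOR are assumptions made up for this example; the only requirement is that the factor exceeds every x-coordinate, so that the line ID always dominates the key.

```python
import numpy as np

# Assumed value; it only needs to be larger than any x-coordinate.
LINE_SEPARATION_FACTOR = 1_000_000

# Hypothetical data: line ID and top-left x for four boxes.
line_ids = np.array([0, 0, 1, 1])
x_coords = np.array([50.0, 10.0, 30.0, 5.0])

# One scalar key per box: the line ID dominates, x breaks ties within a line.
sort_key = line_ids * LINE_SEPARATION_FACTOR + x_coords
order = np.argsort(sort_key, kind="stable")

print(order.tolist())  # [1, 0, 3, 2]
```

Both arrays must describe the boxes in the same order; if one of them is permuted relative to the other, the key pairs a line ID with the wrong x-coordinate.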

Member

@SWHL SWHL left a comment

LGTM

@SWHL SWHL merged commit 3f3fb13 into RapidAI:main Dec 2, 2025
@SWHL SWHL added this to the v3.5.0 milestone Dec 2, 2025
@SWHL
Member

SWHL commented Dec 2, 2025

When I integrated this here and ran the unit tests, I found that your implementation is not fully equivalent to the original. The main reason is:

y_order = np.argsort(dt_boxes[:, 0, 1], kind="stable")
sorted_y = dt_boxes[y_order, 0, 1]

line_ids = np.empty(len(dt_boxes), dtype=np.int32)
line_ids[0] = 0
np.cumsum(np.abs(np.diff(sorted_y)) >= _BOX_SORT_Y_THRESHOLD, out=line_ids[1:])

# Here: line_ids is already in y-sorted order, so line_ids[y_order] misaligns it
sort_key = line_ids[y_order] * _BOX_SORT_LINE_SEPARATION_FACTOR + dt_boxes[y_order, 0, 0]
final_order = np.argsort(sort_key, kind="stable")
return dt_boxes[y_order[final_order]]

line_ids is computed on the sequence already sorted by y, so line_ids[k] is the line ID of box y_order[k]. However, when constructing sort_key, you used line_ids[y_order], which permutes the IDs a second time. As a result, each x-coordinate in dt_boxes[y_order, 0, 0] is paired with the line ID of a different box.
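The misalignment can be reproduced on a tiny input. The sketch below is an illustration under assumptions: the three boxes are invented, SEPARATION is an assumed value for the separation factor, and the "fixed" key simply drops the second indexing, which is one possible correction within the composite-key approach rather than the fix that was eventually applied.

```python
import numpy as np

Y_THRESHOLD = 10        # plays the role of _BOX_SORT_Y_THRESHOLD
SEPARATION = 1_000_000  # assumed value for the line separation factor

# Top-left (x, y) of three hypothetical boxes, in detection order.
top_left = np.array([[10, 100], [50, 0], [20, 0]], dtype=np.float32)
# Repeat the corner four times to get the (n, 4, 2) box layout.
dt_boxes = np.repeat(top_left[:, None, :], 4, axis=1)

y_order = np.argsort(dt_boxes[:, 0, 1], kind="stable")
sorted_y = dt_boxes[y_order, 0, 1]
# line_ids[k] is the line of the k-th box in y-sorted order.
new_line = np.abs(np.diff(sorted_y)) >= Y_THRESHOLD
line_ids = np.concatenate([[0], np.cumsum(new_line)])
x_sorted = dt_boxes[y_order, 0, 0]

buggy_key = line_ids[y_order] * SEPARATION + x_sorted  # indexed twice
fixed_key = line_ids * SEPARATION + x_sorted           # already aligned

buggy = dt_boxes[y_order[np.argsort(buggy_key, kind="stable")]]
fixed = dt_boxes[y_order[np.argsort(fixed_key, kind="stable")]]

print(buggy[:, 0].tolist())  # [[10.0, 100.0], [50.0, 0.0], [20.0, 0.0]]
print(fixed[:, 0].tolist())  # [[20.0, 0.0], [50.0, 0.0], [10.0, 100.0]]
```

With the double indexing, the box at y=100 ends up first even though it belongs to the last line.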

I fixed it as follows:

# Step 1: Stable sort by y (top to bottom)
y_coords = dt_boxes[:, 0, 1]
y_order = np.argsort(y_coords, kind="stable")
boxes_y_sorted = dt_boxes[y_order]
y_sorted = y_coords[y_order]

# Step 2: Assign line IDs based on adjacent y differences
dy = np.diff(y_sorted)
line_increments = (dy >= _BOX_SORT_Y_THRESHOLD).astype(np.int32)
line_ids = np.concatenate([[0], np.cumsum(line_increments)])

# Now, within each line_id group, sort by x (left to right)
x_coords = boxes_y_sorted[:, 0, 0]
final_order_in_y_sorted = np.lexsort((x_coords, line_ids))

# Reorder the y-sorted boxes
return boxes_y_sorted[final_order_in_y_sorted]
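The key order passed to np.lexsort is easy to get backwards, so here is a small sketch of its behavior. The arrays are hypothetical values, not data from the PR.

```python
import numpy as np

# Hypothetical per-box values, listed in y-sorted order.
line_ids = np.array([0, 0, 0, 1, 1])
x_coords = np.array([30.0, 10.0, 20.0, 40.0, 5.0])

# np.lexsort treats the LAST key as the primary one:
# sort by line_ids first, then by x_coords within each line.
order = np.lexsort((x_coords, line_ids))

print(order.tolist())  # [1, 2, 0, 4, 3]
```

Because the two keys stay separate, this approach needs no separation factor and does not depend on x-coordinates staying below some bound.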

@SWHL
Member

SWHL commented Dec 2, 2025

Benchmark:

test.py

import random
import timeit

import numpy as np

# ----------------------------
# Configuration
# ----------------------------
_BOX_SORT_Y_THRESHOLD = 10


# ----------------------------
# Method 1: NumPy vectorized implementation (efficient)
# ----------------------------
def sorted_boxes_numpy(dt_boxes: np.ndarray) -> np.ndarray:
    """
    Equivalent NumPy implementation of the original bubble-adjusted sort.
    """
    if len(dt_boxes) == 0:
        return dt_boxes

    # Step 1: Stable sort by y (top to bottom)
    y_coords = dt_boxes[:, 0, 1]
    y_order = np.argsort(y_coords, kind="stable")
    boxes_y_sorted = dt_boxes[y_order]
    y_sorted = y_coords[y_order]

    # Step 2: Assign line IDs based on adjacent y differences
    dy = np.diff(y_sorted)
    line_increments = (dy >= _BOX_SORT_Y_THRESHOLD).astype(np.int32)
    line_ids = np.concatenate([[0], np.cumsum(line_increments)])

    # Now, within each line_id group, sort by x (left to right)
    x_coords = boxes_y_sorted[:, 0, 0]
    final_order_in_y_sorted = np.lexsort((x_coords, line_ids))

    return boxes_y_sorted[final_order_in_y_sorted]


# ----------------------------
# Method 2: Python loop + local bubble sort (slower)
# ----------------------------
def sorted_boxes_python(dt_boxes: np.ndarray) -> np.ndarray:
    if len(dt_boxes) == 0:
        return dt_boxes

    # Convert to a list of arrays for mutability
    _boxes = [box.copy() for box in dt_boxes]
    num_boxes = len(_boxes)

    # Sort by (y, x) first
    _boxes.sort(key=lambda x: (x[0][1], x[0][0]))

    # Local bubble adjustment
    for i in range(num_boxes - 1):
        for j in range(i, -1, -1):
            if (
                abs(_boxes[j + 1][0][1] - _boxes[j][0][1]) < _BOX_SORT_Y_THRESHOLD
                and _boxes[j + 1][0][0] < _boxes[j][0][0]
            ):
                _boxes[j], _boxes[j + 1] = _boxes[j + 1], _boxes[j]
            else:
                break
    return np.array(_boxes)


# ----------------------------
# Helper: generate random text boxes
# Assume each box has shape [4, 2]; only the top-left corner [0] = [x, y] matters
# ----------------------------
def generate_random_boxes(n: int, x_range=(0, 1000), y_range=(0, 500)) -> np.ndarray:
    boxes = np.zeros((n, 4, 2), dtype=np.float32)
    for i in range(n):
        x = random.uniform(*x_range)
        y = random.uniform(*y_range)
        # Simplification: set all four points to the top-left corner (does not affect the sorting logic)
        boxes[i, :, :] = [x, y]
    return boxes


# ----------------------------
# Main benchmark function
# ----------------------------
def benchmark():
    sizes = [10, 50, 100, 500, 1000]
    repeat = 5  # intended repeat count for timeit.repeat (unused below)
    number = 10  # number of calls per timing run

    print(
        f"{'Num Boxes':>10} | {'NumPy (ms)':>12} | {'Python (ms)':>12} | {'Speedup':>8}"
    )
    print("-" * 55)

    for n in sizes:
        boxes = generate_random_boxes(n)
        boxes_np = boxes.copy()
        boxes_py = boxes.copy()

        # Benchmark the NumPy version
        time_numpy = timeit.timeit(
            lambda: sorted_boxes_numpy(boxes_np), number=number, globals=globals()
        )
        avg_numpy_ms = (time_numpy / number) * 1000

        # Benchmark the Python version
        time_python = timeit.timeit(
            lambda: sorted_boxes_python(boxes_py), number=number, globals=globals()
        )
        avg_python_ms = (time_python / number) * 1000

        speedup = avg_python_ms / avg_numpy_ms if avg_numpy_ms > 0 else float("inf")

        print(
            f"{n:>10} | {avg_numpy_ms:>12.3f} | {avg_python_ms:>12.3f} | {speedup:>7.1f}x"
        )


if __name__ == "__main__":
    benchmark()

Run:

python test.py

Output:

 Num Boxes |   NumPy (ms) |  Python (ms) |  Speedup
-------------------------------------------------------
        10 |        0.076 |        0.013 |     0.2x
        50 |        0.012 |        0.076 |     6.5x
       100 |        0.018 |        0.173 |     9.5x
       500 |        0.040 |        1.567 |    39.1x
      1000 |        0.095 |        4.609 |    48.5x
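In the 10-box row the NumPy version appears slower than the Python one. One possible explanation, which the thread does not confirm, is that this is the first measurement in the loop and includes one-time warm-up cost; fixed per-call NumPy overhead on very small arrays could also play a part. A sketch of a more robust timing helper, assuming a hypothetical function best_ms that is not part of test.py, might look like this:

```python
import timeit
from typing import Callable


def best_ms(fn: Callable[[], object], repeat: int = 5, number: int = 10) -> float:
    """Best-of-`repeat` average time per call, in milliseconds."""
    fn()  # warm-up call, excluded from the measurement
    runs = timeit.repeat(fn, repeat=repeat, number=number)
    # The minimum is the run least disturbed by other activity on the machine.
    return min(runs) / number * 1000


# Stand-in workload; in test.py this would be something like
# lambda: sorted_boxes_numpy(boxes_np)
elapsed = best_ms(lambda: sorted(range(1000), reverse=True))
print(f"{elapsed:.3f} ms per call")
```

Taking the minimum over several repeats is the approach suggested in the timeit documentation for reducing noise from other processes.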

@mertalev
Contributor Author

mertalev commented Dec 2, 2025

Nice catch! Glad to hear you solved it :)

@SWHL SWHL modified the milestones: v3.5.0, v3.4.3 Dec 9, 2025
