feat: optimized box sorting #587
Pull Request Overview
This PR refactors the sorted_boxes function in the text detection module to use a more efficient NumPy-based implementation instead of nested loops with manual swapping.
- Replaces O(n²) bubble sort algorithm with NumPy vectorized operations for better performance
- Adds early return check for empty input arrays
- Uses composite sort key approach to group boxes by line ID and then by x-coordinate
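The line-grouping idea in the last bullet can be shown on a tiny input. This is a minimal sketch, not the PR's code: the coordinates are made up, and `Y_THRESHOLD` is an assumed constant that borrows the value 10 used later in this thread.

```python
import numpy as np

Y_THRESHOLD = 10  # assumed value, borrowed from the benchmark in this thread

# Hypothetical top-left y coordinates, already sorted top to bottom.
sorted_y = np.array([3.0, 5.0, 8.0, 40.0, 44.0, 90.0])

# A gap of at least Y_THRESHOLD between neighbours starts a new line.
new_line = np.diff(sorted_y) >= Y_THRESHOLD
line_ids = np.concatenate([[0], np.cumsum(new_line)])

# Hypothetical top-left x coordinates, in the same (y-sorted) order.
x = np.array([30.0, 10.0, 20.0, 7.0, 2.0, 1.0])

# np.lexsort treats the LAST key as the primary one: line first, then x.
order = np.lexsort((x, line_ids))

print(line_ids.tolist())  # line index of each y-sorted box
print(order.tolist())     # reading order as indices into the y-sorted arrays
```

Grouping is based only on the gap between neighbouring y values, so a long run of small gaps ends up on one line even if its first and last boxes are far apart.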
SWHL left a comment
LGTM
When I integrated this and ran the unit tests, I found that your implementation is not fully equivalent to the original. The main reason is the line marked "# Here":

y_order = np.argsort(dt_boxes[:, 0, 1], kind="stable")
sorted_y = dt_boxes[y_order, 0, 1]
line_ids = np.empty(len(dt_boxes), dtype=np.int32)
line_ids[0] = 0
np.cumsum(np.abs(np.diff(sorted_y)) >= _BOX_SORT_Y_THRESHOLD, out=line_ids[1:])
# Here
sort_key = line_ids[y_order] * _BOX_SORT_LINE_SEPARATION_FACTOR + dt_boxes[y_order, 0, 0]
final_order = np.argsort(sort_key, kind="stable")
return dt_boxes[y_order[final_order]]

I fixed it as follows:

# Step 1: Stable sort by y (top to bottom)
y_coords = dt_boxes[:, 0, 1]
y_order = np.argsort(y_coords, kind="stable")
boxes_y_sorted = dt_boxes[y_order]
y_sorted = y_coords[y_order]
# Step 2: Assign line IDs based on adjacent y differences
dy = np.diff(y_sorted)
line_increments = (dy >= _BOX_SORT_Y_THRESHOLD).astype(np.int32)
line_ids = np.concatenate([[0], np.cumsum(line_increments)])
# Now, within each line_id group, sort by x (left to right)
x_coords = boxes_y_sorted[:, 0, 0]
final_order_in_y_sorted = np.lexsort((x_coords, line_ids))
# Reorder the y-sorted boxes
return boxes_y_sorted[final_order_in_y_sorted]
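One plausible reading of the problem on the marked line: `line_ids` is computed from the y-sorted values, so it is already in y-sorted order, and indexing it with `y_order` a second time pairs boxes with the wrong line. The sketch below reproduces that on three made-up boxes. `LINE_FACTOR` is a hypothetical stand-in for `_BOX_SORT_LINE_SEPARATION_FACTOR` (its real value is not shown in this thread), and the helper names are illustrative only.

```python
import numpy as np

Y_THRESHOLD = 10       # assumed, same value as the benchmark in this thread
LINE_FACTOR = 10_000   # hypothetical; the real constant is not shown here


def make_boxes(top_lefts):
    """Hypothetical helper: (n, 4, 2) boxes whose four points equal (x, y)."""
    pts = np.asarray(top_lefts, dtype=np.float32)
    return np.repeat(pts[:, None, :], 4, axis=1)


def y_order_and_line_ids(boxes):
    """Return the stable y ordering and line IDs aligned with that ordering."""
    y_order = np.argsort(boxes[:, 0, 1], kind="stable")
    sorted_y = boxes[y_order, 0, 1]
    line_ids = np.concatenate([[0], np.cumsum(np.diff(sorted_y) >= Y_THRESHOLD)])
    return y_order, line_ids


def sort_with_reindexed_key(boxes):
    """Composite key as on the marked line: line_ids is indexed by y_order again."""
    y_order, line_ids = y_order_and_line_ids(boxes)
    key = line_ids[y_order] * LINE_FACTOR + boxes[y_order, 0, 0]
    return boxes[y_order[np.argsort(key, kind="stable")]]


def sort_with_lexsort(boxes):
    """Fixed variant: line_ids is used as-is, alongside the y-sorted x values."""
    y_order, line_ids = y_order_and_line_ids(boxes)
    by_y = boxes[y_order]
    return by_y[np.lexsort((by_y[:, 0, 0], line_ids))]


# Input deliberately NOT sorted by y, so y_order is a non-trivial permutation.
boxes = make_boxes([(10, 50), (5, 0), (20, 0)])
print(sort_with_reindexed_key(boxes)[:, 0].tolist())
print(sort_with_lexsort(boxes)[:, 0].tolist())
```

With this input the re-indexed key places the y=50 box between the two y=0 boxes, while the `np.lexsort` variant keeps the two first-line boxes together. When the input is already sorted by y, `y_order` is the identity and both variants agree, which may explain why the difference only surfaced in some tests.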
Benchmark (test.py):

import random
import timeit

import numpy as np

# ----------------------------
# Configuration
# ----------------------------
_BOX_SORT_Y_THRESHOLD = 10


# ----------------------------
# Method 1: vectorized NumPy implementation (fast)
# ----------------------------
def sorted_boxes_numpy(dt_boxes: np.ndarray) -> np.ndarray:
    """
    Equivalent NumPy implementation of the original bubble-adjusted sort.
    """
    if len(dt_boxes) == 0:
        return dt_boxes
    # Step 1: Stable sort by y (top to bottom)
    y_coords = dt_boxes[:, 0, 1]
    y_order = np.argsort(y_coords, kind="stable")
    boxes_y_sorted = dt_boxes[y_order]
    y_sorted = y_coords[y_order]
    # Step 2: Assign line IDs based on adjacent y differences
    dy = np.diff(y_sorted)
    line_increments = (dy >= _BOX_SORT_Y_THRESHOLD).astype(np.int32)
    line_ids = np.concatenate([[0], np.cumsum(line_increments)])
    # Now, within each line_id group, sort by x (left to right)
    x_coords = boxes_y_sorted[:, 0, 0]
    final_order_in_y_sorted = np.lexsort((x_coords, line_ids))
    return boxes_y_sorted[final_order_in_y_sorted]


# ----------------------------
# Method 2: Python loop + local bubble adjustment (slower)
# ----------------------------
def sorted_boxes_python(dt_boxes: np.ndarray) -> np.ndarray:
    if len(dt_boxes) == 0:
        return dt_boxes
    # Convert to a list of arrays for mutability
    _boxes = [box.copy() for box in dt_boxes]
    num_boxes = len(_boxes)
    # Sort by (y, x) first
    _boxes.sort(key=lambda x: (x[0][1], x[0][0]))
    # Local bubble adjustment
    for i in range(num_boxes - 1):
        for j in range(i, -1, -1):
            if (
                abs(_boxes[j + 1][0][1] - _boxes[j][0][1]) < _BOX_SORT_Y_THRESHOLD
                and _boxes[j + 1][0][0] < _boxes[j][0][0]
            ):
                _boxes[j], _boxes[j + 1] = _boxes[j + 1], _boxes[j]
            else:
                break
    return np.array(_boxes)


# ----------------------------
# Helper: generate random text boxes
# Assume each box has shape [4, 2]; only the top-left corner [0] = [x, y] matters
# ----------------------------
def generate_random_boxes(n: int, x_range=(0, 1000), y_range=(0, 500)) -> np.ndarray:
    boxes = np.zeros((n, 4, 2), dtype=np.float32)
    for i in range(n):
        x = random.uniform(*x_range)
        y = random.uniform(*y_range)
        # Simplification: set all four points to the top-left corner
        # (does not affect the sorting logic)
        boxes[i, :, :] = [x, y]
    return boxes


# ----------------------------
# Main benchmark function
# ----------------------------
def benchmark():
    sizes = [10, 50, 100, 500, 1000]
    repeat = 5  # intended repeat count for timeit.repeat (not used below)
    number = 10  # executions per timing run
    print(
        f"{'Num Boxes':>10} | {'NumPy (ms)':>12} | {'Python (ms)':>12} | {'Speedup':>8}"
    )
    print("-" * 55)
    for n in sizes:
        boxes = generate_random_boxes(n)
        boxes_np = boxes.copy()
        boxes_py = boxes.copy()
        # Time the NumPy version
        time_numpy = timeit.timeit(
            lambda: sorted_boxes_numpy(boxes_np), number=number, globals=globals()
        )
        avg_numpy_ms = (time_numpy / number) * 1000
        # Time the Python version
        time_python = timeit.timeit(
            lambda: sorted_boxes_python(boxes_py), number=number, globals=globals()
        )
        avg_python_ms = (time_python / number) * 1000
        speedup = avg_python_ms / avg_numpy_ms if avg_numpy_ms > 0 else float("inf")
        print(
            f"{n:>10} | {avg_numpy_ms:>12.3f} | {avg_python_ms:>12.3f} | {speedup:>7.1f}x"
        )


if __name__ == "__main__":
    benchmark()

Run: python test.py

Output:

 Num Boxes |   NumPy (ms) |  Python (ms) |  Speedup
-------------------------------------------------------
        10 |        0.076 |        0.013 |     0.2x
        50 |        0.012 |        0.076 |     6.5x
       100 |        0.018 |        0.173 |     9.5x
       500 |        0.040 |        1.567 |    39.1x
      1000 |        0.095 |        4.609 |    48.5x
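The benchmark defines `repeat = 5` but times each size with a single `timeit.timeit` call, so the first measurement (the 10-box row, where NumPy looks slower) may include one-time warm-up cost. A hedged sketch of a timing helper that does one untimed call and then takes the best of several `timeit.repeat` rounds is shown below. `best_ms` is a hypothetical name, not something from the PR.

```python
import timeit


def best_ms(func, repeat=5, number=10):
    """Best per-call time in milliseconds over `repeat` rounds of `number` calls."""
    # One untimed call so first-call overhead is not charged to the measurement.
    func()
    totals = timeit.repeat(func, repeat=repeat, number=number)
    # Each entry in `totals` is the time for `number` calls; keep the fastest round.
    return min(totals) / number * 1000


if __name__ == "__main__":
    # Stand-in workload; in test.py this would be e.g.
    # lambda: sorted_boxes_numpy(boxes_np)
    print(f"{best_ms(lambda: sorted(range(1000), reverse=True)):.4f} ms")
```

Taking the minimum rather than the mean is a common choice for micro-benchmarks because slower rounds usually reflect interference from other processes rather than the code under test.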
Nice catch! Glad to hear you solved it :)
This is a small PR to make box sorting more efficient. It uses NumPy operations and ends up being about 9x faster as a result. I compared results using == for a few images with dozens of boxes and confirmed the outputs were the same.
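The `==` comparison described above can be automated with `np.array_equal`. The sketch below is only an illustration under assumptions: it restates the two orderings from this thread in compact form, `make_boxes` and `same_order` are hypothetical helpers, `Y_THRESHOLD` is assumed to be 10, and inputs are assumed non-empty.

```python
import numpy as np

Y_THRESHOLD = 10  # assumed, same value as the benchmark in this thread


def make_boxes(top_lefts):
    """Hypothetical helper: (n, 4, 2) boxes whose four points equal (x, y)."""
    pts = np.asarray(top_lefts, dtype=np.float32)
    return np.repeat(pts[:, None, :], 4, axis=1)


def sort_loop(boxes):
    """Loop-based ordering, following sorted_boxes_python from this thread."""
    items = sorted(boxes, key=lambda b: (b[0][1], b[0][0]))
    for i in range(len(items) - 1):
        for j in range(i, -1, -1):
            close = abs(items[j + 1][0][1] - items[j][0][1]) < Y_THRESHOLD
            if close and items[j + 1][0][0] < items[j][0][0]:
                items[j], items[j + 1] = items[j + 1], items[j]
            else:
                break
    return np.array(items)


def sort_vectorized(boxes):
    """Vectorized ordering, following sorted_boxes_numpy from this thread."""
    y_order = np.argsort(boxes[:, 0, 1], kind="stable")
    by_y = boxes[y_order]
    line_ids = np.concatenate(
        [[0], np.cumsum(np.diff(by_y[:, 0, 1]) >= Y_THRESHOLD)]
    )
    return by_y[np.lexsort((by_y[:, 0, 0], line_ids))]


def same_order(boxes):
    """True when both orderings return identical arrays."""
    return bool(np.array_equal(sort_loop(boxes), sort_vectorized(boxes)))


# Two clearly separated lines of text.
separated = make_boxes([(50, 0), (10, 2), (30, 40), (5, 43)])
# Three boxes whose y values step by less than the threshold each time.
chained = make_boxes([(50, 0), (100, 6), (10, 12)])

print(same_order(separated), same_order(chained))
```

On the well-separated layout the two orderings match. On the chained layout they differ, because the vectorized version puts all three boxes on one line while the loop stops swapping once two neighbours are at least the threshold apart. Agreement on a few sample images therefore may not cover every layout, and a check like this could be run over a larger set of real detections.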