Allow "tiff" and more extensions in `DetectionDataset.from_yolo` function #1636

patel-zeel · 2024-10-31T14:03:23Z

Description

Addresses #1554 as discussed with @LinasKo.

Highlights

List any dependencies that are required for this change.

Pillow

Type of change

Please delete options that are not relevant.

New feature (non-breaking change which adds functionality)

How has this change been tested, please provide a testcase or example of how you tested the change?

The changes are tested with this colab notebook.

Test Case

For each extension, 100 images were generated along with their corresponding dummy labels and a dummy data.yml. The test run checks whether the function works for a given extension and measures the time taken to load the dataset.

Time taken (in seconds) to run the test case

Columns 0, 1, 2, 3, and 4 are seed numbers.
A value of -1.11 indicates that supervision does not support that extension yet.
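The timing procedure described above could be sketched roughly as follows. This is not the notebook's code: `time_loader`, `load_dataset`, and `fake_unsupported_loader` are hypothetical names, and the sketch assumes that an unsupported extension surfaces as an exception, which is then recorded as the -1.11 sentinel.

```python
import time

UNSUPPORTED = -1.11  # sentinel value used in the tables for unsupported extensions


def time_loader(load_dataset, seeds=(0, 1, 2, 3, 4)):
    """Time one dataset load per seed; record the sentinel if the load fails."""
    timings = {}
    for seed in seeds:
        start = time.perf_counter()
        try:
            load_dataset(seed)  # hypothetical callable wrapping the dataset load
        except Exception:
            # Assumption: a failed load means the extension is not supported.
            timings[seed] = UNSUPPORTED
        else:
            timings[seed] = round(time.perf_counter() - start, 2)
    return timings


def fake_unsupported_loader(seed):
    """Stand-in loader that always fails, to show the sentinel path."""
    raise ValueError("unsupported extension")


print(time_loader(fake_unsupported_loader))
```

In a real run, `load_dataset` would wrap the call under test for one extension and one seed.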

Result for `supervision` version `0.24.0`, which uses `cv2` to check the shape of images.

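| Extension | 0 | 1 | 2 | 3 | 4 | Mean | Std |
|---|---|---|---|---|---|---|---|
| bmp | -1.11 | -1.11 | -1.11 | -1.11 | -1.11 | -1.11 | 0.00 |
| jpg | 0.98 | 1.16 | 0.85 | 0.76 | 0.77 | 0.90 | 0.15 |
| mpo | -1.11 | -1.11 | -1.11 | -1.11 | -1.11 | -1.11 | 0.00 |
| png | 0.58 | 0.56 | 0.42 | 0.40 | 0.41 | 0.48 | 0.08 |
| tif | -1.11 | -1.11 | -1.11 | -1.11 | -1.11 | -1.11 | 0.00 |
| webp | -1.11 | -1.11 | -1.11 | -1.11 | -1.11 | -1.11 | 0.00 |
| dng | -1.11 | -1.11 | -1.11 | -1.11 | -1.11 | -1.11 | 0.00 |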

Result for this PR, which uses `Pillow` to check the shape and type (RGB or not) of images.

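| Extension | 0 | 1 | 2 | 3 | 4 | Mean | Std |
|---|---|---|---|---|---|---|---|
| bmp | 0.04 | 0.02 | 0.02 | 0.02 | 0.03 | 0.03 | 0.01 |
| jpg | 0.03 | 0.02 | 0.03 | 0.02 | 0.02 | 0.03 | 0.00 |
| mpo | 0.04 | 0.04 | 0.06 | 0.04 | 0.04 | 0.04 | 0.01 |
| png | 0.02 | 0.02 | 0.04 | 0.02 | 0.02 | 0.02 | 0.01 |
| tif | 0.05 | 0.06 | 0.08 | 0.05 | 0.05 | 0.06 | 0.01 |
| webp | 0.10 | 0.11 | 0.11 | 0.09 | 0.11 | 0.10 | 0.00 |
| dng | 0.06 | 0.06 | 0.05 | 0.06 | 0.06 | 0.06 | 0.00 |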
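For the two formats that both versions can load, the reported means give a rough speed ratio on this 100-image dummy set. The snippet below is only a back-of-the-envelope check using the "Mean" columns as printed; since those figures are rounded to two decimals, the ratios are coarse.

```python
# Mean load times in seconds, copied from the "Mean" columns of the two tables
# above; only jpg and png have timings for supervision 0.24.0.
mean_before = {"jpg": 0.90, "png": 0.48}
mean_after = {"jpg": 0.03, "png": 0.02}

# Ratio of old to new load time per extension.
speedup = {ext: mean_before[ext] / mean_after[ext] for ext in mean_before}

for ext, ratio in speedup.items():
    print(f"{ext}: about {ratio:.0f}x faster")
```

On these rounded figures, that is roughly 30x for jpg and 24x for png.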

Docs

Docs updated? What were the changes: Not changed. Please suggest whether the docs need changes.

LinasKo · 2024-10-31T14:12:59Z

You never cease to surprise, @patel-zeel; such a thorough analysis! ⭐

Adding this to the back of my review backlog, but I already know it will be a delight.

patel-zeel · 2024-11-01T07:55:10Z

Thank you for your kind words, @LinasKo. It's a pleasure to help improve a widely used library. Looking forward to your feedback!

patel-zeel · 2024-12-31T09:56:28Z

@LinasKo a small correction. I recently found that Ultralytics uses this function to read pfm and other image files during the `yolo predict` call. It uses cv2, not Pillow, so I was able to process pfm images with Ultralytics. However, the fact remains that Pillow does not support color pfm images. A Pillow PR added grayscale pfm support and points to this issue for the lack of color pfm support. That issue has been pinned on Pillow since May 2016, so it does not look like it will be resolved in the near future. However, unless the pfm format dominates some domain, we are not bottlenecked by it.

SkalskiP · 2025-01-08T09:07:36Z

supervision/dataset/formats/yolo.py

-        h, w, _ = image.shape
+        w, h = image.size
        resolution_wh = (w, h)
+        if image.mode != "RGB":


@patel-zeel looks like we can simplify the code here. There is no need for nested ifs.

        if image.mode not in ("RGB", "L"):
            raise ValueError(
                f"Images must be 'RGB' or 'grayscale', but {image_path} mode is '{image.mode}'."
            )

SkalskiP · 2025-01-08T09:09:36Z

supervision/dataset/formats/yolo.py

        resolution_wh = (w, h)
+        if image.mode != "RGB":
+            if image.mode == "L":
+                image = image.convert("RGB")


It seems to me that conversion to RGB is not necessary. The image is not used in the further part of the function.

Right, that'd save us the time wasted in the conversion. I checked the other extreme as well. If we convert all images to RGB by default, that adds a bit of unnecessary overhead. So, this change will improve the speed further. Thank you for pointing it out, @SkalskiP.
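A minimal sketch of the approach discussed in this thread, assuming Pillow is installed: read only the size and mode, reject unexpected modes, and skip any conversion. `read_resolution_wh` and `make_png` are hypothetical helpers for illustration, not the code merged in this PR. According to the Pillow documentation, `Image.open` identifies the file lazily and does not decode pixel data until it is needed, which is likely why this check is cheap.

```python
from io import BytesIO

from PIL import Image


def read_resolution_wh(source, name="<in-memory image>"):
    """Return (width, height) after checking the mode; nothing is converted."""
    with Image.open(source) as image:
        if image.mode not in ("RGB", "L"):
            raise ValueError(
                f"Images must be 'RGB' or 'grayscale', but {name} mode is '{image.mode}'."
            )
        return image.size  # Pillow reports size as (width, height)


def make_png(mode, size):
    """Build a small in-memory PNG so the example needs no files on disk."""
    buffer = BytesIO()
    Image.new(mode, size).save(buffer, format="PNG")
    buffer.seek(0)
    return buffer


resolution_wh = read_resolution_wh(make_png("L", (64, 48)))
print(resolution_wh)
```

Passing an RGBA image, for example `make_png("RGBA", (8, 8))`, raises the `ValueError` instead of returning a size.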

SkalskiP · 2025-01-08T09:15:33Z

supervision/dataset/formats/yolo.py

        str(path)
        for path in list_files_with_extensions(
-            directory=images_directory_path, extensions=["jpg", "jpeg", "png"]
+            directory=images_directory_path, extensions=["*"]


There is a small but important side effect of this change. If the directory contains files other than images, we will try to load them too. For example, macOS puts a .DS_Store file in the directory. @patel-zeel I suggest listing here the image extensions that you have tested.

I agree. On it!
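To illustrate the concern, here is a small standard-library sketch of filtering by an explicit extension list. `list_image_paths` is a hypothetical helper, not supervision's `list_files_with_extensions`, and `IMAGE_EXTENSIONS` is an assumed list built from the formats in the timing tables plus "jpeg" and "tiff"; the list that was actually merged may differ. Case-insensitive matching is also an assumption made for this sketch.

```python
import tempfile
from pathlib import Path

# Assumed list for illustration only; not necessarily the merged one.
IMAGE_EXTENSIONS = ("bmp", "dng", "jpeg", "jpg", "mpo", "png", "tif", "tiff", "webp")


def list_image_paths(directory, extensions=IMAGE_EXTENSIONS):
    """Return sorted paths of files whose suffix matches an allowed extension."""
    allowed = {f".{ext.lower()}" for ext in extensions}
    return sorted(
        str(path)
        for path in Path(directory).iterdir()
        if path.is_file() and path.suffix.lower() in allowed
    )


with tempfile.TemporaryDirectory() as tmp:
    # Two images plus two files that should be ignored.
    for name in ("a.jpg", "b.TIF", ".DS_Store", "notes.txt"):
        (Path(tmp) / name).touch()
    found = [Path(p).name for p in list_image_paths(tmp)]

print(found)
```

With a wildcard pattern, `.DS_Store` and `notes.txt` would be picked up as well; the explicit list skips them.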

SkalskiP · 2025-01-08T09:25:59Z

@patel-zeel I apologize for making you wait so long for a review of this PR. I was on a long break at the turn of the year. I left some comments, please address them.

patel-zeel · 2025-01-08T09:34:48Z

No worries, @SkalskiP, I hope you had a good time during the break. Thank you for the review! I have addressed the comments and applied the changes.

SkalskiP · 2025-01-08T10:37:08Z

supervision/dataset/formats/yolo.py

        for path in list_files_with_extensions(
-            directory=images_directory_path, extensions=["jpg", "jpeg", "png"]
+            directory=images_directory_path,
+            extensions=[


@patel-zeel small request: don't break file extensions onto separate lines; list them all on one line if possible

I kept them on a single line, @SkalskiP. Pre-commit (specifically ruff-format) forced them onto separate lines.

eeeh that's what I thought so

SkalskiP · 2025-01-08T10:37:36Z

@patel-zeel one more small comment and we should be good to go

SkalskiP · 2025-01-08T11:42:50Z

@patel-zeel, thanks a lot! 🙏🏻 approved and merging!

patel-zeel · 2025-01-09T03:51:35Z

Thank you for the review and merge, @SkalskiP!

I'd like to mention that this PR not only extends support for additional image extensions but also significantly accelerates data loading in the `sv.DetectionDataset.from_yolo` function. The speedup is approximately 14x (or 1300%), as demonstrated in this colab.

SkalskiP · 2025-01-09T09:26:29Z

Hi @patel-zeel 👋🏻 yes I know! in fact this is the main angle I plan to use when promoting this change.

replace cv2 with PIL for speed

f4850b8

LinasKo added the hacktoberfest-accepted label Oct 31, 2024

Merge branch 'roboflow:develop' into feat/expand-image-formats

512edd0

patel-zeel added 2 commits January 7, 2025 22:16

expand support to grayscale similar to cv2

2ed66bf

Merge branch 'feat/expand-image-formats' of https://github.com/patel-…

3415c15

…zeel/supervision into feat/expand-image-formats

patel-zeel requested review from SkalskiP and onuralpszr as code owners January 7, 2025 16:46

Merge branch 'roboflow:develop' into feat/expand-image-formats

572ea32

onuralpszr self-assigned this Jan 7, 2025

patel-zeel mentioned this pull request Jan 7, 2025

Include additional file format in load_yolo_annotations #1769

Closed

1 task

SkalskiP requested changes Jan 8, 2025

patel-zeel added 2 commits January 8, 2025 15:02

explicit extensions, simplified code, speed improvement

d33c159

Merge branch 'feat/expand-image-formats' of https://github.com/patel-…

6aedf48

…zeel/supervision into feat/expand-image-formats

SkalskiP requested changes Jan 8, 2025

SkalskiP approved these changes Jan 8, 2025

SkalskiP merged commit 651ed9d into roboflow:develop Jan 8, 2025
21 checks passed

patel-zeel mentioned this pull request Jan 19, 2025

Allow TIFF (and more) image formats in load_yolo_annotations #1554

Closed

2 tasks
