enforce pickleability for v2 transforms and wrapped datasets #7860
pmeier merged 10 commits into pytorch:main from
Conversation
NicolasHug
left a comment
Thanks Philip, some comments below but LGTM if green.
Perhaps we'll want to add more tests w.r.t. multiprocessing_context="spawn"
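Since `multiprocessing_context="spawn"` sends datasets and transforms to worker processes via pickle, a cheap proxy for such a test is to assert that the object survives a pickle round trip directly. A minimal sketch (the `Identity` transform is a made-up stand-in for whatever object is under test):

```python
import pickle


def assert_pickleable(obj):
    # spawn-based multiprocessing serializes objects with pickle before
    # handing them to workers, so a dumps/loads round trip is a cheap
    # proxy check that does not require starting any processes
    reconstructed = pickle.loads(pickle.dumps(obj))
    assert type(reconstructed) is type(obj)
    return reconstructed


# hypothetical stand-in for a v2 transform / wrapped dataset under test
class Identity:
    def __call__(self, x):
        return x


restored = assert_pickleable(Identity())
assert restored(5) == 5
```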
With c68a6de, we are adding roughly 1.5 minutes of test time.

After 5358620, we are down to roughly 30 seconds of extra time.

After f339e6c ("while pretending I'm on macOS"), we are down to ~20 seconds of extra runtime compared to
```diff
 def test_num_examples(self, config):
     with self.create_dataset(config) as (dataset, info):
-        assert len(dataset) == info["num_examples"]
+        assert len(list(dataset)) == len(dataset) == info["num_examples"]
```
We never actually consumed the dataset before, so any failure that did not happen on the first sample went undetected. Fortunately, only one test was broken, which I'll flag below.
```diff
 tmpdir = pathlib.Path(tmpdir) / "caltech256" / "256_ObjectCategories"

-categories = ((1, "ak47"), (127, "laptop-101"), (257, "clutter"))
+categories = ((1, "ak47"), (2, "american-flag"), (3, "backpack"))
```
datasets.Caltech relies on the fact that all categories are present. When we actually consume the dataset (see above), the old fake data setup falls flat. Our options are:

1. Fix the dataset to account for gaps in the categories.
2. Create all categories as fake data.
3. Create fake data that starts at the first category without any gaps, but does not cover all categories.

Option 3 is by far the least amount of work, so I went for that here.
```python
"occlusion": labels_tensor[:, 7],
"pose": labels_tensor[:, 8],
"invalid": labels_tensor[:, 9],
"bbox": labels_tensor[:, 0:4].clone(),  # x, y, width, height
```
Views on a tensor cannot be pickled correctly, meaning that, regardless of the v2 wrapper, datasets.Widerface has never worked in a spawn context.
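The `.clone()` fix above can be illustrated as follows (a sketch assuming `torch` is installed; the `labels_tensor` contents are made up). Cloning gives the slice its own storage, so the pickled payload is self-contained rather than tied to the parent tensor's storage:

```python
import pickle

import torch

# made-up stand-in for the WIDERFace annotation tensor
labels_tensor = torch.arange(20).reshape(2, 10)

# labels_tensor[:, 0:4] is a view into labels_tensor's storage; cloning
# materializes it into an independent tensor that pickles cleanly and
# can be shipped to a spawned worker on its own
bbox = labels_tensor[:, 0:4].clone()  # x, y, width, height

restored = pickle.loads(pickle.dumps(bbox))
assert torch.equal(restored, labels_tensor[:, 0:4])
```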
Still LGTM, but the test duration is a bit unfortunate. I think we'd be better off testing just 2-3 of these datasets (or even just one, TBH, as one test is still infinitely better than zero). Also, ideally we'd let them have
…#7860)

Summary: (Note: this ignores all push blocking failures!)
Reviewed By: matteobettini
Differential Revision: D48900370
fbshipit-source-id: be1b23dcab58d2a8b5bca7190f94c0123263d036
Fixes #6753 (comment).
pickle a little for it to be able to deserialize the object. We only need to overwrite the `__reduce__` method. Again, we also add tests to enforce pickleability in the future.

cc @vfdev-5
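The `__reduce__` approach can be sketched like this (a hypothetical wrapper class, not the actual torchvision code). Returning `(callable, args)` tells pickle to rebuild the object by calling the constructor again instead of serializing the instance `__dict__`, which sidesteps unpicklable derived state:

```python
import pickle


class WrappedDataset:
    """Hypothetical wrapper whose derived state (here, a lambda)
    would make default pickling fail."""

    def __init__(self, dataset):
        self.dataset = dataset
        # derived state that cannot be pickled directly;
        # it is simply recomputed when the object is rebuilt
        self._transform = lambda x: x * 2

    def __reduce__(self):
        # rebuild by calling WrappedDataset(self.dataset) on unpickling,
        # so pickle never touches the unpicklable _transform attribute
        return WrappedDataset, (self.dataset,)

    def __getitem__(self, idx):
        return self._transform(self.dataset[idx])


wrapped = WrappedDataset([1, 2, 3])
restored = pickle.loads(pickle.dumps(wrapped))  # fails without __reduce__
assert restored[0] == 2
```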