Adding Preset Transforms in reference scripts #3317
Conversation
Force-pushed from 1c46a69 to 992d41f.
The proposed approach looks great to me, good to go for the other tasks as well!
Codecov Report

@@           Coverage Diff           @@
##           master    #3317   +/-   ##
=======================================
  Coverage   73.93%   73.93%
=======================================
  Files         104      104
  Lines        9594     9594
  Branches     1531     1531
=======================================
  Hits         7093     7093
  Misses       2024     2024
  Partials      477      477

Continue to review the full report at Codecov.
Force-pushed from 71b7091 to a2e9306.
The changes for segmentation and video classification look good to me as well!
Summary:
* Adding presets in the classification reference scripts.
* Adding presets in the object detection reference scripts.
* Adding presets in the segmentation reference scripts.
* Adding presets in the video classification reference scripts.
* Moving the flip to the end to align with the image classification signature.

Reviewed By: datumbox
Differential Revision: D26226607
fbshipit-source-id: 965f54e18d01fce6c1225eb2b6bdea1e4efd3998
class ClassificationPresetEval:
    def __init__(self, crop_size, resize_size=256, mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)):
        # Deterministic eval pipeline (`transforms` is torchvision.transforms): resize, center-crop, to-tensor, normalize.
        self.transforms = transforms.Compose([
            transforms.Resize(resize_size),
            transforms.CenterCrop(crop_size),
            transforms.ToTensor(),
            transforms.Normalize(mean=mean, std=std),
        ])

    def __call__(self, img):
        return self.transforms(img)
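For illustration, such a preset is intended to be passed as a dataset's transform; a minimal usage sketch, assuming an ImageFolder-style validation directory (the path and crop size below are placeholders):

import torchvision

# Hypothetical usage: apply the evaluation preset when building the validation set.
val_dataset = torchvision.datasets.ImageFolder(
    "path/to/val",  # placeholder directory
    transform=ClassificationPresetEval(crop_size=224),
)
img, label = val_dataset[0]  # img is a normalized tensor ready for the model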
Usually, for the text domain, we need to download the transform, for example a sentencepiece model or a vocabulary saved in a text file.
I'm writing here what we discussed on the call.
It seems that supporting your case is possible by using PyTorch Hub's load_state_dict_from_url() method and then passing the result to your code. This is a very common pattern in TorchVision, used mainly for pre-trained models. Example:
vision/torchvision/models/mobilenetv3.py, lines 244 to 249 at 97885cb:
model = MobileNetV3(inverted_residual_setting, last_channel, **kwargs)
if pretrained:
    if model_urls.get(arch, None) is None:
        raise ValueError("No checkpoint is available for model type {}".format(arch))
    state_dict = load_state_dict_from_url(model_urls[arch], progress=progress)
    model.load_state_dict(state_dict)
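As a side note for the text case mentioned above, assets that are not state dicts (e.g. a sentencepiece model or a vocabulary file) could instead be fetched with torch.hub's plain file download helper; a minimal sketch, where the URL, filename and cache directory are hypothetical:

import os
import torch.hub

def fetch_transform_asset(url, filename, root=os.path.expanduser("~/.cache/transform_assets")):
    # Hypothetical helper: download a serialized transform asset once and
    # return the cached local path for the text transform to load.
    os.makedirs(root, exist_ok=True)
    path = os.path.join(root, filename)
    if not os.path.exists(path):
        torch.hub.download_url_to_file(url, path, progress=True)
    return path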
@netw0rkf10w The goal is to provide the training and inference transforms of each model/pipeline in an organized way so that people can reproduce the models. Here is an example of complex presets: 1 and 2.
Thanks, @datumbox, for your reply! Is there a discussion thread (or a GitHub issue) on the topic that I can read, or was it internal?
Unfortunately, many of these discussions happened outside of GitHub. This is quite problematic and we want to change it, because in circumstances like this it's hard to give information to people and it does not help with transparency. So, apologies for not being able to point you to a public thread. To clarify the situation, below is a summary of what motivated the change; let me know if you need more info.

Preset preprocessing transforms are the transformations applied to the data before feeding them to an ML model. They are typically separated into two categories: those applied during training and those applied during inference. Examples of such transforms include Data Augmentation techniques, Normalization/Scaling methods, and other ad hoc transformations applied to the data as a preliminary step (binary-to-bitmap conversion for Vision, Fast Fourier Transforms for Audio, Tokenization for Text, etc.).

The preset transforms are a crucial part of the model, and having access to them is necessary to understand how a model was created and how to use it. Disclosing which training transforms were used is an important part of reproducibility and crucial for understanding the assumptions and properties of a specific model. The latter is particularly true when using transfer learning and porting a model from one domain to another (for example, the Zoom transform can be used for augmentation in ImageNet Classification but not for Cancer Detection). Similarly, having access to the transforms used during inference is critical, because without them one can't use the model.

This is the reason we decided to bring these preset transforms as close to the training references as possible. By putting them together, users are able to reproduce the training, adjust the scripts to meet their needs, do transfer learning, etc. Hope that makes sense.
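To make the training/inference split concrete, here is a minimal sketch of a training-time preset in the spirit of the evaluation preset above; the specific augmentations, defaults, and class name are illustrative, not the exact ones shipped in the reference scripts:

import torchvision.transforms as transforms

class ClassificationPresetTrainSketch:
    # Illustrative training preset: random augmentation followed by the same
    # normalization used at evaluation time, so only the augmentation differs.
    def __init__(self, crop_size, mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225), hflip_prob=0.5):
        self.transforms = transforms.Compose([
            transforms.RandomResizedCrop(crop_size),
            transforms.RandomHorizontalFlip(p=hflip_prob),
            transforms.ToTensor(),
            transforms.Normalize(mean=mean, std=std),
        ])

    def __call__(self, img):
        return self.transforms(img)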
@datumbox That totally makes sense! Thank you very much for your detailed explanation!
Updating the classification, object detection, segmentation, and video classification reference scripts.
The Similarity reference script was skipped because it's not a real recipe. A reference implementation for it can be seen at 71b7091.