Rollout planning for transforms v2

This issue is for discussing how and when we are going to roll out transforms v2 from `torchvision.prototype.transforms` to `torchvision.transforms`. The new API depends on the `torchvision.prototype.datapoints` namespace. Since no equivalent exist in the stable `torchvision` yet, we can simply move it.

### Functional API

The functional API of v2 is fully BC with v1, and so we can just drop it in. 

In addition, since v2 now has public kernels for tensor and PIL images, we should also deprecate the `torchvision/transforms/functional_{pil,tensor}.py` modules. Although not prefixed by an underscore, they were always considered private. Note that the public kernels of v2 are mostly but not always a drop-in replacement for the private kernels of v1. In v1 sometimes some common preprocessing was done on the dispatcher, e.g.

https://github.com/pytorch/vision/blob/8985b598a69250d65959941c863d76a4225ae7ac/torchvision/transforms/functional.py#L690-L696

https://github.com/pytorch/vision/blob/8985b598a69250d65959941c863d76a4225ae7ac/torchvision/transforms/functional.py#L725

https://github.com/pytorch/vision/blob/8985b598a69250d65959941c863d76a4225ae7ac/torchvision/transforms/functional_tensor.py#L695-L700

Since in v2 the kernels have to be able to stand alone, this preprocessing had to be moved into the kernels and thus changing the signature:

https://github.com/pytorch/vision/blob/8985b598a69250d65959941c863d76a4225ae7ac/torchvision/prototype/transforms/functional/_geometry.py#L1457-L1464

https://github.com/pytorch/vision/blob/8985b598a69250d65959941c863d76a4225ae7ac/torchvision/prototype/transforms/functional/_geometry.py#L1272-L1279


But, to reiterate, since the kernels were considered private in v1, this doesn't constitute a BC break.

### Class API

The class API of v2 breaks BC in two ways compared to v1:

- The transforms are no longer JIT scriptable. JIT requires pretty static behavior, but core design choices of the transforms go against that. For example, the independence of the input structure as well as the dynamic dispatch are not supported by JIT. Meaning, JIT is simply not compatible with what we want to achieve.
- [EDIT] *we have decided to provide BC for these methods (at least with a proper deprecation cycle* https://github.com/pytorch/vision/pull/7092#issuecomment-1384325813)
  [Some random transformations](https://github.com/pytorch/vision/pull/7092#issuecomment-1384533260) have a public and static method `get_params` that allows users to perform the random sampling without instantiating the transform. This is a common idiom for v1, since it doesn't support transforming multiple inputs jointly. Meaning, to transform an image and a bounding box at the same time, one would sample the parameters once by calling `get_params` and subsequently call the respective functionals themselves. The latter can range from a single call to more complex ones that basically replicate the original transform. Transforms v2 supports jointly transforming multiple datapoints natively and thus there is no longer a need for this to be public.
  Since the random sampling of course still needs to happen, it is possible to [temporarily keep the public method](https://github.com/pytorch/vision/pull/7095), but start to deprecate it as soon as transforms v2 is considered stable. We have identified another low-priority use case for a public `get_params`, but there is already a [design proposal](https://github.com/pytorch/vision/issues/7027) for v2 that would improve the UX over v1 quite a bit and doesn't rely on public parameter sampling.

The BC breakages are not random, but transforms v2 brings a lot of new functionality. Although we have extensive tests and made sure our [own training pipelines run smoothly with it](https://github.com/pytorch/vision/pull/6433), the API cannot be considered stable from the get go, since it wasn't battle tested yet. This means that we will start out in a beta state. We are confident that we can bring it to a stable state in one or two release cycles. During this time we are not yet bound to BC, but we don't expect any large scale changes.

Even after transforms v2 is considered stable, we can't replace v1 directly due to the BC breakages. Thus, the roll-out plan suggested here is to create two new namespaces: `torchvision.transforms.v1` and `torchvision.transforms.v2`. With that, both versions of the API can coexist until we are confident that v2 can replace v1. Initially `torchvision/transforms/__init__.py` will do `from .v1 import *` and thus users don't have to change anything. As soon as we consider v2 stable, we deprecate the two features of v1 for which we won't keep BC, but only under the main namespace, i.e. `torchvision.transforms`. Users that imported directly from `torchvision.transforms.v1` should not see a warning. Finally, after the deprecation period is over, we switch the import in `torchvision/transforms/__init__.py` to `from .v2 import *` and complete the transition.

Afterwards, we still need to decide what we do with the `v1` and `v2` namespaces:

- `v1`: We could keep this indefinitely so users for which the parts where we break BC are critical, can continue to use the v1 transforms and only need to change an import once. Given that only the JIT scriptability is broken without replacement, we could also start to deprecate the namespace as soon as JIT is deprecated in PyTorch core. It makes little sense to keep the old API around if their only significant difference is something that is no longer maintained or maybe even replaced by something else. In any case, everything in this namespace would no longer be maintained by us.
- `v2`: Although this obsolete after the transition is complete, we might still keep it around to minimize disruption. Users that started using the v2 API before the transition was complete, could continue to import `torchvision.transforms.v2` and are not forced to change the namespace a second time. 

### Timeline summary

- `0.15`:
  - Move `torchvision.prototype.datapoints` to a new namespace `torchvision.datapoints`
  - Drop `torchvision.prototype.transforms.functional` into `torchvision.transforms.functional`
  - Deprecate `torchvision.transforms.functional_pil` and `torchvision.transforms.functional_tensor` [^1]
  - Move `torchvision.transforms.transforms` to a new namespace `torchvision.transforms.v1` and deprecate the old one [^1]
  - Move `torchvision.prototype.transforms` to a new namespace `torchvision.transforms.v2`
  - Expose everything from `torchvision.transforms.v1` through `torchvision.transforms`
- `0.15 + x` [^2]
  - Deprecate JIT scriptability and usage of static `get_params` methods where applicable on transforms exposed through `torchvision.transforms`. Let the warning point `torchvision.transforms.v1` in case the deprecated functionality is critical for the users.
- `0.15 + x + 2` [^2]
  - Expose everything from `torchvision.transforms.v2` through `torchvision.transforms`
  - Keep `torchvision.transforms.v1` indefinitely or until JIT is deprecated from PyTorch core, albeit unmaintained in any case

Going by our regular release cycles, this means transforms v2 should be accessible from a stable release in H1 2023, will likely be considered stable in H2 2023 and fully replace transforms v1 in H1 2024.

[^1]: Although initiated by the v2 roll out, the deprecation is independent of it and should happen according to the deprecation policy.
[^2]: `x` denotes the time we need to bring the transforms v2 from a beta to a stable state. Current estimate is one or two release cycles.

cc @vfdev-5 @datumbox @bjuncek

	def perspective(
	img: Tensor,
	startpoints: List[List[int]],
	endpoints: List[List[int]],
	interpolation: InterpolationMode = InterpolationMode.BILINEAR,
	fill: Optional[List[float]] = None,
	) -> Tensor:

	def perspective(
	img: Tensor,
	perspective_coeffs: List[float],
	interpolation: str = "bilinear",
	fill: Optional[Union[int, float, List[float]]] = None,
	) -> Tensor:

	def perspective(
	inpt: datapoints.InputTypeJIT,
	startpoints: Optional[List[List[int]]],
	endpoints: Optional[List[List[int]]],
	interpolation: InterpolationMode = InterpolationMode.BILINEAR,
	fill: datapoints.FillTypeJIT = None,
	coefficients: Optional[List[float]] = None,
	) -> datapoints.InputTypeJIT:

	def perspective_image_tensor(
	image: torch.Tensor,
	startpoints: Optional[List[List[int]]],
	endpoints: Optional[List[List[int]]],
	interpolation: InterpolationMode = InterpolationMode.BILINEAR,
	fill: datapoints.FillTypeJIT = None,
	coefficients: Optional[List[float]] = None,
	) -> torch.Tensor:

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Rollout planning for transforms v2 #7097

Functional API

Class API

Timeline summary

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Rollout planning for transforms v2 #7097

Description

Functional API

Class API

Timeline summary

Footnotes

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions