Update on TorchAudio’s future

Dear TorchAudio users,

TorchAudio is the most popular audio library for PyTorch. It provides critical transforms, models and datasets that we know the community relies on. That is why we want to let the community know that we have started a refactoring effort to transition TorchAudio into a **maintenance phase**. This process will involve the **removal of some user-facing features**. We want to achieve three goals with this effort:

1. **Make TorchAudio easier to maintain to ensure long-term reliability.** We plan to eliminate all C++ code so that TorchAudio is a Python-only library. We also plan to reduce external dependencies as much as possible. Both efforts will simplify testing and release.
2. **Reduce redundancies with the rest of the PyTorch ecosystem.** Some of the functionality in TorchAudio is also available in TorchVision and TorchCodec. We are working across all three libraries to ensure that a given capability lives in only one library.
3. **Focus on TorchAudio’s strengths.** Those strengths are the audio transforms, models and datasets that are integral to users' training and inference pipelines. As a result, we will deprecate and eventually remove some functionality that is outside of these strengths.

The diagram below depicts the various components of TorchAudio. Each component is highlighted according to the user-facing API change that applies to it:

<img width="735" alt="Image" src="https://github.com/user-attachments/assets/ecf01218-77fb-49c7-96bf-55e582111954" />

Starting with TorchAudio 2.8 (expected around August 2025), APIs slated for removal will trigger a deprecation warning. These APIs will be fully removed in TorchAudio 2.9 (anticipated by the end of 2025).
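To find out which calls in your own code hit an API slated for removal, one option is to record the warnings raised while that code runs. The snippet below is a minimal sketch using only Python's standard `warnings` module. `legacy_api` is a hypothetical stand-in, not a real TorchAudio function, and since this announcement does not say which warning category TorchAudio uses, the helper records every category rather than filtering on one.

```python
import warnings


def legacy_api():
    """Hypothetical stand-in for an API that warns before its removal."""
    warnings.warn(
        "legacy_api is deprecated and will be removed in a future release",
        UserWarning,  # assumption: the real category may differ
        stacklevel=2,
    )
    return 42


def collect_warnings(fn, *args, **kwargs):
    """Call fn and return its result plus (category, message) for each warning."""
    with warnings.catch_warnings(record=True) as caught:
        # "always" disables the once-per-location suppression so nothing is missed.
        warnings.simplefilter("always")
        result = fn(*args, **kwargs)
    return result, [(w.category.__name__, str(w.message)) for w in caught]


result, found = collect_warnings(legacy_api)
for category, message in found:
    print(f"{category}: {message}")
```

In a test suite, running Python with `-W error` turns every warning into an exception, which makes any remaining use of a deprecated API fail loudly instead of scrolling past in the logs.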

Most of the APIs in the `transforms`, `functional`, `compliance.kaldi`, `models` and `pipelines` modules will remain. These are the APIs that we identified as the most popular and valuable.

* A few APIs, specifically those relying on C++ implementations such as RNNT loss and forced alignment, may be dropped. Others, such as `lfilter` and `overdrive`, will switch to pure-Python implementations, which might affect performance. We are exploring options to retain the C++-backed APIs, but retaining them is unlikely.
* Remaining APIs will be compatible with the latest stable PyTorch version. No new features will be added.

**The decoding and encoding capabilities of TorchAudio for both audio and video data will migrate to [TorchCodec](https://github.com/pytorch/torchcodec)**, where we are consolidating all of PyTorch's media decoding and encoding. TorchAudio’s decoding and encoding APIs will be deprecated starting in TorchAudio 2.8 and removed in TorchAudio 2.9, so we encourage users to migrate to TorchCodec as soon as possible. TorchCodec already supports video and audio decoding, and encoding will be supported soon. While there isn't a direct 1:1 API mapping, the migration process should be smooth. Please report any issues in the [TorchCodec repository](https://github.com/pytorch/torchcodec/issues).
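As a rough illustration of what a decoding migration might look like, the sketch below wraps TorchCodec's `AudioDecoder` in a small helper that returns a `(waveform, sample_rate)` pair. The helper name `load_audio` and the file name in the usage comment are hypothetical, and the sketch assumes a TorchCodec release that ships `torchcodec.decoders.AudioDecoder`. Check the TorchCodec documentation for the options your version supports.

```python
def load_audio(path):
    """Decode a whole audio file and return (waveform, sample_rate).

    Hypothetical helper, not part of TorchAudio or TorchCodec.
    """
    # Imported lazily so the helper can be defined even where TorchCodec
    # is not installed; the import only runs when the helper is called.
    from torchcodec.decoders import AudioDecoder

    decoder = AudioDecoder(path)
    samples = decoder.get_all_samples()
    # samples.data is a tensor laid out as (num_channels, num_samples).
    return samples.data, samples.sample_rate


# Usage (assumes a local file named "speech.wav" exists):
# waveform, sample_rate = load_audio("speech.wav")
```

Keeping the decoding call behind a single helper like this means the rest of a pipeline only depends on the returned tensor and sample rate, so the backend can be swapped in one place.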

**All other modules and APIs will be removed in TorchAudio 2.9.**

We understand that these changes may be disruptive. Unfortunately, we believe they are necessary for us to guarantee TorchAudio’s stability in the future.
