Added encoding and bits_per_sample to soundfile's backend save() #1274
@prabhat00155 thanks for working on this. The PR looks good overall.
Thanks for the update. I tested the code and it works fine for the wav format.
I have not checked flac/vorbis, but for the sph format there is an issue with subtype generation. Please refer to the comment.
The following is the script I used.
```python
import os

import torch
import torchaudio.backend._soundfile_backend as backend

# The output directory must exist before saving.
os.makedirs('tmp3', exist_ok=True)

data = torch.randn(2, 124, dtype=torch.float32)

# wav: explicit (encoding, bits_per_sample) combinations
encs = [
    (None, 8),
    ('PCM_U', None),
    ('PCM_U', 8),
    ('PCM_S', None),
    ('PCM_S', 16),
    ('PCM_S', 32),
    ('PCM_F', None),
    ('PCM_F', 32),
    ('PCM_F', 64),
    ('ULAW', None),
    ('ULAW', 8),
    ('ALAW', None),
    ('ALAW', 8),
]
for enc, bps in encs:
    path = f'tmp3/{enc}_{bps}.wav'
    print(path)
    backend.save(path, data, sample_rate=8000, encoding=enc, bits_per_sample=bps)

# wav: no encoding or bits_per_sample given, only the tensor dtype varies
dtypes = [
    # torch.uint8,
    torch.int16,
    torch.int32,
    torch.float32,
]
for dtype in dtypes:
    path = f'tmp3/None_None_{dtype}.wav'
    print(path)
    backend.save(path, data.to(dtype), sample_rate=8000)

# flac: not checked
# bpss = [
#     # 8
#     # 16,
#     # 24,
#     # 32,
#     None,
# ]
# for bps in bpss:
#     path = f'tmp3/{bps}.flac'
#     backend.save(path, data, sample_rate=8000, encoding=None, bits_per_sample=bps)

# sph: explicit (encoding, bits_per_sample) combinations
encs = [
    ('PCM_S', None),
    ('PCM_S', 16),
    ('PCM_S', 32),
    ('ULAW', None),
    ('ULAW', 8),
    ('ALAW', None),
    ('ALAW', 8),
]
for enc, bps in encs:
    path = f'tmp3/{enc}_{bps}.sph'
    print(path)
    backend.save(path, data, sample_rate=8000, encoding=enc, bits_per_sample=bps)
```
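For context on what "subtype generation" refers to, here is a minimal sketch of how the wav `(encoding, bits_per_sample)` pairs in the script might be mapped to libsndfile subtype names. `wav_subtype` is a hypothetical helper written for illustration, not the function added in this PR. The subtype strings (`PCM_U8`, `PCM_16`, `FLOAT`, ...) are the names libsndfile uses, but the defaults chosen when `bits_per_sample` is `None` are assumptions.

```python
# Hypothetical lookup table: (encoding, bits_per_sample) -> libsndfile subtype name.
_WAV_SUBTYPES = {
    ("PCM_U", 8): "PCM_U8",
    ("PCM_S", 16): "PCM_16",
    ("PCM_S", 24): "PCM_24",
    ("PCM_S", 32): "PCM_32",
    ("PCM_F", 32): "FLOAT",
    ("PCM_F", 64): "DOUBLE",
    ("ULAW", 8): "ULAW",
    ("ALAW", 8): "ALAW",
}

# Assumed defaults when bits_per_sample is not given.
_DEFAULT_BITS = {"PCM_U": 8, "PCM_S": 32, "PCM_F": 32, "ULAW": 8, "ALAW": 8}


def wav_subtype(encoding, bits_per_sample=None):
    """Return a subtype name for a wav file, or raise ValueError if unsupported."""
    if encoding is None:
        if bits_per_sample is None:
            raise ValueError("either encoding or bits_per_sample must be given")
        # Assumption: 8-bit wav is stored as unsigned PCM, other depths as signed PCM.
        encoding = "PCM_U" if bits_per_sample == 8 else "PCM_S"
    if bits_per_sample is None:
        bits_per_sample = _DEFAULT_BITS.get(encoding)
    try:
        return _WAV_SUBTYPES[(encoding, bits_per_sample)]
    except KeyError:
        raise ValueError(
            f"wav does not support {encoding} with {bits_per_sample} bits"
        ) from None


if __name__ == "__main__":
    for enc, bps in [(None, 8), ("PCM_S", None), ("PCM_F", 64), ("ULAW", None)]:
        print(enc, bps, "->", wav_subtype(enc, bps))
```

A table keeps the supported combinations explicit, so an unsupported pair such as `("PCM_S", 8)` fails with a clear error instead of producing a subtype string that the format cannot store. A separate table per container format (wav, sph, flac) would follow the same pattern.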
Can you generate sph with ALAW encoding? I realized that the resulting files from the script above cannot be opened with
SoX doesn't seem to support ALAW for sphere; see https://sourceforge.net/p/sox/code/ci/master/tree/src/sphere.c#l79.
You are right.
@prabhat00155 Thanks! Can you cherry-pick the same commit and make a PR against
…orch#1274) (cherry picked from commit b8fd5e9)
Addresses #1258.