Commit de8a874

Merge pull request #9 from bunyaminergen/develop
MP-SENet Speech Enhancement
2 parents 14bed1b + 2ece714 commit de8a874

12 files changed

+292
-79
lines changed

Binary file not shown.

.db/Callytics.sqlite

8 KB
Binary file not shown.

.docs/documentation/RESOURCES.md

Lines changed: 72 additions & 1 deletion
@@ -14,6 +14,13 @@
 - [Llama Recipes: Examples to get started using the Llama models from Meta](https://github.com/meta-llama/llama-recipes)
 - [timsainb/noisereduce: Noise reduction in python using spectral gating](https://github.com/timsainb/noisereduce/)
 - [pyannote/pyannote-audio: Neural building blocks for speaker diarization](https://github.com/pyannote/pyannote-audio)
+- [microsoft/DNS-Challenge: This repo contains the scripts, models, and required files for the Deep Noise Suppression (DNS) Challenge.](https://github.com/microsoft/DNS-Challenge)
+- [WenzheLiu-Speech/awesome-speech-enhancement: speech enhancement\speech seperation\sound source localization](https://github.com/WenzheLiu-Speech/awesome-speech-enhancement)
+- [nanahou/Awesome-Speech-Enhancement: A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.](https://github.com/nanahou/Awesome-Speech-Enhancement)
+- [jonashaag/speech-enhancement: Collection of papers, datasets and tools on the topic of Speech Dereverberation and Speech Enhancement](https://github.com/jonashaag/speech-enhancement)
+- [yxlu-0102/MP-SENet: Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement](https://github.com/yxlu-0102/MP-SENet)
+- [Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement](https://yxlu-0102.github.io/MP-SENet/)
+- [## SUPERSEDED: THIS DATASET HAS BEEN REPLACED. ## Noisy speech database for training speech enhancement algorithms and TTS models](https://datashare.ed.ac.uk/handle/10283/1942)

 ---

@@ -34,7 +41,8 @@

 ## PyPI

-- [demucs · PyPI](https://pypi.org/project/demucs/)
+- [demucs](https://pypi.org/project/demucs/)
+- [MPSENet](https://pypi.org/project/MPSENet/)

 ---

@@ -43,3 +51,66 @@
 - [`The file is already fully retrieved; nothing to do.`](https://github.com/facebookresearch/llama/issues/760)

 ---
+
+## Paper
+
+- [Dual-Path Transformer Network: Direct Context-Aware Modeling for End-to-End Monaural Speech Separation](https://arxiv.org/abs/2007.13975)
+- [MP-SENet: A Speech Enhancement Model with Parallel Denoising of Magnitude and Phase Spectra](https://arxiv.org/abs/2305.13686)
+- [FINALLY: fast and universal speech enhancement with studio-like quality](https://arxiv.org/abs/2410.05920)
+- [Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement](https://arxiv.org/abs/2308.08926)
+
+---
+
+## Youtube
+
+- [A Course on Speech Enhancement](https://www.youtube.com/playlist?list=PLO9nFIQB53_DU8o0fToNdNFdZuDxD9fAN)
+- [COMS 4995 Final on Speech Enhancement](https://www.youtube.com/watch?v=uRwlSh1FMzc&t=74s)
+- [Achieving Studio-Quality Speech with Generative AI](https://www.youtube.com/watch?v=UxbEjpLMU8s)
+- [How to Fix Bad Podcast Audio](https://www.youtube.com/watch?v=0mPkPQNHsZc)
+- [Speech Enhancement for Cochlear Implant Recipients Using Deep Complex Convolution Transformer With F](https://www.youtube.com/watch?v=i1qTgjMtS2Y)
+- [Vibravox: A Dataset of French Speech Captured with Body-conduction Audio Sensors](https://www.youtube.com/watch?v=4jiQdotz6qY)
+- [2024 종합설계 3팀 2차, Neural Network for Speech Enhancement](https://www.youtube.com/watch?v=yOfTYuc9FEQ)
+- [MIAI Deeptails Seminar : Generative Models as Data-driven Priors for Speech Enhancement](https://www.youtube.com/watch?v=XSLgUsgyzUA)
+- [Hardware Efficient Speech Enhancement With Noise Aware Multi Target Deep Learning](https://www.youtube.com/watch?v=qO6JqDUQlsI)
+- [Diffusion Models for Speech Enhancement | Julius Richter](https://www.youtube.com/watch?v=HMrs6YWDl5M)
+- [Speech Enhancement: Basics & Key Details](https://www.youtube.com/watch?v=5kItH2pq_3E)
+- [Guided Speech Enhancement Network (ICASSP 2023)](https://www.youtube.com/watch?v=JoDqXkAjlh4)
+- [VSANet: Real-time Speech Enhancement Based on Voice Activity Detection and Causal Spatial Attention](https://www.youtube.com/watch?v=GP39vFA2E48)
+- [Research intern talk: Unified speech enhancement approach for speech degradation & noise suppression](https://www.youtube.com/watch?v=_ggfv6eMIJs)
+- [Magnitude and phase spectrum with example](https://www.youtube.com/watch?v=MFOjUgafq0k)
+- [Deep Learning In Audio for Absolute Beginners: From No Experience & No Datasets to a Deployed Model](https://www.youtube.com/watch?v=sqrah49GUkI)
+- [Look Once to Hear: Target Speech Hearing with Noisy Examples](https://www.youtube.com/watch?v=V-XCfnjfQmM)
+
+---
+
+## Wikipedia
+
+- [Speech enhancement](https://en.m.wikipedia.org/wiki/Speech_enhancement)
+
+---
+
+## Hugging Face
+
+- [Models(asteroid)](https://huggingface.co/models?library=asteroid)
+- [cankeles/DPTNet_WHAMR_enhsingle_16k](https://huggingface.co/cankeles/DPTNet_WHAMR_enhsingle_16k)
+- [JacobLinCool/MP-SENet-VB](https://huggingface.co/JacobLinCool/MP-SENet-VB)
+- [JacobLinCool/MP-SENet-DNS](https://huggingface.co/JacobLinCool/MP-SENet-DNS)
+- [ENOT-AutoDL/MP-SENet](https://huggingface.co/ENOT-AutoDL/MP-SENet)
+
+---
+
+## Web
+
+- [Dual-Path Transformer Network: Direct Context-Aware Modeling for End-to-End Monaural Speech Separation](https://paperswithcode.com/paper/dual-path-transformer-network-direct-context-1)
+- [The Audio Developer Conference - ADC is an annual event celebrating all audio development technologies, from music applications and game audio to audio processing and embedded systems.](https://audio.dev/)
+- [Look Once to Hear: Target Speech Hearing with Noisy Examples - CHI '24](https://programs.sigchi.org/chi/2024/program/content/147319)
+- [Reinforcement Learning Based Speech Enhancement for Robust Speech Recognition > Introduction | Class Central Classroom](https://www.classcentral.com/classroom/youtube-reinforcement-learning-based-speech-enhancement-for-robust-speech-recognition-131999)
+
+---
+
+## Dataset
+
+- [VoiceBank+DEMAND](https://datashare.ed.ac.uk/handle/10283/1942)
+- [VoiceBank+DEMAND](https://drive.google.com/drive/folders/19I_thf6F396y5gZxLTxYIojZXC0Ywm8l)
+
+---
