Add DiffSinger BRAPA Phonemizer and BRAPA G2P model #1841
+1,686
−0
Add DiffSinger BRAPA Phonemizer
Overview
This PR adds DiffSingerBrapaPhonemizer, an advanced Portuguese phonemizer for the DiffSinger engine that provides sophisticated phonetic processing and accent support beyond the basic Portuguese phonemizer.
Key Features
Advanced Phonetic Processing
- Extends DiffSingerRefinedPhonemizer for advanced phonetic rule application
Duration-Based Rules
Implements smart phoneme substitutions for short notes:
- a → ax for duration ≤ 45 ms
- i → i0 for duration ≤ 45 ms
- u → u0 for duration ≤ 45 ms
Word Boundary Rules
- s9 → z before vowels, s9 → z9 before voiced consonants
- r9 → r before vowels
- cl → ng when preceded by nasal vowels and followed by voiced plosives
ℹ️ Rule application can be skipped if the lyric has a phonetic hint ([ ]) or starts with /
Technical Implementation
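To make the duration-based and word-boundary rules listed above concrete, here is a minimal Python sketch. It is not the phonemizer's actual code: the function names are hypothetical, the phoneme classes (vowels, voiced consonants, nasal vowels, voiced plosives) are small placeholder sets rather than the real BRAPA inventory, and it assumes the 45 ms threshold applies to the note as a whole and that boundary rules look only at the first phoneme of the following word.

```python
SHORT_NOTE_MS = 45
SHORT_NOTE_SUBSTITUTIONS = {"a": "ax", "i": "i0", "u": "u0"}

# Hypothetical phoneme classes, for illustration only (not the real BRAPA sets).
VOWELS = {"a", "e", "i", "o", "u", "ax", "i0", "u0", "ah"}
VOICED_PLOSIVES = {"b", "d", "g"}
VOICED_CONSONANTS = VOICED_PLOSIVES | {"v", "z", "j", "m", "n", "l"}
NASAL_VOWELS = {"an", "en", "in", "on", "un"}


def has_phonetic_hint(lyric):
    """True if the lyric carries a [ ] hint or starts with '/'."""
    return lyric.startswith("/") or ("[" in lyric and "]" in lyric)


def apply_duration_rules(phonemes, duration_ms):
    """Substitute a, i, u with their short variants on notes of 45 ms or less."""
    if duration_ms > SHORT_NOTE_MS:
        return list(phonemes)
    return [SHORT_NOTE_SUBSTITUTIONS.get(p, p) for p in phonemes]


def apply_boundary_rules(phonemes, next_phoneme):
    """Rewrite the word-final phoneme based on the next word's first phoneme."""
    result = list(phonemes)
    if not result or next_phoneme is None:
        return result
    last = result[-1]
    if last == "s9":
        if next_phoneme in VOWELS:
            result[-1] = "z"
        elif next_phoneme in VOICED_CONSONANTS:
            result[-1] = "z9"
    elif last == "r9" and next_phoneme in VOWELS:
        result[-1] = "r"
    elif (
        last == "cl"
        and len(result) >= 2
        and result[-2] in NASAL_VOWELS
        and next_phoneme in VOICED_PLOSIVES
    ):
        result[-1] = "ng"
    return result


def phonemize_note(lyric, phonemes, duration_ms, next_phoneme=None):
    """Apply both rule groups unless the lyric carries a phonetic hint."""
    if has_phonetic_hint(lyric):
        return list(phonemes)
    shortened = apply_duration_rules(phonemes, duration_ms)
    return apply_boundary_rules(shortened, next_phoneme)
```

For example, `phonemize_note("mas", ["m", "a", "s9"], 200, next_phoneme="a")` rewrites the final `s9` to `z`, while the same call with a lyric starting with `/` leaves the phonemes untouched.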
- dsdict-brapa.yaml for Portuguese-specific phonetic rules
- BrapaG2p class for grapheme-to-phoneme conversion
BRAPA G2P Model
Overview
This PR also introduces the BRAPA (Brazilian Phonetic Alphabet) G2P model, a specialized grapheme-to-phoneme converter optimized for Brazilian Portuguese with enhanced accent handling and extended phonetic features.
Key Features
Enhanced Brazilian Portuguese Support
New Dummy Phonemes for Phonetic Precision
Added specialized placeholder phonemes for accurate phonetic representation:
- s9: Represents S-sound before consonants or at word endings. Can be changed into s or sh
- z9: Represents Z-sound before voiced consonants. Can be changed into z or j
- h9: Represents R-sound before vowels. Can be changed into h, hr or x
- r9: Generic placeholder for rhotic sounds. Can be changed into h, hr, r, rw or x
Extended Phonetic Support
Added extra phonemes for better accent handling
- ah: Represents the default vowel phoneme in stressed syllables where the letter “a” precedes a nasal consonant. Can be orthographically represented as "a" or "â". Examples:
- ng: Default phoneme for nasal vowel + g interaction. Can be changed into g
- wn: Default phoneme for nasal vowel + w interaction. Can be changed into w
- yn: Default phoneme for nasal vowel + y interaction. Can be changed into y
Advanced Lyric Processing Features
New markup system for vocal processing in lyrics:
- cl (closure/glottal stop)
- vf (vocal fry)
Usage
For models/voicebanks that have Portuguese capabilities with BRAPA, these replacements are recommended:
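The sketch below is not that recommended list. It is only a hypothetical Python illustration of how a per-voicebank replacement could be applied, built from the "can be changed into" options given for each placeholder phoneme earlier in this description. The function names and the example choices are assumptions, not part of the PR.

```python
# Allowed targets per placeholder, as listed in the description above.
PLACEHOLDER_OPTIONS = {
    "s9": ("s", "sh"),
    "z9": ("z", "j"),
    "h9": ("h", "hr", "x"),
    "r9": ("h", "hr", "r", "rw", "x"),
    "ng": ("g",),
    "wn": ("w",),
    "yn": ("y",),
}


def build_replacements(choices):
    """Validate a voicebank's chosen targets against the allowed options."""
    for placeholder, target in choices.items():
        allowed = PLACEHOLDER_OPTIONS.get(placeholder)
        if allowed is None:
            raise KeyError(f"unknown placeholder phoneme: {placeholder}")
        if target not in allowed:
            raise ValueError(f"{placeholder} cannot be changed into {target}")
    return dict(choices)


def replace_placeholders(phonemes, replacements):
    """Swap each placeholder for its chosen target; leave other phonemes alone."""
    return [replacements.get(p, p) for p in phonemes]


# Hypothetical choices for an imaginary voicebank (illustrative only).
example = build_replacements({"s9": "sh", "r9": "x"})
converted = replace_placeholders(["m", "a", "s9"], example)
```

Validating the choices up front keeps a typo in a voicebank's configuration from silently producing a phoneme the model does not know.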
Technical Features
Dictionary Entries Total:
7517026766154701646716467
Vowels:
Consonants
Drawbacks
RNN-T models do not handle complex languages well, so out-of-vocabulary (OOV) entries can give wrong results. If that happens, try spelling the word the way it is spoken rather than the way it is written.