Add Automatic Text Segmentation for Long Input Handling #3

LiangLuDev · 2025-10-31T06:47:43Z

Error info:
Failed to synthesize speech: Size (273) of dimension (1) is not in allowed range (2..240)

Problem

The TTS models have a strict input length constraint of 2-240 tokens. When input text exceeded this limit, synthesis would fail with: Size (273) of dimension (1) is not in allowed range (2..240)

Solution

Implemented a multi-level text segmentation strategy that automatically handles long input text:

Three-tier fallback mechanism:

1. Sentence-level segmentation - Split by sentence boundaries (.!?。!?)
2. Comma-level segmentation - For long sentences, split by commas (,，、)
3. Word-level segmentation - For continuous text without punctuation, split by whitespace

Each level validates token count and only proceeds to the next level if needed. Segments are synthesized independently and concatenated with 300ms natural pauses between them.
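The fallback order above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code from this PR: `count_tokens` is a hypothetical callable standing in for the model's tokenizer, `InputTooLongError` loosely mirrors the `inputTooLong` error case, and the greedy `_pack` step is one possible reading of "smart merging". The lower bound of 2 tokens is not handled here.

```python
import re
from typing import Callable, List

MAX_TOKENS = 240  # upper bound taken from the error message in this PR
SENTENCE_END = ".!?。！？"
COMMAS = ",，、"


class InputTooLongError(ValueError):
    """Raised when a single unsplittable word exceeds the token limit."""


def _split_after(text: str, delimiters: str) -> List[str]:
    """Split after any run of delimiter characters, keeping them attached."""
    pattern = "[^{0}]+[{0}]*".format(re.escape(delimiters))
    return [p.strip() for p in re.findall(pattern, text) if p.strip()]


def _pack(pieces: List[str], count_tokens: Callable[[str], int],
          max_tokens: int) -> List[str]:
    """Greedily merge adjacent pieces while the merged text still fits."""
    packed: List[str] = []
    for piece in pieces:
        # Joining with a space is an assumption that suits spaced scripts.
        candidate = packed[-1] + " " + piece if packed else piece
        if packed and count_tokens(candidate) <= max_tokens:
            packed[-1] = candidate
        else:
            packed.append(piece)
    return packed


def segment_text(text: str, count_tokens: Callable[[str], int],
                 max_tokens: int = MAX_TOKENS) -> List[str]:
    """Split text into segments of at most max_tokens tokens each."""
    text = text.strip()
    if not text:
        return []
    if count_tokens(text) <= max_tokens:
        return [text]  # short input passes through unchanged

    segments: List[str] = []
    for sentence in _split_after(text, SENTENCE_END):      # tier 1
        if count_tokens(sentence) <= max_tokens:
            segments.append(sentence)
            continue
        for clause in _split_after(sentence, COMMAS):      # tier 2
            if count_tokens(clause) <= max_tokens:
                segments.append(clause)
                continue
            for word in clause.split():                    # tier 3
                if count_tokens(word) > max_tokens:
                    raise InputTooLongError(
                        "a single word exceeds %d tokens" % max_tokens)
                segments.append(word)
    return _pack(segments, count_tokens, max_tokens)
```

Each tier is entered only for pieces that still exceed the limit, so well-punctuated text is never split below sentence level. Packing runs once at the end so that word-level fragments are merged back into the longest segments that fit.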

Key Features

Automatic handling of texts exceeding the 240-token limit
Smart merging to maximize segment length within constraints
Natural sentence pauses (300ms) between concatenated segments
Maintains backward compatibility - short texts are processed unchanged
Clear error messages for edge cases (e.g., single word >240 tokens)

Changes

Added inputTooLong error case with descriptive message
Added segmentText() for sentence boundary detection
Added segmentByCommas() for comma-based splitting
Added segmentByWords() as final fallback for continuous text
Added concatenateBuffers() for seamless audio merging
Refactored generate() to handle multi-segment synthesis
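For illustration, the multi-segment synthesis step could look roughly like the Python sketch below. It is not the code from this PR: the function names only echo `concatenateBuffers()` and `generate()`, audio is modeled as plain lists of float samples, and both `synthesize` and `sample_rate` are hypothetical parameters supplied by the caller.

```python
from typing import Callable, List, Sequence

PAUSE_MS = 300  # pause length stated in this PR


def concatenate_buffers(buffers: Sequence[Sequence[float]],
                        sample_rate: int,
                        pause_ms: int = PAUSE_MS) -> List[float]:
    """Join audio buffers, inserting silence between consecutive ones."""
    silence = [0.0] * (sample_rate * pause_ms // 1000)
    merged: List[float] = []
    for index, buffer in enumerate(buffers):
        if index > 0:
            merged.extend(silence)  # pause only between segments
        merged.extend(buffer)
    return merged


def generate(segments: Sequence[str],
             synthesize: Callable[[str], Sequence[float]],
             sample_rate: int) -> List[float]:
    """Synthesize each segment independently, then merge with pauses."""
    return concatenate_buffers([synthesize(s) for s in segments], sample_rate)
```

With a single segment no silence is added, which matches the stated goal that short texts are processed unchanged.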

Commit 7d453a1: automatically split it into segments based on punctuation
