Add SGLang backend for Orpheus TTS with streaming SSE and SNAC-safe token flow (#SGlang support) #270
Summary: Introduces an optional SGLang server backend to `OrpheusModel` that streams cumulative text via OpenAI-compatible Completions SSE, preserving Orpheus’s SNAC token parsing and real-time audio pipeline.

Implementation:
- Adds `backend='sglang_server'` with `sglang_base_url`, `sglang_model`, and optional `sglang_api_key` / headers.
- Uses `/v1/completions` (no chat template) and streams cumulative text to keep the decoder’s last-`<custom_token_####>` extraction stable.
- Converts `stop_token_ids` to tokenizer-decoded strings for accurate stop behavior on SGLang.

Fixes/Hardening:
- Fixes the `_map_model_params` key lookup.
- `validate_voice` now checks `available_voices`; added `"tara"` since it’s used as default and in examples.
- Adds `requests` to `install_requires`.

Why SGLang:
Usage:
python -m sglang.launch_server --model-path canopylabs/orpheus-tts-0.1-finetune-prod --host 0.0.0.0 --port 30000 --mem-fraction-static 0.8 --stream-interval 1
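For reviewers who want to poke at the stream by hand, here is a minimal sketch of a client for the server launched above. It is not the code in this PR: `iter_cumulative_text` and `stream_completion` are hypothetical names, the `max_tokens` value is arbitrary, and the event shape (`data: {...}` lines carrying `choices[0].text`, terminated by `data: [DONE]`) is assumed from the OpenAI Completions streaming format that the endpoint is described as compatible with.

```python
import json
from typing import Iterable, Iterator, Union


def iter_cumulative_text(lines: Iterable[Union[bytes, str]]) -> Iterator[str]:
    """Yield the cumulative completion text after each SSE data event."""
    text = ""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed end-of-stream sentinel
        choices = json.loads(payload).get("choices") or []
        delta = choices[0].get("text") if choices else None
        if delta:
            text += delta
            yield text  # cumulative text, not just the latest delta


def stream_completion(base_url: str, model: str, prompt: str) -> Iterator[str]:
    """Hypothetical helper: POST to /v1/completions and stream cumulative text."""
    import requests  # imported lazily so the parser above has no dependencies

    with requests.post(
        f"{base_url.rstrip('/')}/v1/completions",
        json={"model": model, "prompt": prompt, "stream": True, "max_tokens": 1200},
        stream=True,
        timeout=60,
    ) as resp:
        resp.raise_for_status()
        yield from iter_cumulative_text(resp.iter_lines())
```

With the launch command above, something like `stream_completion("http://localhost:30000", "canopylabs/orpheus-tts-0.1-finetune-prod", prompt)` should yield growing strings of custom tokens, assuming the server registers the model under its `--model-path` name.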
No API breaks; default remains vLLM.
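
To illustrate why the Implementation notes stream cumulative text instead of raw deltas: a `<custom_token_####>` marker can be split across two SSE chunks, so parsing each delta in isolation could miss it. The sketch below is an illustration under that assumption, not the decoder shipped in the repository; `TOKEN_RE`, `last_custom_token`, and `tokens_from_deltas` are hypothetical names, and the real decoder may extract and offset token numbers differently.

```python
import re
from typing import Iterable, Iterator, Optional

# Matches only complete markers; a partial "<custom_to" never matches.
TOKEN_RE = re.compile(r"<custom_token_(\d+)>")


def last_custom_token(text: str) -> Optional[int]:
    """Return the number of the last complete custom token in text, if any."""
    matches = TOKEN_RE.findall(text)
    return int(matches[-1]) if matches else None


def tokens_from_deltas(deltas: Iterable[str]) -> Iterator[int]:
    """Accumulate deltas and yield each newly completed token number once."""
    text = ""
    seen = 0
    for delta in deltas:
        text += delta
        numbers = TOKEN_RE.findall(text)  # rescans the whole text; fine for a sketch
        for number in numbers[seen:]:
            yield int(number)
        seen = len(numbers)
```

Feeding it `["<custom_to", "ken_4106>"]` yields a single token once the second chunk arrives, which is the stability property the cumulative stream is meant to provide.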