TTS with coqui: Examples missing and Error 404: sendfile: file /tmp/generated/audio/piper.wav not found #1549

@dionysius

Description

LocalAI version:

REPOSITORY                   TAG                           IMAGE ID       CREATED        SIZE
quay.io/go-skynet/local-ai   master-cublas-cuda12-ffmpeg   649611dc96ae   21 hours ago   54.5GB

Environment, CPU architecture, OS, and Version:

Environment: Docker Desktop 4.26.1 (131620)
Operating System: Windows 10 Pro 64-bit (10.0, Build 19045) (19041.vb_release.191206-1406)
Processor: AMD Ryzen 5 7600X 6-Core Processor              (12 CPUs), ~4.7GHz
Memory: 32768MB RAM
Card name: NVIDIA GeForce RTX 3060 Ti
Dedicated Memory: 8038 MB
Driver Version: 31.0.15.4584

Describe the bug
No audio is returned when requesting the /tts endpoint. The request fails with a 404, the server apparently being unable to send the file (which was probably never generated).

To Reproduce

curl http://localhost:8080/tts -H "Content-Type: application/json" -d '{
  "backend": "coqui",
  "model": "tts_models/en/vctk/vits",
  "input": "Hello, world!"
}'
{"error":{"code":404,"message":"sendfile: file /tmp/generated/audio/piper.wav not found","type":""}}

Expected behavior
The generated audio is returned in the response.

Logs

2:43PM INF Loading model 'tts_models/en/vctk/vits' with backend coqui
2:43PM DBG Loading model in memory from file: /models/tts_models/en/vctk/vits
2:43PM DBG Loading Model tts_models/en/vctk/vits with gRPC (file: /models/tts_models/en/vctk/vits) (backend: coqui): {backendString:coqui model:tts_models/en/vctk/vits threads:0 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc0002c70e0 externalBackends:map[autogptq:/build/backend/python/autogptq/run.sh bark:/build/backend/python/bark/run.sh coqui:/build/backend/python/coqui/run.sh diffusers:/build/backend/python/diffusers/run.sh exllama:/build/backend/python/exllama/run.sh exllama2:/build/backend/python/exllama2/run.sh huggingface-embeddings:/build/backend/python/sentencetransformers/run.sh petals:/build/backend/python/petals/run.sh sentencetransformers:/build/backend/python/sentencetransformers/run.sh transformers:/build/backend/python/transformers/run.sh transformers-musicgen:/build/backend/python/transformers-musicgen/run.sh vall-e-x:/build/backend/python/vall-e-x/run.sh vllm:/build/backend/python/vllm/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
2:43PM DBG Loading external backend: /build/backend/python/coqui/run.sh
2:43PM DBG Loading GRPC Process: /build/backend/python/coqui/run.sh
2:43PM DBG GRPC Service for tts_models/en/vctk/vits will be running at: '127.0.0.1:41873'
2:43PM DBG GRPC Service state dir: /tmp/go-processmanager296696034
2:43PM DBG GRPC Service Started
rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:41873: connect: connection refused"
rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:41873: connect: connection refused"
2:43PM DBG GRPC(tts_models/en/vctk/vits-127.0.0.1:41873): stderr Server started. Listening on: 127.0.0.1:41873
2:43PM DBG GRPC Service Ready
2:43PM DBG GRPC: Loading model with options: {state:{NoUnkeyedLiterals:{} DoNotCompare:[] DoNotCopy:[] atomicMessageInfo:<nil>} sizeCache:0 unknownFields:[] Model:tts_models/en/vctk/vits ContextSize:0 Seed:0 NBatch:0 F16Memory:false MLock:false MMap:false VocabOnly:false LowVRAM:false Embeddings:false NUMA:false NGPULayers:0 MainGPU: TensorSplit: Threads:0 LibrarySearchPath: RopeFreqBase:0 RopeFreqScale:0 RMSNormEps:0 NGQA:0 ModelFile:/models/tts_models/en/vctk/vits Device: UseTriton:false ModelBaseName: UseFastTokenizer:false PipelineType: SchedulerType: CUDA:false CFGScale:0 IMG2IMG:false CLIPModel: CLIPSubfolder: CLIPSkip:0 ControlNet: Tokenizer: LoraBase: LoraAdapter: LoraScale:0 NoMulMatQ:false DraftModel: AudioPath: Quantization: MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0}
2:43PM DBG GRPC(tts_models/en/vctk/vits-127.0.0.1:41873): stderr Preparing models, please wait
2:43PM DBG GRPC(tts_models/en/vctk/vits-127.0.0.1:41873): stdout <TTS.utils.manage.ModelManager object at 0x7f156cb86a50>
2:43PM DBG GRPC(tts_models/en/vctk/vits-127.0.0.1:41873): stdout  > Downloading model to /root/.local/share/tts/tts_models--en--vctk--vits
[172.17.0.1]:59762 404 - POST /tts

Subsequent requests yield identical results; the logs are shorter:

2:44PM INF Loading model 'tts_models/en/vctk/vits' with backend coqui
2:44PM DBG Model already loaded in memory: tts_models/en/vctk/vits
[172.17.0.1]:37434 404 - POST /tts

Additional context

  • I understand coqui is pretty fresh (and was just discontinued)
  • The models seem to be handled internally by coqui; I found the model string here
  • I also ran the same request a bit later, in case the model download was still in progress

Nothing is generated:

root@6fc11ff6b221:/build# ls -lah /tmp/generated/audio/
total 8.0K
drwxr-xr-x 2 root root 4.0K Jan  5 14:43 .
drwxr-xr-x 3 root root 4.0K Jan  5 14:43 ..
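The empty listing matches the 404: there is simply no `.wav` file for the server to send. A minimal sketch of that check as a script, for re-running while debugging (the directory path is taken from the error message; the helper name is my own, not part of LocalAI):

```python
import os

def find_generated_audio(audio_dir="/tmp/generated/audio"):
    """Return paths of non-empty .wav files in the output dir, if any."""
    if not os.path.isdir(audio_dir):
        return []
    return [
        os.path.join(audio_dir, name)
        for name in sorted(os.listdir(audio_dir))
        if name.endswith(".wav")
        and os.path.getsize(os.path.join(audio_dir, name)) > 0
    ]

# An empty list corresponds to the `sendfile: file ... not found` 404 above.
print(find_generated_audio())
```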

The model files do exist:

root@6fc11ff6b221:/build# ls -lah /root/.local/share/tts/tts_models--en--vctk--vits/
total 152M
drwxr-xr-x 2 root root 4.0K Jan  5 14:43 .
drwxr-xr-x 3 root root 4.0K Jan  5 14:43 ..
-rw-r--r-- 1 root root  12K Jan  5 14:43 config.json
-rw-r--r-- 1 root root 152M Jan  5 14:43 model_file.pth
-rw-r--r-- 1 root root 1.8K Jan  5 14:43 speaker_ids.json
