Transcribe

Transcribe speech to IPA using neurlang/ipa-whisper-small via the whisper.cpp inference server.

Usage

POST audio to /inference as multipart form data, with the audio in the form field named file:

curl -X POST http://127.0.0.1:8080/inference \
  -F "file=@input.wav" \
  -F "temperature=0.0" \
  -F "response_format=json"
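
The same request can be sketched in Python with only the standard library. This is a minimal illustration, not the project's client: the helper names build_multipart and transcribe are hypothetical, and the URL and form fields simply mirror the curl command above.

```python
import json
import urllib.request
import uuid


def build_multipart(fields, file_name, file_bytes):
    """Encode text fields plus one file part named "file" as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode()
        )
    # The audio goes in the form field "file", as in the curl example.
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
         "Content-Type: audio/wav\r\n\r\n").encode()
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, url="http://127.0.0.1:8080/inference"):
    """POST a WAV file to the server and return the decoded JSON response."""
    with open(path, "rb") as f:
        body, content_type = build_multipart(
            {"temperature": "0.0", "response_format": "json"}, "input.wav", f.read()
        )
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Calling transcribe("input.wav") requires the server to be running locally; build_multipart can be exercised on its own without a network connection.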

Use 16 kHz mono WAV for best results. Convert with:

ffmpeg -i input.mp3 -ar 16000 -ac 1 -c:a pcm_s16le input.wav
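
To confirm that a file already matches the format produced by the ffmpeg command above (16 kHz, one channel, 16-bit PCM), a small check with Python's standard wave module could look like this. The function name is hypothetical, and the check is a convenience sketch rather than part of this repo.

```python
import wave


def is_16k_mono_pcm16(source):
    """Return True if the WAV is 16 kHz, mono, with 16-bit samples.

    `source` may be a file path or a binary file-like object.
    wave.open raises wave.Error for files it cannot parse (e.g. non-PCM WAV).
    """
    with wave.open(source, "rb") as w:
        return (
            w.getframerate() == 16000
            and w.getnchannels() == 1
            and w.getsampwidth() == 2  # 2 bytes per sample = pcm_s16le
        )
```

If the check returns False, re-run the ffmpeg conversion before uploading.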

The response is JSON; the IPA transcription is in the segments (e.g. /ˌɪntərˈnæʃənəl/).
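
Pulling the IPA string out of a decoded response might look like the sketch below. The field names "segments" and "text" are assumptions about the server's JSON shape, not something this README specifies; check them against an actual response. The function falls back to a top-level "text" field if no segments are present.

```python
def extract_ipa(response):
    """Join the text of all segments; fall back to a top-level "text" field.

    Assumes `response` is the decoded JSON object (a dict). The keys
    "segments" and "text" are assumed, not confirmed by the README.
    """
    segments = response.get("segments") or []
    texts = [(seg.get("text") or "").strip() for seg in segments]
    texts = [t for t in texts if t]
    if texts:
        return " ".join(texts)
    return (response.get("text") or "").strip()
```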

Setup

1. Model (one-time)

Convert the Hugging Face IPA model to ggml, then quantize it. See scripts/README.md for details.

./scripts/run-convert-ipa-to-ggml.sh              # → ggml (~466 MB)
export WHISPER_CPP_BUILD="$(pwd)/whisper.cpp/build"
./scripts/run-convert-ipa-to-ggml.sh --quantize   # → models/ggml-ipa-whisper-small-q5_0.bin (~182 MB)

The repo uses Git LFS for models/*.bin and bin/whisper-server. Run git lfs install once, then git add models/ggml-ipa-whisper-small-q5_0.bin and commit.

2. Build and run server

git submodule update --init
./scripts/build-server.sh
./scripts/run-server.sh

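The server listens on http://0.0.0.0:8080 (all interfaces), so it is reachable locally at http://127.0.0.1:8080. ffmpeg is optional; install it only if you want to upload non-WAV audio.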

Deploy (Vercel)

Vercel’s builder has no cmake, so the whisper-server binary is prebuilt and committed via Git LFS (same approach as the model).

Prebuild binary (one-time or when updating whisper.cpp):
From repo root, run ./scripts/build-server-linux.sh (requires Docker). That produces a Linux x64 binary in bin/whisper-server. Then:
```
git lfs install   # if not already done
git add bin/whisper-server
git commit -m "Update prebuilt whisper-server (Linux)"
git push
```
Model: Tracked with Git LFS; Vercel runs git lfs pull in the install step.
Build: npm run build uses the prebuilt bin/whisper-server from the repo (no cmake on Vercel). api/inference.ts proxies requests to it. Send 16 kHz mono WAV, because ffmpeg is not included in the bundle.
Deploy: run npm i -g vercel, then vercel. Env: MODEL_PATH, BIN_DIR (optional).
Limits: 250 MB bundle, 1 GB memory (in vercel.json). Use a smaller model if needed.
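
The bundle limit is the reason the quantized model matters: the unquantized ggml file (~466 MB) exceeds 250 MB on its own, while the q5_0 file (~182 MB) leaves room for the server binary. A rough pre-deploy check might look like the sketch below. The helper names are hypothetical, sizes are treated as MiB (an assumption about how the limit is measured), and the binary size used in the usage note is a placeholder, not a measured value.

```python
import os

BUNDLE_LIMIT_MB = 250  # bundle limit stated in this README


def size_mb(path):
    """Size of a file on disk in MiB."""
    return os.path.getsize(path) / (1024 * 1024)


def fits_bundle(sizes_mb, limit_mb=BUNDLE_LIMIT_MB):
    """Return True if the given file sizes (in MB) fit within the limit."""
    return sum(sizes_mb) <= limit_mb
```

For example, fits_bundle([182, 10]) is True for the quantized model plus a hypothetical 10 MB binary, while fits_bundle([466]) is False. On a real checkout, pass size_mb("models/ggml-ipa-whisper-small-q5_0.bin") and size_mb("bin/whisper-server") instead of estimates.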
