pragmatrix
Owner

@pragmatrix pragmatrix commented Aug 1, 2025

Azure speech detection seems to be very sensitive. It picks up very faint speech, which causes it to also detect echoes from speakers, for example when speech detection is active and someone else speaks on the other end of the line.

This PR adds a speech gate that is loosely configured to dampen low-volume audio, noise, and whispering sounds. The algorithm was mostly written by Claude Sonnet 3.7, and the parameters were derived by iterating on two samples (normal speech and echo speech).
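
To illustrate the general idea of dampening low-volume audio, here is a minimal sketch of a frame-based RMS gate. It is not the code from this PR: the names `rms`, `GateParams`, and `gate_frame`, the parameter values, and the choice to attenuate (rather than mute) quiet frames are all assumptions made for illustration.

```rust
/// Root-mean-square level of a frame of samples in the range [-1.0, 1.0].
/// Returns 0.0 for an empty frame.
fn rms(frame: &[f32]) -> f32 {
    if frame.is_empty() {
        return 0.0;
    }
    let sum_sq: f32 = frame.iter().map(|s| s * s).sum();
    (sum_sq / frame.len() as f32).sqrt()
}

/// Hypothetical gate parameters (not taken from the PR).
struct GateParams {
    /// Frames whose RMS is below this level are treated as too quiet.
    threshold: f32,
    /// Gain applied to quiet frames; 0.0 would mute them entirely.
    floor_gain: f32,
}

/// Attenuates the frame in place if its RMS level is below the threshold.
/// Frames at or above the threshold pass through unchanged.
fn gate_frame(frame: &mut [f32], params: &GateParams) {
    if rms(frame) < params.threshold {
        for s in frame.iter_mut() {
            *s *= params.floor_gain;
        }
    }
}
```

With an assumed `threshold` of 0.05 and `floor_gain` of 0.1, a frame of constant 0.01 samples would be scaled down tenfold, while a frame of constant 0.5 samples would pass through untouched. A practical gate would likely also smooth the gain between frames to avoid audible clicks.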

@pragmatrix pragmatrix marked this pull request as ready for review August 1, 2025 14:50
@pragmatrix pragmatrix requested a review from Copilot August 1, 2025 14:50
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR implements a "speech gate" to address the oversensitivity of Azure speech detection by filtering out low-volume audio, noise, and echoes. The gate uses configurable parameters to dampen unwanted audio while preserving normal speech.

Key changes:

  • Adds a new speech gate module with an RMS-based filtering algorithm
  • Integrates the speech gate into the Azure transcription service
  • Includes a standalone test utility for validating audio processing

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

| File | Description |
|------|-------------|
| core/src/speech_gate.rs | Implements the main speech gate algorithm with multiple variants (hard/soft/RMS-based) |
| core/src/lib.rs | Exports the speech gate module |
| src/lib.rs | Exports the speech gate processor function |
| services/azure/src/transcribe.rs | Integrates speech gate into Azure transcription pipeline |
| filter-test/ | Adds a standalone CLI tool for testing speech gate on audio files |
| Cargo.toml | Adds fundsp dependency and filter-test workspace member |

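
The difference between a hard and a soft gate can be sketched as two gain curves. This is only an illustration of the general technique, not the contents of `core/src/speech_gate.rs`: the function names `hard_gate_gain` and `soft_gate_gain`, the linear ramp, and the `lower`/`upper` bounds are assumptions.

```rust
/// Hard gate: full gain at or above the threshold, silence below it.
fn hard_gate_gain(level: f32, threshold: f32) -> f32 {
    if level >= threshold {
        1.0
    } else {
        0.0
    }
}

/// Soft gate: gain ramps linearly from 0.0 at `lower` to 1.0 at `upper`.
/// Falls back to a hard gate at `upper` if the bounds do not form a range.
fn soft_gate_gain(level: f32, lower: f32, upper: f32) -> f32 {
    if upper <= lower {
        return hard_gate_gain(level, upper);
    }
    ((level - lower) / (upper - lower)).clamp(0.0, 1.0)
}
```

In an RMS-based variant, `level` would be the RMS of the current frame, and each sample in that frame would be multiplied by the returned gain. The soft curve avoids the abrupt on/off switching of the hard gate for levels near the threshold.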
Comments suppressed due to low confidence (1)

filter-test/Cargo.toml:4

  • The Rust edition "2024" does not exist. The latest stable edition is "2021". Change this to "2021".
edition = "2024"

@pragmatrix pragmatrix merged commit 93998af into master Aug 1, 2025
6 checks passed
@pragmatrix pragmatrix deleted the speech-gate branch August 1, 2025 15:00