devnull/WebRtcVad.NET

WebRtcVad.NET

Pure managed C# port of WebRTC's voice activity detection (VAD) algorithm.

No native dependencies - works on any platform that supports .NET 8+.


Installation

dotnet add package WebRtcVad.NET

Usage

using WebRtcVad.NET;

// Create a VAD instance
using var vad = new WebRtcVad
{
    SampleRate = SampleRate.Rate16kHz,
    FrameLength = FrameLength.Length20ms,
    OperatingMode = OperatingMode.Aggressive
};

// Check if audio frame contains speech
byte[] audioFrame = GetAudioFrame(); // 16-bit PCM, little-endian
bool hasSpeech = vad.HasSpeech(audioFrame);

// Or use Span<short> for better performance
// (separate variable name: redeclaring hasSpeech in the same scope would not compile)
Span<short> samples = GetAudioSamples();
bool hasSpeechInSamples = vad.HasSpeech(samples);
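
If the audio arrives as bytes but the Span<short> overload is preferred, the bytes can be reinterpreted as samples without copying. The following is a minimal sketch, not part of WebRtcVad.NET: `PcmView` and `AsSamples` are hypothetical names, and the cast assumes the host CPU is little-endian so that it matches the little-endian PCM layout.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper for illustration; not part of WebRtcVad.NET.
static class PcmView
{
    // Reinterprets 16-bit PCM bytes as samples in place (no copy).
    // Assumes a little-endian host; a trailing odd byte is ignored.
    public static Span<short> AsSamples(Span<byte> pcmBytes)
        => MemoryMarshal.Cast<byte, short>(pcmBytes);
}
```

The resulting span could then be passed to `vad.HasSpeech(...)` in place of the byte array.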

Configuration

Sample Rates

  • SampleRate.Rate8kHz
  • SampleRate.Rate16kHz
  • SampleRate.Rate32kHz
  • SampleRate.Rate48kHz

Frame Lengths

  • FrameLength.Length10ms
  • FrameLength.Length20ms
  • FrameLength.Length30ms
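
The sample rate and frame length together fix how much audio one frame holds. As a rough sketch of the arithmetic, assuming mono 16-bit PCM (the `FrameSize` helper below is hypothetical and not part of the library):

```csharp
using System;

// Hypothetical helper for illustration; not part of WebRtcVad.NET.
static class FrameSize
{
    // Samples in one mono frame: rate (Hz) * duration (ms) / 1000.
    public static int Samples(int sampleRateHz, int frameMs)
        => sampleRateHz * frameMs / 1000;

    // Bytes in one frame of 16-bit PCM (2 bytes per sample).
    public static int Bytes(int sampleRateHz, int frameMs)
        => Samples(sampleRateHz, frameMs) * sizeof(short);
}
```

For example, 16 kHz with 20 ms frames gives 320 samples, or 640 bytes, per frame.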

Operating Modes

Mode            Description
Quality         Least aggressive - best for clean audio
LowBitrate      Slightly more aggressive
Aggressive      For moderately noisy audio
VeryAggressive  Most aggressive - for very noisy audio

Higher aggressiveness yields fewer false positives (noise classified as speech) at the cost of more false negatives (missed speech).

Audio Format

This library accepts only raw 16-bit linear PCM audio. It does not parse WAV files or other container formats, so you need to extract the raw PCM data first.
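
One way to get at the raw PCM in a WAV file is to walk its RIFF chunks and return the payload of the `data` chunk. This is a minimal sketch under stated assumptions: `WavPcm` and `ExtractData` are hypothetical names, not part of the library, and the code does not inspect the `fmt ` chunk, so the caller still has to confirm that the audio is 16-bit PCM at a supported sample rate.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;

// Hypothetical helper for illustration; not part of WebRtcVad.NET.
static class WavPcm
{
    // Returns the payload of the first "data" chunk of a RIFF/WAVE buffer.
    public static ReadOnlySpan<byte> ExtractData(ReadOnlySpan<byte> wav)
    {
        if (wav.Length < 12 ||
            !wav.Slice(0, 4).SequenceEqual("RIFF"u8) ||
            !wav.Slice(8, 4).SequenceEqual("WAVE"u8))
        {
            throw new InvalidDataException("Not a RIFF/WAVE file.");
        }

        int pos = 12; // first chunk follows the 12-byte RIFF header
        while (pos + 8 <= wav.Length)
        {
            ReadOnlySpan<byte> id = wav.Slice(pos, 4);
            uint size = BinaryPrimitives.ReadUInt32LittleEndian(wav.Slice(pos + 4, 4));
            int start = pos + 8;

            if (id.SequenceEqual("data"u8))
            {
                // Clamp in case the header overstates the payload size.
                int length = (int)Math.Min(size, (uint)(wav.Length - start));
                return wav.Slice(start, length);
            }

            // Chunks are word-aligned: an odd size is followed by a pad byte.
            long next = (long)start + size + (size & 1);
            if (next > wav.Length) break;
            pos = (int)next;
        }

        throw new InvalidDataException("No data chunk found.");
    }
}
```

The returned span still has to be cut into frames of the configured length before it is handed to the VAD.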

Algorithm

This is a bit-exact port of the WebRTC VAD algorithm, which uses a Gaussian Mixture Model (GMM) to classify audio frames as speech or non-speech. The algorithm:

  1. Splits audio into 6 frequency bands
  2. Calculates energy features for each band
  3. Uses GMM to compute speech/noise probabilities
  4. Applies adaptive thresholds based on operating mode
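
To make steps 3 and 4 more concrete, here is a simplified floating-point sketch of a GMM-based speech/noise decision. It is an illustration only and not the library's code: the real algorithm is fixed-point and bit-exact, while the names (`GmmSketch`, `Gaussian`), the plain summation across bands, and any parameter values passed in are assumptions made for this example.

```csharp
using System;

// Illustrative sketch with hypothetical names; not the bit-exact fixed-point code.
static class GmmSketch
{
    public readonly record struct Gaussian(double Weight, double Mean, double StdDev);

    // Weighted density of one Gaussian component at x.
    static double Density(double x, Gaussian g)
    {
        double z = (x - g.Mean) / g.StdDev;
        return g.Weight * Math.Exp(-0.5 * z * z) / (g.StdDev * Math.Sqrt(2 * Math.PI));
    }

    // Density of a mixture: the sum of its weighted components.
    static double MixtureDensity(double x, Gaussian[] mixture)
    {
        double sum = 0;
        foreach (Gaussian g in mixture) sum += Density(x, g);
        return sum;
    }

    // log p(x | speech) - log p(x | noise) for a single band feature.
    public static double LogLikelihoodRatio(double x, Gaussian[] speech, Gaussian[] noise)
        => Math.Log(MixtureDensity(x, speech)) - Math.Log(MixtureDensity(x, noise));

    // Sums the per-band ratios and compares the total with a threshold.
    public static bool IsSpeech(double[] bandFeatures, Gaussian[][] speech,
                                Gaussian[][] noise, double threshold)
    {
        double total = 0;
        for (int band = 0; band < bandFeatures.Length; band++)
            total += LogLikelihoodRatio(bandFeatures[band], speech[band], noise[band]);
        return total > threshold;
    }
}
```

In this sketch, a more aggressive operating mode would correspond to a higher `threshold`, so more evidence is needed before a frame counts as speech.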

License

MIT License - see LICENSE

The algorithm is ported from WebRTC, which is licensed under the WebRTC License.
