Efficient neural speech synthesis
Open-Source Large Vocabulary Continuous Speech Recognition Engine
A lightweight yet powerful audio-to-MIDI converter with pitch bend detection
Deep Speaker: an End-to-End Neural Speaker Embedding System.
Large, modern dataset for speech recognition
Unofficial PyTorch implementation of Google AI's VoiceFilter system
State-of-the-art deep-learning-based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
A PyTorch-based Speech Toolkit
TensorFlow 2.x implementation of the DTLN real-time speech denoising model, with TFLite, ONNX, and real-time audio processing support.