HiFi-GAN
HiFi-GAN is a GAN-based neural vocoder
designed to generate high-fidelity
speech waveforms from mel spectrograms
with exceptional efficiency. It
introduces a g...
VoxCPM
VoxCPM is a tokenizer-free
text-to-speech system that models speech
in a continuous space, aiming for
extremely realistic, context-aware
synthesis and true-to-...
WaveRNN
WaveRNN is a PyTorch implementation of
DeepMind's WaveRNN vocoder, bundled with
a Tacotron-style TTS front end to form a
complete text-to-speech stack. As a vo...
Auto Synced Translated Dubs
Auto-Synced-Translated-Dubs is a
toolchain that automatically translates
and re-dubs videos using AI voices while
keeping the new speech aligned to the
origina...
Nimble
Use Nimble to express the expected
outcomes of Swift or Objective-C
expressions. Inspired by Cedar.
Apple's Xcode includes the XCTest
framework, which prov...
IMS Toucan
IMS-Toucan is a toolkit for training,
using, and teaching state-of-the-art
text-to-speech systems, built at the
Institute for Natural Language
Processing (IMS)...
Parallel WaveGAN
Parallel WaveGAN is an unofficial
PyTorch implementation of several
state-of-the-art non-autoregressive
neural vocoders, centered on Parallel
WaveGAN but also ...
Read Aloud
Read Aloud is a browser extension for
Chrome, Firefox, and
Chromium-based browsers that converts
webpage text to audio using
text-to-speech technology. I...
OpenSeq2Seq
OpenSeq2Seq is a TensorFlow-based
toolkit for efficient experimentation
with sequence-to-sequence models across
speech and NLP tasks. Its core goal is
to give ...