
WhisperSpeech download for Linux

Download the WhisperSpeech Linux app for free and run it online in Ubuntu, Fedora, or Debian

This is the Linux app named WhisperSpeech, whose latest release can be downloaded as WhisperSpeechsourcecode.tar.gz. It can be run online on OnWorks, a free hosting provider for workstations.

Download this app, WhisperSpeech, and run it online with OnWorks for free.

Follow these instructions in order to run this app:

- 1. Download this application to your PC.

- 2. Open our file manager at https://www.onworks.net/myfiles.php?username=XXXXX, using the username that you want.

- 3. Upload this application to that file manager.

- 4. Start the OnWorks Linux online, Windows online, or macOS online emulator from this website.

- 5. From the OnWorks Linux OS you have just started, go to our file manager at https://www.onworks.net/myfiles.php?username=XXXXX, using the username that you want.

- 6. Download the application, install it, and run it.
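Since the release is distributed as a gzip-compressed tar archive, the "install" part of step 6 will usually begin with unpacking it. The following is a minimal sketch using only the Python standard library; the function name extract_source and the destination directory are illustrative assumptions, not part of OnWorks or WhisperSpeech.

```python
import tarfile
from pathlib import Path
from typing import List


def extract_source(archive: str, dest: str) -> List[str]:
    """Unpack a .tar.gz archive into dest and return the member names.

    Hypothetical helper for illustration; it rejects any member whose
    path would resolve to a location outside the destination directory.
    """
    dest_path = Path(dest).resolve()
    dest_path.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        for member in tar.getmembers():
            target = (dest_path / member.name).resolve()
            if target != dest_path and dest_path not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        tar.extractall(dest_path)
        return tar.getnames()
```

For example, extract_source("WhisperSpeechsourcecode.tar.gz", "whisperspeech-src") would unpack the downloaded release into a whisperspeech-src directory, assuming the archive is in the current working directory.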


WhisperSpeech


DESCRIPTION

WhisperSpeech is an open-source text-to-speech system created by “inverting” OpenAI’s Whisper, reusing its strengths as a semantic audio model to generate speech instead of only transcribing it. The project aims to be for speech what Stable Diffusion is for images: powerful, hackable, and safe for commercial use, with code under Apache-2.0/MIT and models trained only on properly licensed data. Its architecture follows a token-based, multi-stage pipeline inspired by AudioLM and SPEAR-TTS: Whisper is used to produce semantic tokens, EnCodec compresses the waveform into acoustic tokens, and Vocos reconstructs high-fidelity audio from those tokens. The repository includes notebooks and scripts for inference, long-form synthesis, and finetuning, as well as pre-trained models and converted datasets hosted on Hugging Face. Performance optimizations like torch.compile, KV-caching, and architectural tweaks allow the main model to reach up to 12× real-time speed on a consumer RTX 4090.
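The multi-stage, token-based design described above can be pictured as three functions chained together: text to semantic tokens, semantic tokens to acoustic tokens, and acoustic tokens to a waveform. The sketch below shows only that data flow. Every name in it (ToyPipeline and the toy_* functions) is a hypothetical stand-in, and the toy stages are trivial arithmetic, not the real neural models or the WhisperSpeech API.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stage signatures; in the real system each stage is a neural model.
TextToSemantic = Callable[[str], List[int]]
SemanticToAcoustic = Callable[[List[int]], List[int]]
Vocoder = Callable[[List[int]], List[float]]


@dataclass
class ToyPipeline:
    text_to_semantic: TextToSemantic
    semantic_to_acoustic: SemanticToAcoustic
    vocoder: Vocoder

    def generate(self, text: str) -> List[float]:
        semantic = self.text_to_semantic(text)          # Whisper-style semantic tokens
        acoustic = self.semantic_to_acoustic(semantic)  # EnCodec-style acoustic tokens
        return self.vocoder(acoustic)                   # Vocos-style waveform samples


def toy_t2s(text: str) -> List[int]:
    # Toy rule: one "semantic token" per character.
    return [ord(ch) % 512 for ch in text]


def toy_s2a(semantic: List[int]) -> List[int]:
    # Toy rule: two "acoustic tokens" per semantic token.
    return [tok for s in semantic for tok in (s, s)]


def toy_vocoder(acoustic: List[int]) -> List[float]:
    # Toy rule: map each token to a sample value in [0, 1).
    return [tok / 512.0 for tok in acoustic]


pipeline = ToyPipeline(toy_t2s, toy_s2a, toy_vocoder)
samples = pipeline.generate("hi")
```

Keeping the stages as separate callables mirrors the modularity the description implies: each stage can, in principle, be swapped or retrained without touching the others.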



Features

  • Text-to-speech system built by inverting Whisper into a semantic token generator
  • Three-stage pipeline using Whisper (semantic), EnCodec (acoustic tokens), and Vocos (vocoder)
  • Open-source code under Apache-2.0/MIT with models trained on properly licensed datasets
  • High-performance inference with optimizations like torch.compile and KV-caching for 10×+ real-time speed on GPUs
  • Support for voice cloning, multilingual experiments, and code-switching within a single utterance
  • Notebooks and scripts for long-form generation, finetuning, and community-driven benchmarking
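To make the "real-time speed" figures concrete: a speed-up of N× real time means that N seconds of audio are produced per second of compute. The helper names below are illustrative assumptions, and the numbers are plain arithmetic rather than measured benchmarks.

```python
def real_time_factor(audio_seconds: float, wall_clock_seconds: float) -> float:
    """Return how many seconds of audio are generated per second of compute."""
    if wall_clock_seconds <= 0:
        raise ValueError("wall_clock_seconds must be positive")
    return audio_seconds / wall_clock_seconds


def generation_time(audio_seconds: float, speedup: float) -> float:
    """Return the compute time needed for a clip at a given real-time speed-up."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_seconds / speedup


# At 12x real time, one minute of speech takes about five seconds to generate.
one_minute_at_12x = generation_time(60.0, 12.0)
```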


Programming Language

Python


Categories

Text to Speech

This application can also be fetched from https://sourceforge.net/projects/whisperspeech.mirror/. It is hosted on OnWorks so that it can be run online in the easiest way from one of our free operating systems.

