Open Speech Corpora is a Linux app whose latest release can be downloaded as open-speech-corporasourcecode.zip. It can be run online through OnWorks, a free hosting provider for workstations.
Download this app and run it online with OnWorks for free.
Follow these instructions to run this app:
- 1. Download this application to your PC.
- 2. Open our file manager at https://www.onworks.net/myfiles.php?username=XXXXX, using the username that you want.
- 3. Upload this application to that file manager.
- 4. Start the OnWorks Linux online, Windows online, or macOS online emulator from this website.
- 5. From the OnWorks Linux OS you have just started, go to our file manager at https://www.onworks.net/myfiles.php?username=XXXXX, using the username that you want.
- 6. Download the application, install it and run it.
SCREENSHOTS
Open Speech Corpora
DESCRIPTION
Open Speech Corpora is a curated catalog of speech datasets intended to support research and development in automatic speech recognition, text-to-speech, and other speech technologies. The repository is organized as a set of tables that list corpora along with their languages, total hours, number of speakers, download links, and licenses, giving practitioners a quick way to find data that matches their needs. It emphasizes free and truly “open” datasets, favoring those released under Creative Commons or community-friendly data licenses, though it also lists corpora that are accessible for research and many commercial uses. The catalog covers well-known resources such as Mozilla Common Voice, Yesno, LJ Speech and numerous Nordic and parliamentary speech corpora, along with their license variants like CC-0 and CC-BY. It is actively maintained as a community resource: users are encouraged to propose new corpora via issues, and there is a backlog of datasets waiting to be integrated.
Features
- Centralized catalog of speech corpora for ASR, TTS and related tasks
- Detailed metadata including language, duration, speakers, download links and licenses
- Emphasis on free and open datasets suitable for research and many commercial uses
- Coverage of popular corpora like Common Voice, LJ Speech and multiple Nordic resources
- Community-driven updates via issues and pull requests to keep the list evolving
- License-based grouping (CC-0, CC-BY and more) to simplify compliance and dataset selection
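As a rough illustration of how the metadata and license grouping described above might be used when picking a dataset, here is a minimal Python sketch. It assumes the catalog tables have already been copied into a list of dictionaries. The rows, field names, and the helpers `group_by_license` and `select` are hypothetical placeholders for illustration, not actual catalog entries or part of the project.

```python
from collections import defaultdict

# Hypothetical catalog rows: names and numbers are placeholders,
# not real entries from the Open Speech Corpora tables.
CATALOG = [
    {"name": "Example Corpus A", "language": "English",
     "hours": 24.0, "speakers": 1, "license": "CC-0"},
    {"name": "Example Corpus B", "language": "Swedish",
     "hours": 120.5, "speakers": 300, "license": "CC-BY"},
    {"name": "Example Corpus C", "language": "English",
     "hours": 8.0, "speakers": 12, "license": "CC-BY"},
]


def group_by_license(rows):
    """Map each license string to the names of the corpora released under it."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["license"]].append(row["name"])
    return dict(groups)


def select(rows, language, allowed_licenses, min_hours=0.0):
    """Return names of corpora matching a language, a license set and a minimum duration."""
    return [
        row["name"]
        for row in rows
        if row["language"] == language
        and row["license"] in allowed_licenses
        and row["hours"] >= min_hours
    ]
```

Calling `select(CATALOG, "English", {"CC-0", "CC-BY"}, min_hours=10)` on the placeholder rows keeps only the English corpora with at least ten hours under an accepted license. The license terms of each real corpus should still be checked at its source before use.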
This application can also be fetched from https://sourceforge.net/projects/open-speech-corpora.mirror/. It is hosted in OnWorks so that it can be run online in the easiest way from one of our free operating systems.
