A deep learning toolkit for text-to-speech generation, with pretrained models in over 1,100 languages and tools for training.
A transformer-based text-to-audio model that generates realistic multilingual speech, music, and sound effects.
A multi-voice text-to-speech system that produces highly realistic prosody and intonation using autoregressive and diffusion decoders.
An end-to-end speech processing toolkit for speech recognition, text-to-speech, translation, enhancement, and more.
A free, open-source singing synthesis editor designed as a modern successor to UTAU for creating vocal tracks.
A Python library and CLI tool that interfaces with Google Translate's text-to-speech API to generate MP3 audio from text.
An open-source ChatGPT app with realistic voice capabilities using ElevenLabs text-to-speech.
A flow-based generative network for fast, high-quality speech synthesis from mel-spectrograms.
A Python library and CLI tool for converting text to phonetic transcriptions (phones) across multiple languages using various backends.