A deep learning toolkit for text-to-speech generation, with pretrained models in over 1,100 languages and tools for training.
Fast automatic speech recognition with accurate word-level timestamps and speaker diarization, built on OpenAI's Whisper.
A comprehensive open-source toolkit for speech recognition research and development.
A tiny JavaScript library for adding speech recognition and voice commands to websites.
A pipeline that combines OpenAI Whisper for speech-to-text with speaker diarization to identify who said what in audio.
An audio library for PyTorch providing data manipulation, transformations, and dataset loaders for machine learning applications.
A Python library that extends OpenAI's Whisper to provide accurate word-level timestamps and confidence scores for multilingual speech recognition.
A Python library and CLI tool that interfaces with Google Translate's text-to-speech API to generate MP3 audio from text.
A Node.js library for adding voice interfaces with offline hotword detection and cloud speech recognition.
An Arduino library for text-to-speech synthesis using PWM or DAC outputs with an external amplifier.
A React Native bridge for integrating the Google Dialogflow (API.AI) SDK to build conversational interfaces in mobile apps.
A Python library for easy access, management, and processing of audio datasets, particularly for machine learning tasks.