A high-performance C/C++ port of OpenAI's Whisper model for efficient, cross-platform speech recognition.
A next-generation Kaldi-based toolkit for offline speech-to-text, text-to-speech, and audio processing across 12 languages and diverse hardware.
An open-source Python toolkit for speaker diarization with state-of-the-art pretrained models and pipelines.
A pipeline that combines OpenAI Whisper for speech-to-text with speaker diarization to identify who said what in audio.
A Python library that extends OpenAI's Whisper to provide accurate word-level timestamps and confidence scores for multilingual speech recognition.
A Swift SDK for fully local, low-latency audio AI on Apple devices, including transcription, text-to-speech, voice activity detection, and speaker diarization.