Fast automatic speech recognition with accurate word-level timestamps and speaker diarization, built on OpenAI's Whisper.
A next-generation Kaldi-based toolkit for offline speech-to-text, text-to-speech, and audio processing across 12 languages and diverse hardware.
An open-source Python toolkit for speaker diarization with state-of-the-art pretrained models and pipelines.
A Python library for audio feature extraction, classification, segmentation, and machine learning applications.
A pipeline that combines OpenAI Whisper for speech-to-text with speaker diarization to identify who said what in audio.
A Swift SDK for fully local, low-latency audio AI on Apple devices, including transcription, text-to-speech, voice activity detection, and speaker diarization.