High-performance C/C++ port of OpenAI's Whisper for efficient, cross-platform speech recognition.
A high-performance C/C++ port of OpenAI's Whisper model for efficient, cross-platform speech recognition.
A fast, memory-efficient reimplementation of OpenAI's Whisper speech-to-text model using CTranslate2.
Fast automatic speech recognition with accurate word-level timestamps and speaker diarization, built on OpenAI's Whisper.
An offline desktop application for transcribing and translating audio/video files, live recordings, and YouTube links using OpenAI's Whisper.
An open-source AI memory tool that captures your screen and audio locally, enabling search and automation agents based on your computer activity.
Open-source AI platform for building private agents, assistants, and enterprise search with document analysis and multi-model support.
A comprehensive open-source toolkit for speech recognition research and development.
Offline speech recognition toolkit supporting 20+ languages with small models and streaming API.
A next-generation Kaldi-based toolkit for offline speech-to-text, text-to-speech, and audio processing across 12 languages and diverse hardware.
SDKs for adding private, on-device AI features like LLM chat, speech-to-text, and text-to-speech to mobile and web apps.
A tiny JavaScript library for adding speech recognition and voice commands to websites.
A hackable open-source voice assistant platform for building and running custom voice-controlled applications.
Facebook AI Research's automatic speech recognition toolkit for end-to-end ASR with modern neural architectures.
A high-performance automatic speech recognition toolkit from Facebook AI Research, built with fully convolutional neural networks.
Offline audio/video transcription desktop app using OpenAI Whisper with privacy-focused local processing.
A pipeline that combines OpenAI Whisper for speech-to-text with speaker diarization to identify who said what in audio.
A JAX implementation of OpenAI's Whisper model offering up to 70x faster transcription on TPUs.
A Ruby client library for the OpenAI API, supporting GPT-5, Realtime WebRTC, and all major endpoints.
Standalone executables of OpenAI's Whisper and Faster-Whisper for speech-to-text transcription without Python dependencies.
A Python library that extends OpenAI's Whisper to provide accurate word-level timestamps and confidence scores for multilingual speech recognition.
A curated list of resources, tools, and applications for OpenAI's Whisper speech recognition model.
A curated list of resources, tools, and applications for OpenAI's Whisper speech recognition system.
A web service providing a GUI and API with queuing for OpenAI Whisper transcription and translation.
A robust yet lenient forced aligner built on Kaldi for aligning speech audio with text transcripts.
A versatile tool for generating, translating, and syncing subtitles from audio/video using Whisper and other AI models via Web UI, CLI, or Python.
A free web application that transcribes and translates audio files using OpenAI's Whisper and Chat APIs.
A Linux desktop app for offline note-taking, reading, and translation using speech-to-text, text-to-speech, and machine translation.
Node.js client library for accessing IBM Watson AI services like Assistant, Speech-to-Text, and Natural Language Understanding.
A Python client library for interacting with IBM Watson AI services, available via pip as ibm-watson.
A Python tool that uses OpenAI's Whisper to automatically generate subtitle files for YouTube videos.
A macOS speech-to-text app offering on-device AI transcription, system-wide dictation, and AI text processing with full privacy.
A Neovim plugin for AI-powered chat sessions, text/code operations, speech-to-text, and image generation using multiple LLM providers.
A faster, memory-efficient command-line client for OpenAI's Whisper speech recognition, powered by CTranslate2.
Natural two-way voice conversations with Claude Code and other MCP agents, for hands-free coding assistance.
Node.js sample applications demonstrating IBM Watson Speech to Text service features for converting speech to text.