Showing 32 of 32 projects
A high-performance C/C++ port of OpenAI's Whisper model for efficient, cross-platform speech recognition.
High-performance C/C++ port of OpenAI's Whisper for efficient, cross-platform speech recognition.
A fast, memory-efficient reimplementation of OpenAI's Whisper speech-to-text model using CTranslate2.
Fast automatic speech recognition with accurate word-level timestamps and speaker diarization, built on OpenAI's Whisper.
An offline desktop application for transcribing and translating audio/video files, live recordings, and YouTube links using OpenAI's Whisper.
An offline desktop app for audio/video transcription using OpenAI's Whisper, with privacy-focused local processing.
A pipeline that combines OpenAI Whisper for speech-to-text with speaker diarization to identify who said what in audio.
A JAX implementation of OpenAI's Whisper model offering up to 70x faster transcription on TPUs.
A framework for building and deploying serverless decentralized applications on Ethereum, IPFS, and other blockchain platforms.
A Ruby client library for the OpenAI API, supporting GPT-5, Realtime WebRTC, and all major endpoints.
Standalone executables of OpenAI's Whisper and Faster-Whisper for speech-to-text transcription without Python dependencies.
A Python library that extends OpenAI's Whisper to provide accurate word-level timestamps and confidence scores for multilingual speech recognition.
An open-source ChatGPT app with realistic voice capabilities using ElevenLabs text-to-speech.
A Rust library for interacting with OpenAI's APIs with full async/await support and type-safe request/response handling.
A versatile tool for generating, translating, and syncing subtitles from audio/video using Whisper and other AI models via Web UI, CLI, or Python.
A Python tool that uses OpenAI's Whisper to automatically generate subtitle files for YouTube videos.
A macOS speech-to-text app offering on-device AI transcription, system-wide dictation, and AI text processing with full privacy.
A Neovim plugin for AI-powered chat sessions, text/code operations, speech-to-text, and image generation using multiple LLM providers.
A faster, memory-efficient command-line client for OpenAI's Whisper speech recognition, powered by CTranslate2.
Enables natural two-way voice conversations with Claude Code and other MCP agents, perfect for hands-free coding assistance.
A fast, accurate, and private native speech-to-text tool for Linux, offering system-wide dictation with local or cloud backends.
A React hook for OpenAI Whisper API with built-in speech recording, real-time transcription, and silence removal.
An Android Input Method Editor (IME) providing offline voice recognition and translation using the Whisper engine.
A video-language understanding framework that treats video narration as vocabulary and videos as long documents for efficient analysis.
A Google Colab notebook that transcribes YouTube videos using OpenAI's Whisper speech recognition model.
A command-line interface for blazingly fast audio transcription using optimized Whisper ASR models.
An open-source voice dictation tool that types your speech at the cursor in any application, powered by customizable AI transcription and formatting.
A Delphi wrapper for OpenAI, DeepSeek, Azure OpenAI, YandexGPT, Ollama, GigaChat, and Qwen APIs, enabling AI features in Delphi applications.
A fork of OpenAI's Whisper speech recognition models optimized with OpenVINO backend for faster CPU inference.
An open-source AI medical scribe that records patient encounters and generates structured clinical notes automatically.
A local, offline speech-to-text CLI tool that transcribes microphone input directly to your clipboard.
A browser extension that transcribes and summarizes in-browser conferences using ChatGPT and Whisper AI.