A fast, accurate, and private native speech-to-text tool for Linux, offering system-wide dictation with local or cloud backends.
A React hook for OpenAI Whisper API with built-in speech recording, real-time transcription, and silence removal.
A voice-based conversation interface for ChatGPT that allows users to speak and receive spoken responses.
A Node.js library for adding voice interfaces with offline hotword detection and cloud speech recognition.
A cross-platform desktop application that enables voice dictation in any text field using customizable keyboard shortcuts.
A customizable iOS overlay that handles voice permission and converts speech to text using native speech recognition.
An iOS app that uses ChatGPT to generate ARKit code from spoken prompts, placing and manipulating 3D objects in augmented reality.
A Google Colab notebook that transcribes YouTube videos using OpenAI's Whisper speech recognition model.
An open-source voice dictation tool that types your speech at the cursor in any application, powered by customizable AI transcription and formatting.
An Android overlay that handles voice permission and converts user speech to text with a customizable UI.
A JavaScript library for adding IBM Watson Speech to Text and Text to Speech capabilities to web applications.
An open-source desktop app that transcribes voice to polished text using AI and types it into any application.
A React Native bridge for integrating the Google Dialogflow (API.AI) SDK to build conversational interfaces in mobile apps.
An open-source AI medical scribe that records patient encounters and generates structured clinical notes automatically.
A high-performance real-time voice processing server in Rust providing unified STT/TTS services via WebSocket and REST APIs.
A Go client library for interacting with the Wit.ai natural language processing HTTP API.
A .NET Standard library for accessing IBM Watson cognitive services like Assistant, Discovery, and Speech-to-Text.
An application that uses IBM Watson AI services and Cloud Functions to analyze videos, extracting visual and audio insights for search and categorization.
A local, offline speech-to-text CLI tool that transcribes microphone input directly to your clipboard.
A shell wrapper for interacting with multiple AI service providers including OpenAI, LocalAI, Ollama, Gemini, and Anthropic via chat, text, and speech endpoints.
A browser extension that transcribes and summarizes in-browser conferences using ChatGPT and Whisper AI.
A collection of Node-RED nodes to integrate IBM Watson AI services like speech, language, and conversation into applications.