Speech To Text

79 projects


A Neovim plugin for AI-powered chat sessions, text/code operations, speech-to-text, and image generation using multiple LLM providers.

#ai-assistant #codeium #vim

Stars: 1.3k

Forks: 128

Last commit: 11 months ago

VoiceMode MCP (Python)

Enables natural two-way voice conversations with Claude Code and other MCP agents, perfect for hands-free coding assistance.

#tts #developer-tools #claudecode

Stars: 1.3k

Forks: 180

Last commit: 2 days ago

speech-to-text-nodejs (JavaScript)

Node.js sample applications demonstrating features of the IBM Watson Speech to Text service.

#sample-app #rest-api #websocket

Stars: 1.1k

Forks: 695

Last commit: 3 years ago

hyprwhspr (Python)

A fast, accurate, and private native speech-to-text tool for Linux, offering system-wide dictation with local or cloud backends.

#ai #wayland #archlinux

Stars: 1.1k

Forks: 84

Last commit: 1 day ago

use-whisper (TypeScript)

A React hook for the OpenAI Whisper API with built-in speech recording, real-time transcription, and silence removal.

#api #hook #openai

Stars: 785

Forks: 139

Last commit: 2 years ago

chatgpt-conversation (Python)

A voice-based conversation interface for ChatGPT that allows users to speak and receive spoken responses.

#conversational-ai #real-time-communication #text-to-speech

A Node.js library for adding voice interfaces with offline hotword detection and cloud speech recognition.

#voice-commands #stt #voice-control

Stars: 638

Forks: 76

Last commit: 2 years ago

Ito AI (TypeScript)

A cross-platform desktop application that enables voice dictation in any text field using customizable keyboard shortcuts.

#desktop-application #ai #productivity

Stars: 561

Forks: 112

Last commit: 6 months ago

Voice Overlay (Swift)

A customizable iOS overlay that handles voice permission and converts speech to text using native speech recognition.

#search #ios #input

Stars: 556

Forks: 58

Last commit: 25 days ago

ChatARKit (C)

An iOS app that uses ChatGPT to generate ARKit code from spoken prompts, placing and manipulating 3D objects in augmented reality.

#ios #arkit #natural-language-processing

Stars: 442

Forks: 35

Last commit: 3 years ago

YouTube Video Transcription (Jupyter Notebook)

A Google Colab notebook that transcribes YouTube videos using OpenAI's Whisper speech recognition model.

#youtube-transcription #transformer #google-colab

Stars: 421

Forks: 115

Last commit: 2 years ago

tambourine-voice (Rust)

An open-source voice dictation tool that types your speech at the cursor in any application, powered by customizable AI transcription and formatting.

#productivity #pipecat #desktop-app

An open-source desktop app that transcribes voice to polished text using AI and types it into any application.

#desktop-application #ai #deepgram

Stars: 354

Forks: 64

Last commit: 2 days ago

VoiceOverlay (Kotlin)

An Android overlay that handles voice permission and converts user speech to text with a customizable UI.

#conversational-ui #input #ui-overlay

Stars: 264

Forks: 36

Last commit: 4 years ago

speech-javascript-sdk (JavaScript)

A JavaScript library for adding IBM Watson Speech to Text and Text to Speech capabilities to web applications.

#browser-sdk #text-to-speech #watson-api

Stars: 263

Forks: 133

Last commit: 5 months ago

react-native-dialogflow (JavaScript)

A React Native bridge for integrating the Google Dialogflow (API.AI) SDK to build conversational interfaces in mobile apps.

#dialogflow #speech-to-function #speak

Stars: 205

Forks: 60

Last commit: 3 years ago

OpenScribe (TypeScript)

An open-source AI medical scribe that records patient encounters and generates structured clinical notes automatically.

#ai #medical-ai #clinical-informatics

Stars: 193

Forks: 44

Last commit: 2 months ago

Sayna (Rust)

A high-performance real-time voice processing server in Rust providing unified STT/TTS services via WebSocket and REST APIs.

#livekit-integration #rust-server #voice-processing

Stars: 183

Forks: 29

Last commit: 1 month ago

wit-go (Go)

A Go client library for interacting with the Wit.ai natural language processing HTTP API.

#intent-recognition #go-client #entity-extraction

Stars: 170

Forks: 38

Last commit: 10 months ago

dotnet-standard-sdk (C#)

A .NET Standard library for accessing IBM Watson cognitive services like Assistant, Discovery, and Speech-to-Text.

#hacktoberfest #cloud-ai #natural-language-processing

A local, offline speech-to-text CLI tool that transcribes microphone input directly to your clipboard.

#developer-tools #faster-whisper #cli-tool

Stars: 110

Forks: 14

Last commit: 1 month ago

openwhisk-darkvisionapp (JavaScript)

An application that uses IBM Watson AI services and Cloud Functions to analyze videos, extracting visual and audio insights for search and categorization.

#ibm-cloud #watson-ai #watson-visual-recognition

A shell wrapper for interacting with multiple AI service providers including OpenAI, LocalAI, Ollama, Gemini, and Anthropic via chat, text, and speech endpoints.

#tts #grok #localai

Stars: 92

Forks: 7

Last commit: 1 month ago