DeepSpeech is an open-source embedded (offline, on-device) speech-to-text engine that can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers.

#on-device #embedded #deep-learning

Stars: 26.8k

Forks: 4.1k

Last commit: 1 year ago

faster-whisper (Python)

A fast, memory-efficient reimplementation of OpenAI's Whisper speech-to-text model using CTranslate2.

#transformer #ai #python-library

Stars: 24.5k

Forks: 2.0k

Last commit: 8 months ago

WhisperX (Python)

Fast automatic speech recognition with accurate word-level timestamps and speaker diarization, built on OpenAI's Whisper.

#forced-alignment #vad #automatic-speech-recognition

Stars: 23.2k

Forks: 2.4k

Last commit: 11 days ago

Leon (TypeScript)

🧠 Leon is your open-source personal assistant.

#ai #speech-recognition #text-to-speech

Stars: 17.4k

Forks: 1.5k

Last commit: 2 days ago

Kaldi (Shell)

A comprehensive open-source toolkit for speech recognition research and development.

#cuda #research-toolkit #speaker-id

Stars: 15.4k

Forks: 5.4k

Last commit: 10 months ago

Vosk (Jupyter Notebook)

Offline speech recognition toolkit supporting 20+ languages, with small models and a streaming API.

#ios #embedded-systems #speech-to-text-android

Stars: 15.0k

Forks: 1.7k

Last commit: 21 days ago

RTranslator (C++)

An open-source Android app for real-time, offline voice translation between multiple languages using on-device AI models.

#bluetooth-le #open-source #machine-translation

An end-to-end speech processing toolkit for speech recognition, text-to-speech, translation, enhancement, and more.

#chainer #end-to-end #deep-learning

Stars: 9.9k

Forks: 2.4k

Last commit: 3 days ago

buster (JavaScript)

A browser extension that solves difficult CAPTCHAs by completing reCAPTCHA audio challenges using speech recognition.

#privacy-tools #browser-extension #recaptcha

Stars: 9.2k

Forks: 686

Last commit: 26 days ago

SpeechRecognition (Python)

Speech recognition module for Python, supporting several engines and APIs, online and offline.

#speech-recognition #python #speech-to-text

Stars: 9.0k

Forks: 2.4k

Last commit: 1 month ago

annyang (TypeScript)

A tiny JavaScript library for adding speech recognition and voice commands to websites.

#web-accessibility #voice-commands #hands-free

Stars: 6.8k

Forks: 1.0k

Last commit: 9 days ago

Wav2Letter++ (C++)

Facebook AI Research's automatic speech recognition toolkit for end-to-end ASR with modern neural architectures.

#asr-toolkit #end-to-end #neural-architectures

Stars: 6.4k

Forks: 989

Last commit: 10 days ago

wav2letter (C++)

A high-performance automatic speech recognition toolkit from Facebook AI Research, built with fully convolutional neural networks.

#end-to-end #deep-learning #automatic-speech-recognition

Stars: 6.4k

Forks: 989

Last commit: 10 days ago

VoiceInk (Swift)

A native macOS voice-to-text app that transcribes speech to text instantly with 100% offline processing.

#productivity-tool #ai-assistant #offline-transcription

Stars: 5.7k

Forks: 806

Last commit: 21 hours ago

whisper-diarization (Jupyter Notebook)

A pipeline that combines OpenAI Whisper for speech-to-text with speaker diarization to identify who said what in audio.

#nvidia-nemo #automatic-speech-recognition #asr

A highly accurate, lightweight, on-device wake word detection engine powered by deep learning.

#iot #embedded-systems #on-device

Stars: 4.9k

Forks: 578

Last commit: 2 days ago

Whisper JAX (Jupyter Notebook)

A JAX implementation of OpenAI's Whisper model offering up to 70x faster transcription on TPUs.

#parallel-computing #jax #deep-learning

Stars: 4.7k

Forks: 412

Last commit: 2 years ago

PocketSphinx (C)

A lightweight, open-source continuous speech recognition engine for embedded and offline applications.

#c-library #embedded-systems #python-library

Stars: 4.3k

Forks: 729

Last commit: 3 days ago

Warp-CTC (Cuda)

A fast parallel implementation of the Connectionist Temporal Classification (CTC) loss function for CPU and GPU.

#cuda #parallel-computing #torch-binding

Stars: 4.1k

Forks: 1.0k

Last commit: 2 years ago

deep-chat (TypeScript)

A fully customizable AI chat component for websites, connecting to any API or hosting models directly in the browser.

#chat #ai #openai

Stars: 3.7k

Forks: 446

Last commit: 21 hours ago

whisper-standalone-win

Standalone executables of OpenAI's Whisper and Faster-Whisper for speech-to-text transcription without Python dependencies.

#media-processing #faster-whisper #asr

Stars: 3.1k

Forks: 164

Last commit: 8 months ago

whisper-timestamped (Python)

A Python library that extends OpenAI's Whisper to provide accurate word-level timestamps and confidence scores for multilingual speech recognition.

#subtitle-generation #python-library #deep-learning

A proof-of-concept system that defeats Google's audio reCAPTCHA with 85% accuracy using speech recognition and browser automation.

#web-security #selenium #captcha-bypass

Stars: 2.8k

Forks: 327

Last commit: 8 years ago

rhasspy (Shell)

An open-source, fully offline voice assistant for many languages, designed for private home automation.

#multi-language #voice-commands #open-source

Stars: 2.7k

Forks: 209

Last commit: 1 year ago

FluidAudio (Swift)

A Swift SDK for fully local, low-latency audio AI on Apple devices, including transcription, text-to-speech, voice activity detection, and speaker diarization.

#ios #apple-neural-engine #speaker-embedding

Stars: 2.5k

Forks: 356

Last commit: 11 hours ago