A versatile tool for generating, translating, and syncing subtitles from audio/video using Whisper and other AI models via Web UI, CLI, or Python.
Subs AI is an open-source subtitle generation tool that uses AI models such as OpenAI's Whisper to transcribe audio and video files into timestamped text. It replaces manual subtitling by automating transcription, translation, and synchronization in a single package.
Subs AI is aimed at content creators, video editors, developers, and researchers who need to generate or process subtitles for videos, podcasts, or other media, especially those who prefer offline, self-hosted solutions.
Developers choose Subs AI for its support of multiple Whisper variants and backends, its offline-capable Web UI, and its broad feature set, including translation and sync tools, all without relying on proprietary services.
🎞️ Subtitles generation tool (Web-UI + CLI + Python package) powered by OpenAI's Whisper and its variants 🎞️
Supports multiple Whisper backends including faster-whisper for efficiency and WhisperX for speaker diarization, offering flexibility in accuracy and speed trade-offs as detailed in the Features list.
The integrated Web UI runs entirely offline without third-party services, ensuring data privacy and eliminating internet dependency, as emphasized in the Web UI description.
Beyond transcription, it includes translation using models such as Facebook NLLB and automatic synchronization with ffsubsync, providing an end-to-end workflow for subtitle generation and editing.
CLI and Python interfaces allow efficient processing of multiple files via text file inputs or scripts, ideal for large media libraries, as shown in the CLI usage examples.
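The batch workflow described above can be sketched in Python. This is a minimal sketch, not the project's own batch code: `read_media_list` and `batch_transcribe` are hypothetical helper names, and the one-path-per-line list format is an assumption. `make_subsai_transcriber` assumes the `SubsAI().create_model(...)` and `transcribe(...)` calls shown in the project's README, and that the returned subtitle object offers a pysubs2-style `to_string("srt")`.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def read_media_list(list_file: str) -> List[Path]:
    """Return one media path per non-empty line of a plain-text list file."""
    lines = Path(list_file).read_text(encoding="utf-8").splitlines()
    return [Path(line.strip()) for line in lines if line.strip()]


def batch_transcribe(media: Iterable[Path],
                     transcribe: Callable[[Path], str],
                     out_dir: str = "subs") -> List[Path]:
    """Call transcribe on each file and write <stem>.srt into out_dir."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for path in media:
        srt_text = transcribe(path)  # expected to return SRT-formatted text
        out_path = target / f"{path.stem}.srt"
        out_path.write_text(srt_text, encoding="utf-8")
        written.append(out_path)
    return written


def make_subsai_transcriber(model_name: str = "openai/whisper",
                            model_type: str = "base") -> Callable[[Path], str]:
    """Assumed wiring to Subs AI, following the README's Python example."""
    # Imported lazily so the helpers above work without subsai installed.
    from subsai import SubsAI

    subs_ai = SubsAI()
    model = subs_ai.create_model(model_name, {"model_type": model_type})
    # Assumption: transcribe() returns a pysubs2-style object with to_string().
    return lambda path: subs_ai.transcribe(str(path), model).to_string("srt")
```

With Subs AI installed, a run might look like `batch_transcribe(read_media_list("media.txt"), make_subsai_transcriber())`. Passing the transcriber in as a callable lets the model be loaded once and reused for every file in the list.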
Requires manual installation of ffmpeg, a specific Python version (3.10 or 3.11), and possibly GPU configuration; the Installation notes mention compatibility issues with Python 3.12+ and torch detection problems.
With multiple AI backends, dependency conflicts can arise, and users may need to edit requirements.txt for minimal installs, adding complexity and maintenance overhead as mentioned in the Python installation section.
Designed for offline batch processing, it lacks support for live audio streaming or low-latency transcription, which limits its use in applications that need immediate subtitle generation.
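The installation requirements listed above (Python 3.10 or 3.11, and ffmpeg available on the system) can be checked before installing. The following is a minimal sketch using only the standard library; `check_environment` is a hypothetical helper, not part of Subs AI, and the supported-version set simply mirrors the range stated above.

```python
import shutil
import sys
from typing import Callable, List, Optional, Tuple

# Versions taken from the range stated in the text above (an assumption
# about what the installer accepts, not read from the package itself).
SUPPORTED_PYTHON = {(3, 10), (3, 11)}


def check_environment(
    version: Optional[Tuple[int, int]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    major, minor = version or sys.version_info[:2]
    problems = []
    if (major, minor) not in SUPPORTED_PYTHON:
        problems.append(
            f"Python {major}.{minor} is outside the 3.10-3.11 range"
        )
    if which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

The `version` and `which` parameters exist only so the check can be exercised without changing the interpreter or the PATH; called with no arguments, it inspects the running environment.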