Facebook AI Research's automatic speech recognition toolkit for end-to-end ASR with modern neural architectures.
wav2letter++ is Facebook AI Research's automatic speech recognition (ASR) toolkit for end-to-end deep learning models that convert speech to text. It provides a framework for training and deploying ASR systems built on convolutional architectures such as time-depth separable (TDS) convolutions. The toolkit targets efficient, scalable speech recognition and supports both streaming and offline applications.
Speech recognition researchers, AI engineers, and developers building production ASR systems who need reproducible, high-performance implementations of cutting-edge speech recognition models.
Developers choose wav2letter++ for its research-grade implementations of published ASR papers, efficient end-to-end architectures, and comprehensive toolkit for both training and deployment. It offers pre-trained models and reproducible recipes that bridge research and production applications.
Facebook AI Research's Automatic Speech Recognition Toolkit
Provides recipes and pre-trained models for multiple published ASR papers, such as Pratap et al. (2020) and Synnaeve et al. (2020), each tied to a specified Flashlight version so that results can be reproduced.
Trains models directly from audio to text without intermediate phonetic representations, and supports lexicon-free recognition.
Enables real-time speech recognition with low-latency ConvNet models, as in the Pratap et al. (2020) recipe for online ASR.
Implements convolutional architectures such as time-depth separable (TDS) convolutions, based on the research papers included in the repository.
Requires specific versions of Flashlight and ArrayFire with nonstandard CMake flags, making setup tedious and error-prone for new users, as detailed in the building instructions.
The project has been consolidated into Flashlight, so this repository is no longer actively developed and further updates are made there, potentially leaving users with outdated or unsupported code.
Focuses heavily on research reproducibility and training and lacks polished tools for deployment or integration, so production use may require significant custom engineering.