An audio processing toolbox using PyTorch 1D convolutional neural networks for on-the-fly spectrogram generation with trainable kernels.
nnAudio is an audio processing toolbox that uses PyTorch 1D convolutional neural networks to generate spectrograms and other audio features on-the-fly during neural network training. It solves the problem of integrating audio preprocessing into deep learning pipelines by making transformations differentiable and allowing kernels to be trainable, unlike traditional static audio libraries.
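The core idea, computing a spectrogram by sliding Fourier kernels over the waveform as a strided 1D convolution, can be sketched with the Python standard library alone. This is an illustrative sketch and not nnAudio's implementation: the function names (`fourier_kernels`, `conv1d`, `spectrogram`) are hypothetical, no window function is applied, and no padding is used.

```python
import math

def fourier_kernels(n_fft):
    """Build one cosine and one sine kernel per frequency bin (0..n_fft//2)."""
    bins = n_fft // 2 + 1
    cos_k = [[math.cos(2 * math.pi * k * n / n_fft) for n in range(n_fft)]
             for k in range(bins)]
    sin_k = [[math.sin(2 * math.pi * k * n / n_fft) for n in range(n_fft)]
             for k in range(bins)]
    return cos_k, sin_k

def conv1d(signal, kernel, stride):
    """Unpadded 1D cross-correlation, the operation deep learning
    frameworks call convolution. The stride plays the role of the hop length."""
    n = len(kernel)
    return [sum(signal[s + i] * kernel[i] for i in range(n))
            for s in range(0, len(signal) - n + 1, stride)]

def spectrogram(signal, n_fft=64, hop=32):
    """Magnitude spectrogram as nested lists indexed [freq_bin][frame]."""
    cos_k, sin_k = fourier_kernels(n_fft)
    spec = []
    for ck, sk in zip(cos_k, sin_k):
        real = conv1d(signal, ck, hop)   # real part of each frame's DFT bin
        imag = conv1d(signal, sk, hop)   # imaginary part (up to sign)
        spec.append([math.hypot(r, i) for r, i in zip(real, imag)])
    return spec

# A sinusoid with exactly 8 cycles per 64-sample frame should peak in bin 8.
n_fft = 64
signal = [math.sin(2 * math.pi * 8 * n / n_fft) for n in range(256)]
spec = spectrogram(signal, n_fft=n_fft, hop=32)
peak_bin = max(range(len(spec)), key=lambda k: spec[k][0])
print(peak_bin)
```

Because every step is a multiply-and-sum against a kernel, the same computation maps directly onto a convolution layer, which is what makes the kernels candidates for training.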
Researchers and developers working on audio deep learning projects, such as music information retrieval, speech processing, or sound classification, who need trainable and GPU-accelerated audio preprocessing within PyTorch.
Developers choose nnAudio for its trainable audio kernels and direct integration with PyTorch, which together enable end-to-end differentiable audio processing. Because it relies mainly on PyTorch convolution layers, it is positioned as more portable across operating systems and more flexible than libraries such as torchaudio.
Audio processing using a PyTorch 1D convolutional network
Allows Fourier and CQT kernels to be trained as part of neural networks, enabling adaptive feature extraction unlike static libraries like torchaudio.
Full GPU support for faster spectrogram generation during training, outperforming CPU-only tools like librosa in speed for deep learning pipelines.
Includes STFT, Mel, MFCC, CQT, VQT, Gammatone, and CFP features, offering a wide range of audio transformations in one toolbox.
Differentiable operations enable end-to-end gradient flow, making it ideal for building and training neural networks with integrated audio preprocessing.
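What "trainable kernels" and "end-to-end gradient flow" mean in practice can be sketched in plain PyTorch. The sketch below does not use nnAudio's API and is not its implementation. It initialises an ordinary `nn.Conv1d` with Fourier kernels and checks that a loss computed on the spectrogram produces gradients for those kernels. The sizes (`n_fft`, `hop`) and the helper `magnitude_spectrogram` are assumptions chosen for illustration.

```python
import math
import torch
import torch.nn as nn

n_fft, hop = 64, 32
bins = n_fft // 2 + 1

# Cosine and sine Fourier kernels, stacked as (2 * bins, n_fft).
n = torch.arange(n_fft, dtype=torch.float32)
k = torch.arange(bins, dtype=torch.float32).unsqueeze(1)
angle = 2 * math.pi * k * n / n_fft
kernels = torch.cat([torch.cos(angle), torch.sin(angle)], dim=0)

# A standard Conv1d layer whose weights start as the Fourier basis.
# Its weight is an nn.Parameter, so an optimizer could update it.
conv = nn.Conv1d(1, 2 * bins, kernel_size=n_fft, stride=hop, bias=False)
with torch.no_grad():
    conv.weight.copy_(kernels.unsqueeze(1))   # (2 * bins, 1, n_fft)

def magnitude_spectrogram(wave):
    """wave: (batch, samples) -> (batch, bins, frames)."""
    out = conv(wave.unsqueeze(1))             # (batch, 2 * bins, frames)
    real, imag = out[:, :bins], out[:, bins:]
    # Small epsilon keeps the square root differentiable at zero.
    return torch.sqrt(real ** 2 + imag ** 2 + 1e-8)

wave = torch.randn(2, 256)                    # two random 256-sample clips
spec = magnitude_spectrogram(wave)
loss = spec.mean()                            # placeholder loss for the demo
loss.backward()                               # gradients reach the kernels
print(spec.shape, conv.weight.grad.shape)
```

Moving `conv` and `wave` to a CUDA device with `.to("cuda")` would run the same computation on a GPU. Setting `conv.weight.requires_grad_(False)` would freeze the kernels so the layer behaves like a fixed transform.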
The README notes the original author lacks time for regular code review and seeks maintainers, risking stalled development and bug fixes.
Unit tests require at least 1931 MiB GPU memory, which can be prohibitive for resource-constrained setups or small-scale experiments.
Pending features like invertible CQT indicate some transformations aren't fully reversible yet, limiting applications needing precise reconstruction.