Open-Awesome

nnAudio

MIT · Python · v0.2.0

An audio processing toolbox using PyTorch 1D convolutional neural networks for on-the-fly spectrogram generation with trainable kernels.

GitHub · 1.1k stars · 97 forks · 0 contributors

What is nnAudio?

nnAudio is an audio processing toolbox that uses PyTorch 1D convolutional neural networks to generate spectrograms and other audio features on-the-fly during neural network training. It solves the problem of integrating audio preprocessing into deep learning pipelines by making transformations differentiable and allowing kernels to be trainable, unlike traditional static audio libraries.
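
The idea of computing a spectrogram with a 1D convolution can be sketched in plain PyTorch. This is an illustration of the general technique under stated assumptions, not nnAudio's actual code: the names `fourier_kernels` and `conv_stft_magnitude`, the Hann window, and the FFT and hop sizes are all hypothetical choices made for the example.

```python
import math
import torch
import torch.nn.functional as F

def fourier_kernels(n_fft):
    """Build windowed real/imaginary DFT basis kernels shaped for conv1d."""
    n_bins = n_fft // 2 + 1
    n = torch.arange(n_fft, dtype=torch.float64)
    k = torch.arange(n_bins, dtype=torch.float64).unsqueeze(1)
    phase = 2 * math.pi * k * n / n_fft               # (n_bins, n_fft)
    window = torch.hann_window(n_fft, dtype=torch.float64)
    real = (torch.cos(phase) * window).unsqueeze(1)   # (n_bins, 1, n_fft)
    imag = (-torch.sin(phase) * window).unsqueeze(1)  # (n_bins, 1, n_fft)
    return real.float(), imag.float()

def conv_stft_magnitude(wave, n_fft=512, hop=128):
    """Map wave (batch, samples) to magnitudes (batch, n_bins, frames)."""
    real_k, imag_k = fourier_kernels(n_fft)
    x = wave.unsqueeze(1)                             # (batch, 1, samples)
    # Each output channel is one frequency bin; the stride is the hop size.
    real = F.conv1d(x, real_k, stride=hop)
    imag = F.conv1d(x, imag_k, stride=hop)
    return torch.sqrt(real ** 2 + imag ** 2)

# One second of a 1 kHz sine sampled at 16 kHz (assumed test signal).
sr = 16000
t = torch.arange(sr) / sr
wave = torch.sin(2 * math.pi * 1000.0 * t).unsqueeze(0)

spec = conv_stft_magnitude(wave)
peak_bin = spec.mean(dim=-1).argmax(dim=-1).item()
print(spec.shape, peak_bin)
```

With a bin spacing of 16000 / 512 = 31.25 Hz, the 1 kHz tone should peak at bin 32. Because the transform is an ordinary convolution, it runs wherever the input tensor lives and participates in autograd like any other layer.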

Target Audience

Researchers and developers working on audio deep learning projects, such as music information retrieval, speech processing, or sound classification, who need trainable and GPU-accelerated audio preprocessing within PyTorch.

Value Proposition

Developers choose nnAudio for its trainable audio kernels and tight PyTorch integration, which enable end-to-end differentiable audio processing. Because it relies on PyTorch's 1D convolution layers, it is positioned as a more portable and flexible alternative in environments where installing torchaudio is difficult.

Overview

Audio processing using PyTorch 1D convolutional networks.

Use Cases

Best For

  • Training neural networks with learnable audio feature extractors
  • Real-time spectrogram generation in GPU-accelerated pipelines
  • Music information retrieval tasks requiring CQT or VQT features
  • Cross-platform audio processing where torchaudio installation is problematic
  • Research projects needing differentiable audio transformations
  • Building end-to-end audio deep learning models in PyTorch

Not Ideal For

  • Projects using non-PyTorch frameworks like TensorFlow or JAX
  • Simple audio preprocessing tasks that don't require differentiability or GPU acceleration
  • Environments with limited GPU memory (e.g., under 2 GB for full functionality)
  • Production systems needing guaranteed long-term maintenance and active support

Pros & Cons

Pros

Trainable Audio Kernels

Fourier and CQT kernels can be trained as part of a neural network, so feature extraction adapts to the task instead of staying fixed as in static libraries such as torchaudio.
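
A minimal sketch of what "trainable kernels" means in practice, assuming a hand-written module rather than nnAudio's own layers: the DFT basis is stored as an `nn.Parameter`, so an optimizer step can move it away from its Fourier initialization. The class name `TrainableFourierFrontend`, its `trainable` flag, and the placeholder loss are hypothetical.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class TrainableFourierFrontend(nn.Module):
    """Hypothetical front end: DFT kernels held as a learnable parameter."""

    def __init__(self, n_fft=64, hop=16, trainable=True):
        super().__init__()
        self.hop = hop
        n = torch.arange(n_fft, dtype=torch.float32)
        k = torch.arange(n_fft // 2 + 1, dtype=torch.float32).unsqueeze(1)
        phase = 2 * math.pi * k * n / n_fft
        # Stack real and imaginary bases: (2 * n_bins, 1, n_fft).
        kernels = torch.cat([torch.cos(phase), -torch.sin(phase)], dim=0)
        self.kernels = nn.Parameter(kernels.unsqueeze(1), requires_grad=trainable)

    def forward(self, wave):
        out = F.conv1d(wave.unsqueeze(1), self.kernels, stride=self.hop)
        real, imag = out.chunk(2, dim=1)
        # Small epsilon keeps the square root's gradient finite at zero.
        return torch.sqrt(real ** 2 + imag ** 2 + 1e-8)

torch.manual_seed(0)
frontend = TrainableFourierFrontend()
initial = frontend.kernels.detach().clone()
optimizer = torch.optim.SGD(frontend.parameters(), lr=1e-3)

wave = torch.randn(4, 1024)        # random stand-in for a batch of audio
loss = frontend(wave).mean()       # placeholder loss, for illustration only
optimizer.zero_grad()
loss.backward()
optimizer.step()

kernels_changed = not torch.equal(initial, frontend.kernels.detach())
print(kernels_changed)
```

Setting `trainable=False` in this sketch freezes the kernels at their Fourier values, which corresponds to the fixed behavior of a static transform.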

GPU-Accelerated Processing

Spectrograms are computed on the GPU during training, which is faster in deep learning pipelines than CPU-only tools such as librosa.

Comprehensive Feature Set

Includes STFT, Mel, MFCC, CQT, VQT, Gammatone, and CFP features, offering a wide range of audio transformations in one toolbox.

Seamless PyTorch Integration

Differentiable operations enable end-to-end gradient flow, making it ideal for building and training neural networks with integrated audio preprocessing.
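
End-to-end gradient flow can be illustrated with a small model whose first step is a spectrogram. The sketch below uses `torch.stft` as a stand-in differentiable front end, not nnAudio's own layers, and the `SpectrogramClassifier` name, layer sizes, and labels are assumptions made for the example.

```python
import torch
import torch.nn as nn

class SpectrogramClassifier(nn.Module):
    """Hypothetical model: spectrogram front end followed by a linear head."""

    def __init__(self, n_fft=256, hop=64, n_classes=3):
        super().__init__()
        self.n_fft, self.hop = n_fft, hop
        self.register_buffer("window", torch.hann_window(n_fft))
        self.head = nn.Linear(n_fft // 2 + 1, n_classes)

    def forward(self, wave):
        spec = torch.stft(wave, self.n_fft, hop_length=self.hop,
                          window=self.window, return_complex=True).abs()
        features = torch.log1p(spec).mean(dim=-1)   # (batch, n_bins)
        return self.head(features)

torch.manual_seed(0)
model = SpectrogramClassifier()
wave = torch.randn(2, 4000, requires_grad=True)     # stand-in audio batch
labels = torch.tensor([0, 2])

loss = nn.functional.cross_entropy(model(wave), labels)
loss.backward()

# Gradients reach both the classifier weights and the raw waveform.
print(model.head.weight.grad.shape, wave.grad.shape)
```

Since the loss is connected to the waveform through the spectrogram, the preprocessing step needs no separate offline pass and can sit inside the training loop.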

Cons

Maintenance Uncertainty

The README notes that the original author lacks time for regular code review and is seeking maintainers, so development and bug fixes may stall.

High GPU Memory Demands

The unit tests require at least 1931 MiB of GPU memory, which can be prohibitive for resource-constrained setups or small-scale experiments.
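
One way to guard against this is to check the device before running GPU work. The following is a minimal sketch, assuming the 1931 MiB figure above as the threshold; the helper names `gpu_memory_mib` and `pick_device` are hypothetical, and the check looks at total device memory, not at what is currently free.

```python
import torch

REQUIRED_MIB = 1931  # threshold taken from the figure quoted above

def gpu_memory_mib(device_index=0):
    """Return total memory of a CUDA device in MiB, or None without CUDA."""
    if not torch.cuda.is_available():
        return None
    props = torch.cuda.get_device_properties(device_index)
    return props.total_memory / (1024 ** 2)

def pick_device(required_mib=REQUIRED_MIB):
    """Use the GPU only if its total memory meets the requirement."""
    total = gpu_memory_mib()
    if total is not None and total >= required_mib:
        return torch.device("cuda")
    return torch.device("cpu")

device = pick_device()
print(device)
```

Falling back to the CPU keeps small experiments runnable, at the cost of the speed advantage that GPU processing provides.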

Incomplete Feature Inversion

Pending features like invertible CQT indicate some transformations aren't fully reversible yet, limiting applications needing precise reconstruction.

Quick Stats

Stars: 1,123
Forks: 97
Contributors: 0
Open Issues: 21
Last commit: 4 months ago
Created: 2019

Tags

#neural-network #spectrogram #music-information-retrieval #deep-learning #signal-processing #gpu-acceleration #audio-processing #convolutional-neural-networks #pytorch

Built With

  • librosa
  • NumPy
  • PyTorch
  • SciPy

Included in

Scientific Audio · 1.7k
Auto-fetched 1 day ago

Related Projects

TorchAudio

Data manipulation and transformation for audio signal processing, powered by PyTorch

Stars: 2,873
Forks: 770
Last commit: 2 days ago

Kapre

kapre: Keras Audio Preprocessors

Stars: 944
Forks: 148
Last commit: 6 months ago