A high-performance GPU-accelerated Fast Fourier Transform library supporting Vulkan, CUDA, HIP, OpenCL, Level Zero, and Metal backends.
VkFFT is a GPU-accelerated Fast Fourier Transform library that enables high-performance multidimensional FFT computations across multiple GPU APIs, including Vulkan, CUDA, HIP, OpenCL, Level Zero, and Metal. It addresses the need for a vendor-agnostic, open-source alternative to proprietary FFT libraries, combining competitive performance with broad hardware support. The library supports complex, real, and discrete cosine transforms, with dedicated algorithms for prime-length sequences.
VkFFT is aimed at developers and researchers working on GPU-accelerated scientific computing, signal processing, or simulation projects that require high-performance FFT computations across diverse GPU hardware.
Developers choose VkFFT for its cross-API compatibility, performance advantages over proprietary libraries like cuFFT, and support for advanced features like in-place transforms, multiple precisions, and convolution operations. Its open-source nature and vendor-agnostic design make it ideal for projects targeting multiple GPU platforms.
Supports Vulkan, CUDA, HIP, OpenCL, Level Zero, and Metal backends, enabling use on Nvidia, AMD, Intel, and Apple GPUs across Windows, Linux, and macOS, as described in the README.
Performs all transformations in-place without performance loss. According to the benchmark plots in the README, it outperforms cuFFT in double precision on GPUs such as the Nvidia A100, measured by achieved global bandwidth.
Implements radix-2/3/4/5/7/8/11/13 kernels along with Rader's and Bluestein's algorithms, which handle prime-length and other arbitrary-length sequences so that non-power-of-two sizes remain efficient.
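One way to read the radix list above: a length whose prime factors are all at most 13 can be decomposed with the listed radices alone, while any other length needs a Rader- or Bluestein-style step. The sketch below illustrates that classification only. The function names are hypothetical, and it does not reproduce VkFFT's actual dispatch logic.

```python
# Prime radices from the list above (4 and 8 are powers of 2, so 2 covers them).
SMALL_PRIMES = (2, 3, 5, 7, 11, 13)


def leftover_after_small_radices(n: int) -> int:
    """Divide out every listed small prime and return what remains of n."""
    if n < 1:
        raise ValueError("FFT length must be positive")
    for p in SMALL_PRIMES:
        while n % p == 0:
            n //= p
    return n


def needs_fallback(n: int) -> bool:
    """True when n has a prime factor larger than 13, so the small
    radices alone cannot decompose it."""
    return leftover_after_small_radices(n) != 1


if __name__ == "__main__":
    for length in (1024, 1001, 17, 34):
        print(length, "needs fallback:", needs_fallback(length))
```

Under this reading, 1024 (a power of two) and 1001 (7 x 11 x 13) decompose fully, while 17 and 34 leave a factor of 17 behind.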
Offers single, double, half, and quad precision, with optimized real-to-complex transforms that run up to 2x faster and use less memory than complex-to-complex transforms of the same size, catering to diverse numerical accuracy needs.
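The memory saving for real-to-complex transforms follows from a general property of the DFT: for real input of length N, output bin N-k is the complex conjugate of bin k, so only N/2 + 1 bins need to be stored. The sketch below checks this with a naive O(N^2) DFT in pure Python. It illustrates the symmetry only and makes no assumption about VkFFT's memory layout; all function names are hypothetical.

```python
import cmath


def naive_dft(x):
    """Direct O(N^2) DFT, used here only to demonstrate the symmetry."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
        for k in range(n)
    ]


def half_spectrum(x):
    """Keep only bins 0 .. N/2, which is all a real input requires."""
    return naive_dft(x)[: len(x) // 2 + 1]


def rebuild_full_spectrum(half, n):
    """Recover bins N/2+1 .. N-1 from the conjugate symmetry X[N-k] = conj(X[k])."""
    full = list(half)
    for k in range(len(half), n):
        full.append(half[n - k].conjugate())
    return full


if __name__ == "__main__":
    signal = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, -2.0, 4.0]
    half = half_spectrum(signal)
    print(len(signal), "real samples ->", len(half), "stored complex bins")
```

For the 8-sample signal in the usage example, 5 stored bins are enough to rebuild all 8.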
Half-precision transforms compute in single precision and use half precision only for data storage, so they may not fully exploit GPU hardware for reduced-precision workloads, as the README notes.
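The split between storage precision and compute precision can be illustrated in general terms with Python's `struct` module, which can round a value to IEEE 754 half (`e`) or single (`f`) precision. This is a minimal sketch of the idea, not VkFFT's shader code, and the helper names are hypothetical. It compares a running sum kept in half precision with one kept in single precision and rounded to half only when stored.

```python
import struct


def to_half(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_single(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def sum_half_accumulator(values):
    """Inputs stored as half, and every intermediate sum rounded to half."""
    acc = 0.0
    for v in values:
        acc = to_half(acc + to_half(v))
    return acc


def sum_single_accumulator(values):
    """Inputs stored as half, arithmetic in single, result stored as half."""
    acc = 0.0
    for v in values:
        acc = to_single(acc + to_half(v))
    return to_half(acc)


if __name__ == "__main__":
    ones = [1.0] * 4096
    print("half accumulator:  ", sum_half_accumulator(ones))
    print("single accumulator:", sum_single_accumulator(ones))
```

With 4096 ones, the half-precision accumulator stalls at 2048 because 2049 is not representable in half precision, while the single-precision accumulator reaches 4096. The example shows the accuracy side of computing in single precision; the limitation described above is that this approach does not gain the throughput of native half-precision arithmetic.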
Lacks native support for splitting jobs across multiple GPUs; it's listed as an ambitious future feature, limiting scalability for large-scale distributed computations.
Requires manual backend configuration (e.g., VKFFT_BACKEND definitions) and dependencies like glslang for Vulkan or NVRTC for CUDA, increasing initial integration effort and potential for errors.
VkFFT is an open-source alternative to the following products: