A lightweight, header-only C++ tensor algebra framework delivering bare-metal performance for small matrix/tensor operations via compile-time optimizations.
Fastor is a lightweight, high-performance tensor algebra framework for modern C++. It provides a high-level interface for manipulating multi-dimensional arrays while delivering bare-metal performance for small matrix/tensor operations through compile-time optimizations and explicit SIMD vectorization. It solves the problem of achieving vendor-library performance in C++ without heavy dependencies or runtime overhead.
Scientific programmers, computational engineers, and developers working on performance-critical numerical applications in C++, especially those in fields like computational mechanics, physics simulations, and embedded systems.
Developers choose Fastor for its unique combination of a high-level, intuitive API with compile-time operation minimization and SIMD optimization, resulting in performance rivaling specialized libraries like MKL JIT, all in a lightweight, header-only package with zero dynamic allocations.
A lightweight, high-performance tensor algebra framework for modern C++
Performs graph optimization and symbolic manipulation at compile time to reduce expression complexity: its smart expression templates rewrite inefficient operations such as trace(matmul(transpose(A),B)) into the mathematically equivalent inner(A,B).
Uses explicit SIMD vectorization for all numeric types, with configurable backends such as SLEEF and Vc, delivering performance on par with vendor libraries like MKL JIT for small matrices.
Has no external dependencies, compiles quickly, and performs zero dynamic memory allocations, making it well suited to embedded systems and FPGAs.
Provides intuitive syntax for tensor operations, including Einstein summation (einsum) and powerful slicing with seq/fseq, reminiscent of scientific environments such as Python/NumPy.
Tensors must have fixed dimensions at compile time, limiting flexibility for applications with dynamically sized data that cannot be determined ahead of time.
Heavy reliance on C++ templates and compile-time techniques can lead to cryptic error messages and a steep learning curve, especially for developers unfamiliar with modern C++.
Optimized for small tensors that fit in cache; performance may degrade for very large tensors without optional JIT backends like MKL, which require additional setup.