An extremely fast query engine for DataFrames, written in Rust, with multi-language frontends.
Polars is an extremely fast query engine for DataFrames, written in Rust. It is designed for analytical data processing, offering high performance through multi-threading, SIMD, and query optimization. It supports both eager and lazy execution, as well as streaming for datasets larger than RAM.
Polars is aimed at data engineers, data scientists, and developers who need high-performance DataFrame operations in Python, Rust, Node.js, or R, especially for large-scale data analysis.
Developers choose Polars for its exceptional speed, memory efficiency, and multi-language support, making it a powerful alternative to traditional DataFrame libraries like pandas for performance-critical applications.
Polars ranks among the top engines in the PDS-H benchmarks, drawing on its Rust implementation and SIMD for fast data operations.
Supports streaming execution for datasets larger than RAM, so a 250 GB dataset can be processed on a laptop by calling `collect(engine='streaming')` on a lazy query.
Offers APIs in Python, Rust, Node.js, R, and SQL, making it adaptable to diverse development stacks.
Has zero required dependencies and imports in about 70 ms, compared with about 520 ms for pandas, which reduces startup overhead.
Automatically optimizes queries in lazy execution mode, improving performance without manual tuning.
Maximal performance requires compiling from source with Rust and maturin, involving multiple build options and longer compile times, as detailed in the README.
Special installations like `polars[rtcompat]` for old CPUs or `polars[rt64]` for large row counts add version fragmentation and potential performance trade-offs, as noted in the README's legacy and big index sections.
Compared to pandas, Polars has fewer third-party integrations and community resources, which can hinder adoption for specialized use cases.
Scaling to distributed clusters requires Polars' managed cloud offering, introducing vendor dependency and potential lock-in, as mentioned in the README's managed/distributed section.