A fast command-line toolkit for indexing, slicing, analyzing, splitting, and joining CSV files, written in Rust.
xsv is a CSV command-line toolkit written in Rust that provides indexing, slicing, joining, and analysis operations. It targets the problem of slow or inadequate CSV tools on large datasets, using performance optimizations such as constant-time row slicing through an index and hash-based joins.
xsv is aimed at data engineers, analysts, and developers who need to efficiently inspect, transform, or analyze large CSV files from the command line, especially multi-gigabyte datasets.
Developers choose xsv for its speed and composability: it exposes performance trade-offs in the CLI and stays efficient when commands are combined, which suits it to large CSV data that slower tools struggle with.
Indexing enables constant-time row slicing and accelerates operations like statistics; the README demonstrates that slicing 10 rows from a 3-million-record file is instantaneous once an index has been created.
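To make the idea of index-backed slicing concrete, here is a minimal Python sketch. It is not xsv's implementation or index format; the function names `build_offset_index` and `slice_rows` are hypothetical, and the sketch assumes that no quoted field contains an embedded newline. It only shows why a table of byte offsets lets a reader jump straight to record N instead of scanning from the top.

```python
import csv
import io


def build_offset_index(data: bytes) -> list[int]:
    """Record the byte offset where each data record starts (header excluded).

    Simplification: one record per physical line, so quoted fields with
    embedded newlines are not handled.
    """
    offsets = []
    stream = io.BytesIO(data)
    stream.readline()  # skip the header row
    while True:
        pos = stream.tell()
        if not stream.readline():
            break
        offsets.append(pos)
    return offsets


def slice_rows(data: bytes, offsets: list[int], start: int, length: int) -> list[list[str]]:
    """Return up to `length` records beginning at record `start`.

    The seek is a single lookup in `offsets`, so its cost does not depend
    on how far into the file `start` is.
    """
    if start >= len(offsets):
        return []
    stream = io.BytesIO(data)
    stream.seek(offsets[start])
    count = min(length, len(offsets) - start)
    lines = [stream.readline().decode("utf-8") for _ in range(count)]
    return list(csv.reader(lines))


# Made-up sample data for illustration.
sample = b"city,pop\nA,1\nB,2\nC,3\nD,4\n"
index = build_offset_index(sample)
rows = slice_rows(sample, index, start=2, length=2)
```

Building the index costs one full pass over the file; every slice afterwards pays only for the records it returns.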
Joins use a simple hash index for speed, supporting inner, outer, and cross joins on large files in seconds, as shown in the tour with worldcitiespop.csv and countrynames.csv.
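The general hash-join technique can be sketched in a few lines of Python. This is an illustration of the approach, not xsv's code: `hash_inner_join` is a hypothetical helper, the rows are made-up data, and only the inner-join case is shown.

```python
from collections import defaultdict


def hash_inner_join(left_rows, left_key, right_rows, right_key):
    """Inner-join two lists of dict rows with a hash index on the right side."""
    # Build phase: one pass over the right table, grouping rows by key value.
    index = defaultdict(list)
    for row in right_rows:
        index[row[right_key]].append(row)

    # Probe phase: one pass over the left table, one hash lookup per row.
    joined = []
    for row in left_rows:
        for match in index.get(row[left_key], []):
            joined.append({**row, **match})
    return joined


# Made-up rows for illustration.
cities = [
    {"Country": "us", "City": "Boston"},
    {"Country": "fr", "City": "Paris"},
    {"Country": "zz", "City": "Nowhere"},
]
names = [
    {"Abbrev": "us", "Name": "United States"},
    {"Abbrev": "fr", "Name": "France"},
]
result = hash_inner_join(cities, "Country", names, "Abbrev")
```

Each table is read once, so the work grows with the combined size of the inputs rather than with their product, which is what a nested-loop comparison would cost.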
Commands are built to be piped together without performance loss, allowing complex workflows like filtering, selecting, and sampling in a single chain, as emphasized in the philosophy.
Statistics and frequency commands use parallelism when an index is present, which reduces analysis time; the tour shows stats running in 8 seconds with an index versus 12 seconds without one.
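One way to see why an index helps parallel statistics: once the record count and record positions are known, the file can be cut into contiguous ranges that workers process independently, and the partial results are then merged. The Python sketch below shows that structure on an in-memory list of numbers. All names are hypothetical, it assumes a non-empty input, and it is not a description of how xsv schedules its work. Python threads will not speed up this arithmetic in practice; the point is the split-and-merge shape.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_ranges(n_records: int, n_chunks: int) -> list[tuple[int, int]]:
    """Split [0, n_records) into contiguous half-open ranges (n_records > 0)."""
    size = -(-n_records // n_chunks)  # ceiling division
    return [(s, min(s + size, n_records)) for s in range(0, n_records, size)]


def partial_stats(values):
    """Summary of one chunk: (count, sum, min, max)."""
    return (len(values), sum(values), min(values), max(values))


def merge_stats(parts):
    """Combine per-chunk summaries into overall statistics."""
    count = sum(p[0] for p in parts)
    total = sum(p[1] for p in parts)
    return {
        "count": count,
        "mean": total / count,
        "min": min(p[2] for p in parts),
        "max": max(p[3] for p in parts),
    }


def parallel_stats(values, n_chunks: int = 4):
    """Compute count, mean, min, and max by summarizing chunks independently."""
    ranges = chunk_ranges(len(values), n_chunks)
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        parts = list(pool.map(lambda r: partial_stats(values[r[0]:r[1]]), ranges))
    return merge_stats(parts)


# Made-up values for illustration.
stats = parallel_stats([3, 1, 4, 1, 5, 9, 2, 6], n_chunks=3)
```

Without an index, a worker cannot know where record 1,000,000 begins without parsing everything before it, which is why the split step depends on the index.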
The project is marked as unmaintained in the README, with no updates, bug fixes, or support, and the author recommends alternatives like qsv or xan, making it risky for production use.
xsv has no GUI or interactive shell, which can hinder exploratory data analysis for users who are accustomed to visual tools or who want data manipulation beyond terminal commands.
Designed solely for CSV files, it cannot natively handle other common data formats like JSON, databases, or streaming inputs, requiring additional conversion steps in mixed-format workflows.