A high-performance, fully-featured CSV parser and serializer for modern C++ with streaming, random access, and robust format handling.
Vince's CSV Parser is a high-performance, feature-rich library for reading and writing CSV files in C++. It solves the problem of efficiently processing large datasets (including files larger than RAM) while providing a simple, intuitive API for common tasks like streaming, random access, numeric conversion, and JSON output. It handles various CSV dialects robustly and includes both streaming and in-memory data structures.
It is aimed at C++ developers working on data-intensive applications such as data analysis pipelines, ETL processes, and scientific computing, or on any project that needs fast, reliable CSV parsing and serialization.
Developers choose this parser for its exceptional performance on large files, its comprehensive feature set (including a DataFrame for random access), its robust handling of real-world CSV variations, and its well-documented, intuitive API. It avoids unnecessary complexity while providing advanced capabilities like threading, memory-mapped I/O, and type-safe conversions.
A modern C++ CSV parser and serializer that doesn't make you choose between ease of use and performance.
Uses memory-mapped I/O and overlapped threading to parse multi-gigabyte files at speeds over 1 GB/s, even when files exceed RAM, as benchmarked with real datasets like the 1.4 GB Craigslist vehicles file.
Complies with RFC 4180 while supporting automatic delimiter guessing, variable column lengths, custom quoting, and trimming, adapting to real-world CSV dialects without manual tweaking.
Provides streaming iterators for large files with minimal memory footprint and an in-memory DataFrame for random access, updates, and grouping operations, catering to both streaming and analytical use cases.
Offers lazy numeric conversions with overflow protection and non-throwing try_get methods, plus support for hex and decimal parsing, ensuring data integrity without undefined behavior.
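The idea of a non-throwing, overflow-checked conversion can be sketched with the standard library alone. The helper below is hypothetical (the name try_parse_int and its signature are assumptions, not the library's API) and assumes C++17 for std::from_chars. It only illustrates the general technique of reporting failure through a return value instead of throwing or overflowing silently.

```cpp
#include <charconv>
#include <string_view>
#include <system_error>

// Hypothetical helper, not part of the library: parses the whole field as an
// integer in the given base and reports failure through the return value.
template <typename Int>
bool try_parse_int(std::string_view field, Int& out, int base = 10) {
    const char* first = field.data();
    const char* last = first + field.size();
    Int value{};
    auto [ptr, ec] = std::from_chars(first, last, value, base);
    // ec is result_out_of_range on overflow and invalid_argument when no
    // digits were found; ptr != last means trailing characters were left over.
    if (ec != std::errc{} || ptr != last) {
        return false;  // out is left untouched on failure
    }
    out = value;
    return true;
}
```

Note that std::from_chars does not accept a "0x" prefix in base 16, so a caller handling hex fields written that way would need to strip the prefix first.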
Limited to C++ projects: C++11 is the minimum and C++17 is recommended. It also requires exceptions to be enabled, which may not suit all environments or legacy systems.
CSVReader::iterator is an input iterator, not a forward iterator, so algorithms such as std::max_element require copying rows to a vector first. This is a non-obvious trap that can cause heap-use-after-free errors with large files.
Memory-mapped I/O, key to performance, doesn't work on all platforms (e.g., WebAssembly forces fallback to streams), and threading is auto-disabled in some builds, reducing throughput in constrained environments.
As a C++ library, it lacks direct integration with popular data science tools or frameworks outside C++, and its DataFrame is basic compared to full-fledged libraries like pandas in Python.
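The input-iterator limitation noted above can be illustrated without the library itself. In this minimal sketch, std::istream_iterator stands in for a single-pass reader (it is also an input iterator), and max_value is a hypothetical function name. The pattern is the one the limitation calls for: materialize the range into a vector, then run the multi-pass algorithm on the vector.

```cpp
#include <algorithm>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Stand-in for a single-pass reader: std::istream_iterator is an input
// iterator, so it carries the same restriction as a streaming CSV iterator.
int max_value(const std::string& text) {
    std::istringstream in(text);
    std::istream_iterator<int> first(in), last;

    // Copy the single-pass range into owned storage first...
    std::vector<int> values(first, last);
    if (values.empty()) {
        return 0;  // Arbitrary fallback chosen for this sketch
    }

    // ...then run the multi-pass algorithm on the vector's iterators.
    return *std::max_element(values.begin(), values.end());
}
```

std::max_element requires forward iterators because it keeps an iterator to the best element seen so far while it continues advancing. With a single-pass source, that saved position may no longer refer to valid data once the reader has moved on.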