A sharp cut(1) clone with regex delimiters, column reordering, and automatic decompression for data exploration.
hck is a close drop-in replacement for the Unix cut command, enhanced with regex-based delimiters, flexible column selection, and automatic decompression. It fills the gap between cut and awk by making common data manipulation tasks—like reordering fields or splitting on complex patterns—simple and fast, ideal for interactive dataset exploration.
hck is aimed at developers, data engineers, and system administrators who frequently work with delimited text data (e.g., CSV, TSV, log files) on the command line and need more flexibility than cut but less complexity than awk.
Developers choose hck over alternatives because it combines the simplicity of cut with powerful enhancements like regex delimiters, column reordering, and automatic decompression, all while maintaining high performance for everyday data-wrangling tasks.
hck can split on complex patterns such as runs of whitespace (\s+) or multi-character delimiters, which removes the need to pre-process input with tools like tr.
Output columns can be specified in any order using intuitive cut-style syntax, making it easy to rearrange data for analysis without complex scripting.
With the -z flag, hck decompresses input files based on common extensions (e.g., .gz, .bz2), reducing manual steps in the same way ripgrep does.
Columns can also be selected by header name or by regex using the -F and -r flags, which simplifies work with structured data files that have descriptive headers.
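To make regex splitting and column reordering concrete, here is a minimal Python sketch of the idea. It is not hck's implementation or its command-line syntax: the function name `select_columns`, its parameters, and the tab output separator are assumptions chosen for illustration only.

```python
import re


def select_columns(line, delimiter_pattern, fields, output_delimiter="\t"):
    """Split `line` on a regex and return the requested fields.

    `fields` holds 1-based column numbers and the output follows the
    order in which they are listed, so [3, 1] puts column 3 first.
    Column numbers outside the line's range are skipped.
    """
    parts = re.split(delimiter_pattern, line.strip())
    selected = [parts[i - 1] for i in fields if 0 < i <= len(parts)]
    return output_delimiter.join(selected)


# Hypothetical sample data with irregular runs of spaces between fields.
rows = ["alice   30  engineer", "bob 41     analyst"]
for row in rows:
    print(select_columns(row, r"\s+", [3, 1]))
```

Splitting on `\s+` treats each run of spaces as a single delimiter, so no empty fields appear, and listing the fields as `[3, 1]` emits the third column before the first.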
Like cut, hck does not respect CSV quoting rules, so it can split incorrectly on data with embedded delimiters or quotes. The README lists full CSV handling among its non-goals, which limits its use for standard CSV processing.
Mixing by-index and by-header selections can produce an unintuitive output order; the README warns about unexpected results when the two kinds of field selection are combined.
Limited to processing data line-by-line, so it cannot handle records that span multiple lines or delimiters containing newlines, restricting utility for certain log formats or complex data.
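The CSV quoting limitation is easy to see with a small Python sketch. It compares a plain delimiter split, which is how a cut-style tool treats a line, with Python's quoting-aware `csv` module. The sample record is made up for illustration, and the code models only the general behavior, not hck itself.

```python
import csv
import io

# Hypothetical record whose second field contains an embedded comma.
record = '1,"Doe, Jane",Boston'

# Cut-style split: every comma is a delimiter and quotes are ignored.
naive = record.split(",")

# Quoting-aware parse: the quoted field is kept as a single value.
parsed = next(csv.reader(io.StringIO(record)))

print(len(naive), naive)
print(len(parsed), parsed)
```

The plain split yields four fields because the quoted name is cut in two, while the CSV parser returns the three intended fields. For data like this, a CSV-aware tool is the safer choice.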