A command-line tool that provides jq-style access to structured data sources like SQL databases, CSV, and Excel files.
sq is a command-line data wrangling tool that lets you query diverse data sources, including SQL databases and document formats such as CSV, Excel, and JSON, as if they were all SQL databases. It lets users inspect, query, join, and transform data across these sources through one interface, so there is no need to switch between tools. The tool describes itself as the 'lovechild of sql+jq,' combining SQL's query power with jq's flexibility for structured data.
sq is aimed at data engineers, analysts, and developers who need to query, join, and transform data across multiple formats (e.g., SQL databases, CSV, Excel, JSON) from the command line. It is particularly useful for data integration, migration, or inspection tasks in environments such as Kubernetes, local development, or scripting pipelines.
Developers choose sq for its ability to perform cross-source joins (e.g., joining a CSV file with a Postgres table) and its support for multiple output formats (JSON, Excel, CSV, HTML, etc.), eliminating the need for separate tools. Its unique selling point is treating disparate data sources as unified SQL databases, offering a cohesive command-line experience with features like data diffing, result insertion into databases, and UNIX pipe integration.
sq data wrangler
sq enables joining data across disparate sources like CSV files and Postgres tables using a unified SQL interface, as demonstrated in the README with examples of multi-source joins.
It supports exporting query results to numerous formats including JSON, Excel, CSV, HTML, and Markdown, eliminating the need for separate conversion tools.
Commands like sq inspect and sq diff provide detailed metadata and comparison capabilities for schemas and row data, useful for migrations and validation tasks.
sq accepts file-based sources such as CSV or Excel through UNIX pipes for on-the-fly processing, which makes it easy to use in shell scripts and automation pipelines.
sq supports a limited set of databases (e.g., no Oracle or MongoDB), and some drivers, such as ClickHouse, are in beta, which may restrict its use in diverse environments.
Cross-source joins and operations on large datasets can incur performance penalties compared to native database tools, due to the abstraction layer and memory constraints.
As a command-line tool, sq lacks graphical interfaces for query building or data visualization, which might be a barrier for users accustomed to GUI-based ETL tools.
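To make the cross-source join idea concrete, here is a minimal Python sketch of the general technique: copy rows from a CSV document into a scratch SQL database, join them against an existing table, and emit the result as JSON. This is not sq's implementation or command syntax. The sample data, the table names (customers, orders), and the helpers load_csv and join_sources are all hypothetical, and an in-memory SQLite database stands in for a real server such as Postgres.

```python
import csv
import io
import json
import sqlite3

# Hypothetical sample data standing in for a CSV file on disk.
CSV_TEXT = "customer_id,amount\n1,20.5\n2,7.0\n1,4.5\n"


def load_csv(conn, table, text):
    """Copy CSV rows into a SQL table so they can be queried like any other table."""
    rows = list(csv.DictReader(io.StringIO(text)))
    conn.execute(f"CREATE TABLE {table} (customer_id INTEGER, amount REAL)")
    # Column type affinity converts the CSV strings to numbers on insert.
    conn.executemany(
        f"INSERT INTO {table} VALUES (:customer_id, :amount)", rows
    )


def join_sources(csv_text):
    """Join the CSV-backed table with a 'database' table and total the amounts."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row

    # Hypothetical table standing in for one that lives in a real database.
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Linus")]
    )

    load_csv(conn, "orders", csv_text)

    cur = conn.execute(
        "SELECT c.name AS name, SUM(o.amount) AS total "
        "FROM orders o JOIN customers c ON c.id = o.customer_id "
        "GROUP BY c.id ORDER BY c.id"
    )
    return [dict(row) for row in cur]


if __name__ == "__main__":
    # Emit the joined result as JSON, one of several possible output formats.
    print(json.dumps(join_sources(CSV_TEXT), indent=2))
```

The sketch also hints at the performance caveat noted above: the document rows are copied into a database before the join can run, which costs time and memory as the inputs grow.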