A native Golang ETL streaming parser and transform library for CSV, JSON, XML, EDI, text, and custom formats.
Omniparser is a native Golang ETL parser that ingests input data in various formats (CSV, fixed-length text, XML, EDI/X12/EDIFACT, JSON, and custom formats) in a streaming fashion and transforms it into the desired JSON output according to a schema that is itself written in JSON. It addresses the lack of robust, streaming-capable ETL libraries in the Go ecosystem, offering a flexible building block for data processing pipelines.
Omniparser is aimed at Go developers building data processing pipelines that need streaming ingestion of multiple file formats without loading entire inputs into memory. It is particularly suited to engineers working with EDI, hierarchical data, or custom formats who need a schema-driven transformation tool.
Developers choose Omniparser because it provides a performant, extensible, and streaming-efficient ETL library built specifically for Go, filling a gap where alternatives in other languages are limited, heavyweight, or lack streaming support. Its distinguishing features include a JSON-based schema system, support for complex formats like EDI/X12/EDIFACT, and extensibility through custom functions and plugins.
Processes large CSV, XML, and EDI files without loading entire inputs into memory, addressing a key limitation in other libraries that lack streaming support.
Supports complex formats like EDI/X12/EDIFACT, which alternatives such as Jolt or JSONata cannot handle, making it unique in the Go ecosystem.
Allows adding custom functions, schema handlers, and new file formats via plugins, enabling adaptation to proprietary data sources as shown in the custom file format examples.
Includes a `javascript` custom function with full ES6 support, providing advanced data manipulation within schemas, as detailed in the custom functions documentation.
The online playground is non-functional, which removes a valuable tool for testing schemas without local setup and forces users to rely on the CLI or their own code.
Writing schemas requires understanding XPath, custom functions, and nested JSON structures, leading to a steep learning curve, especially for hierarchical data.
Major updates have introduced incompatible schema changes, such as the shift to `omni.2.1` with renamed fields, which can disrupt existing pipelines.