A Haskell library for streaming data processing with constant memory usage, deterministic resource handling, and easy composition.
Conduit is a Haskell library for building streaming data pipelines. It processes large or infinite data streams in constant memory and releases scarce resources, such as file handles and network connections, deterministically. Pipelines are assembled from composable components that produce, transform, and consume data.
Conduit is aimed at Haskell developers building data-intensive applications, such as file processors, network servers, or data transformation tools, that need efficient streaming and resource management.
Developers choose Conduit for its constant memory usage, timely resource deallocation, and straightforward composition of streaming components, all of which are harder to achieve with plain lists or hand-written loops.
Processes large or infinite streams without loading entire datasets into memory, as demonstrated in examples summing numbers or copying files with sourceFile and sinkFile.
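As a minimal sketch of the constant-memory claim, the pipeline below sums a million integers without ever building a list of them. It assumes the `conduit` package and its `Conduit` convenience module; the name `total` is only illustrative.

```haskell
import Conduit

-- Each value is produced by enumFromToC and immediately folded by sumC,
-- so no intermediate list of the whole range is materialized.
total :: Int
total = runConduitPure $ enumFromToC 1 1000000 .| sumC

main :: IO ()
main = print total  -- 500000500000
```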
Uses runConduitRes with bracketP to ensure timely release of resources like file handles, preventing leaks even with exceptions, shown in file copying and directory traversal examples.
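The file-copy case can be sketched as follows. The file names are placeholders, and the sketch assumes that `sourceFile` and `sinkFile` register their handles with the `ResourceT` layer that `runConduitRes` runs, so both handles are closed when the pipeline finishes or throws.

```haskell
import Conduit

main :: IO ()
main = do
  -- Create a small input file so the example is self-contained.
  writeFile "input.txt" "hello conduit\n"
  -- Stream the file chunk by chunk; runConduitRes releases both
  -- file handles once the pipeline completes or an exception escapes.
  runConduitRes $ sourceFile "input.txt" .| sinkFile "output.txt"
  readFile "output.txt" >>= putStr
```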
Chains sources, transforms, and sinks with the .| operator, so complex data flows read as a single pipeline, for example yieldMany [1..10] .| mapC (* 2) .| sinkList.
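For reference, the pipeline mentioned above can be run as a complete program. This is a small sketch using `runConduitPure`, which suits pipelines with no side effects; the name `doubled` is illustrative.

```haskell
import Conduit

-- Source, transform, and sink joined with .| into one pipeline.
doubled :: [Int]
doubled = runConduitPure $ yieldMany [1..10] .| mapC (* 2) .| sinkList

main :: IO ()
main = print doubled  -- [2,4,6,8,10,12,14,16,18,20]
```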
Integrates side-effecting actions within pipelines while maintaining streaming, shown in the magic function example where effects are interleaved with data processing for efficient resource usage.
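The following sketch is modeled on the magic function example. The effectful step runs once per element as it flows through, and the pipeline stops pulling values as soon as `takeWhileC` rejects one, so later elements are never processed. The `iterMC print` stage and the `pipeline` name are additions for illustration, not part of the original example.

```haskell
import Conduit

-- A side-effecting transformation: logs, then doubles its input.
magic :: Int -> IO Int
magic x = do
  putStrLn ("I'm doing magic with " ++ show x)
  return (x * 2)

pipeline :: IO [Int]
pipeline = runConduit
   $ yieldMany [1 ..]       -- infinite source
  .| takeC 10               -- allow at most ten values through
  .| mapMC magic            -- run the effect per element
  .| takeWhileC (< 18)      -- stop at the first result >= 18
  .| iterMC print           -- print each result as it passes
  .| sinkList

main :: IO ()
main = pipeline >>= print
```

Because evaluation is driven by the sink, the log lines from `magic` interleave with the printed results instead of all appearing up front, as they would with `mapM` over a list.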
Requires understanding the four type parameters of ConduitT (input, output, base monad, and result), monadic composition, and operators like .|, which can be challenging for newcomers, as highlighted in the terminology and concepts section.
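To make the type parameters concrete, here is a small sketch that annotates each stage of a pipeline. In `ConduitT i o m r`, `i` is the input type, `o` the output type, `m` the base monad, and `r` the result. The names `source`, `doubler`, and `sink` are hypothetical.

```haskell
import Conduit

-- Produces Ints, ignores its input, returns ().
source :: Monad m => ConduitT i Int m ()
source = yieldMany [1..5]

-- Consumes Ints and produces Ints.
doubler :: Monad m => ConduitT Int Int m ()
doubler = mapC (* 2)

-- Consumes Ints, produces nothing, returns the collected list.
sink :: Monad m => ConduitT Int o m [Int]
sink = sinkList

result :: [Int]
result = runConduitPure (source .| doubler .| sink)

main :: IO ()
main = print result  -- [2,4,6,8,10]
```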
Maintains deprecated operators like $=, =$, and =$= for backward compatibility, creating potential confusion and boilerplate when reading older code, as noted in the legacy syntax section.
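Downstream-driven evaluation can lead to unexpected behavior if not carefully managed. For example, takeC only limits how many values the inner consumer may read; if that consumer stops early, the unread values stay in the stream for whatever runs next. Functions like takeExactlyC, which drain the remainder, are needed to avoid this class of bug.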
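A minimal sketch of the difference, assuming an inner consumer (`return ()`) that reads nothing. The names `withTake` and `withTakeExactly` are illustrative.

```haskell
import Conduit

-- takeC 5 only caps what the inner consumer may read. The inner
-- consumer reads nothing, so all ten values remain for sumC.
withTake :: Int
withTake = runConduitPure $ yieldMany [1..10] .| do
  takeC 5 .| return ()
  sumC

-- takeExactlyC 5 discards whatever the inner consumer left unread,
-- so sumC only sees the values after the first five.
withTakeExactly :: Int
withTakeExactly = runConduitPure $ yieldMany [1..10] .| do
  takeExactlyC 5 (return ())
  sumC

main :: IO ()
main = do
  print withTake         -- 55
  print withTakeExactly  -- 40
```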