A lightweight, super fast C/C++ and Python library for sequence alignment using edit (Levenshtein) distance.
Edlib is a lightweight, high-performance library for calculating edit distance (Levenshtein distance) and finding optimal alignment paths between sequences. It provides exact alignment results using efficient algorithms like Myers's bit-vector approach, making it ideal for applications requiring fast and accurate sequence comparison.
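To give a feel for the bit-vector approach named above, here is a minimal pure-Python sketch of a Myers-style computation of global edit distance. It is an illustration under stated assumptions, not Edlib's actual code: the function name `myers_edit_distance` is hypothetical, and a single arbitrary-width Python integer stands in for the fixed-size machine words that a C/C++ implementation would process block by block.

```python
def myers_edit_distance(pattern, text):
    """Global edit distance using one bit-vector per DP column (sketch)."""
    m = len(pattern)
    if m == 0:
        return len(text)
    mask = (1 << m) - 1   # keeps every vector exactly m bits wide
    high = 1 << (m - 1)   # bit that corresponds to the last pattern row

    # peq[c] has bit i set when pattern[i] == c.
    peq = {}
    for i, c in enumerate(pattern):
        peq[c] = peq.get(c, 0) | (1 << i)

    # pv / mv mark rows whose vertical delta is +1 / -1 in the current column.
    pv, mv, score = mask, 0, m
    for c in text:
        eq = peq.get(c, 0)
        xv = eq | mv
        xh = ((((eq & pv) + pv) ^ pv) | eq) & mask
        ph = mv | (~(xh | pv) & mask)   # rows with horizontal delta +1
        mh = pv & xh                    # rows with horizontal delta -1
        if ph & high:
            score += 1
        elif mh & high:
            score -= 1
        # Shift in +1 at the top: in global mode row 0 grows by one per column.
        ph = ((ph << 1) | 1) & mask
        mh = (mh << 1) & mask
        pv = mh | (~(xv | ph) & mask)
        mv = ph & xv
    return score
```

Each text character costs a constant number of bitwise operations on the whole column, which is where the speedup over a cell-by-cell dynamic program comes from. For example, `myers_edit_distance("kitten", "sitting")` returns 3.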
Edlib is aimed at bioinformaticians, data scientists, and developers working on sequence analysis, text processing, or any domain that needs efficient string alignment and edit distance calculations.
Developers choose Edlib for its speed, small memory footprint, and support for multiple alignment modes, all in an open-source library.
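The small memory footprint is tied to linear-space alignment, which this page credits to Hirschberg's algorithm. The following is a rough pure-Python sketch of that divide-and-conquer idea for unit-cost edit distance. The names `last_row` and `hirschberg` are hypothetical, `-` is assumed not to occur in the inputs because it marks gaps, and none of this is Edlib's own implementation.

```python
def last_row(a, b):
    """Edit distance from a to every prefix of b, keeping only two rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # match or substitute
        prev = cur
    return prev


def hirschberg(a, b):
    """Return an optimal alignment as two equal-length gapped strings."""
    if not a:
        return "-" * len(b), b
    if not b:
        return a, "-" * len(a)
    if len(a) == 1:
        # Place the single character on a matching position if one exists,
        # otherwise substitute it against the first character of b.
        j = max(b.find(a), 0)
        return "-" * j + a + "-" * (len(b) - j - 1), b
    mid = len(a) // 2
    left = last_row(a[:mid], b)               # a[:mid] against prefixes of b
    right = last_row(a[mid:][::-1], b[::-1])  # a[mid:] against suffixes of b
    # Pick the split of b that minimizes the combined cost of both halves.
    split = min(range(len(b) + 1), key=lambda j: left[j] + right[len(b) - j])
    a1, b1 = hirschberg(a[:mid], b[:split])
    a2, b2 = hirschberg(a[mid:], b[split:])
    return a1 + a2, b1 + b2
```

Only score rows of length `len(b) + 1` are ever held in memory, rather than the full matrix, yet the recursion still recovers a complete alignment. Counting the columns where the two returned strings differ gives the edit distance.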
Leverages Myers's bit-vector algorithm and Ukkonen's banded approach for super-fast edit distance calculations, ideal for large sequences.
Uses Hirschberg's algorithm to achieve linear space complexity, handling very large sequences with minimal memory consumption.
Supports global (NW), prefix (SHW), and infix (HW) alignment methods, providing versatility for different matching scenarios like read alignment in bioinformatics.
Allows definition of custom equality rules, such as wildcards or case-insensitive alignment, enabling specialized use cases.
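The difference between the three alignment modes, and the effect of custom equality rules, can be illustrated with a plain dynamic-programming sketch. It is far slower than Edlib and is not how the library is implemented. The function `edit_distance` and its `extra_equal` parameter are hypothetical names chosen for this example.

```python
def edit_distance(query, target, mode="NW", extra_equal=()):
    """Unit-cost edit distance of query against target.

    mode: "NW" (global), "SHW" (prefix) or "HW" (infix).
    extra_equal: pairs of characters that should also count as a match.
    """
    if mode not in ("NW", "SHW", "HW"):
        raise ValueError("mode must be 'NW', 'SHW' or 'HW'")
    pairs = set()
    for a, b in extra_equal:
        pairs.add((a, b))
        pairs.add((b, a))

    n = len(target)
    # Top row: skipping a target prefix is free only in infix mode.
    prev = [0] * (n + 1) if mode == "HW" else list(range(n + 1))
    for i, q in enumerate(query, 1):
        cur = [i]
        for j, t in enumerate(target, 1):
            same = q == t or (q, t) in pairs
            cur.append(min(prev[j] + 1,               # skip a query character
                           cur[j - 1] + 1,            # skip a target character
                           prev[j - 1] + (not same))) # match or substitute
        prev = cur
    # Global mode must consume the whole target; the others may end anywhere.
    return prev[n] if mode == "NW" else min(prev)
```

With query `"ACT"` and target `"GGACT"`, the global and prefix modes both report 2 because the leading `GG` must be paid for, while the infix mode reports 0 because the query occurs inside the target. Passing `extra_equal=[("N", "C")]` makes `N` behave as a wildcard for `C`, so `"ANT"` against `"ACT"` drops from 1 to 0.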
Only supports single-byte characters, making it unsuitable for Unicode strings or alphabets with more than 256 symbols unless the symbols are mapped manually, as noted in the README.
The included aligner application does not work on Windows because it depends on getopt, which limits cross-platform use for command-line users.
Focuses solely on edit distance without support for affine gap penalties or weighted mismatches, which are common in sophisticated bioinformatics pipelines.
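The manual mapping mentioned for the single-byte limitation can be sketched as follows. The helper `map_to_bytes` is a hypothetical name. It assumes the sequences to be compared use at most 256 distinct symbols in total, and that the resulting byte strings are then handed to a byte-oriented aligner.

```python
def map_to_bytes(*sequences):
    """Re-encode sequences so that each distinct symbol becomes one byte."""
    codes = {}
    for seq in sequences:
        for ch in seq:
            if ch not in codes:
                if len(codes) == 256:
                    raise ValueError("more than 256 distinct symbols")
                codes[ch] = len(codes)  # next unused byte value
    # All sequences share one code table, so equal symbols stay equal.
    return [bytes(codes[ch] for ch in seq) for seq in sequences]
```

Because the mapping is one-to-one, the edit distance between the encoded sequences equals the edit distance between the originals. For instance, `map_to_bytes("żółw", "żółć")` yields two 4-byte strings that differ only in their last byte.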