A fast, memory-efficient HTML5 parser with CSS selector support for Crystal, successor to myhtml.
Lexbor is a fast and memory-efficient HTML5 parser with CSS selector support for the Crystal programming language. It is designed as a successor to myhtml, offering improved performance and lower memory usage for parsing and querying HTML documents. The library enables developers to parse HTML, navigate the DOM using CSS selectors, and extract or manipulate data efficiently.
Lexbor suits Crystal developers who need to parse HTML, such as those building web scrapers, crawlers, or tools that process web content. It is also a good fit for developers who are migrating from myhtml and want better performance.
Lexbor stands out for its speed and low memory consumption, outperforming alternatives like libxml2-based parsers. Its familiar API and CSS selector support make it both powerful and easy to use for HTML processing tasks.
Fast HTML5 parser with CSS selectors. This is the successor of myhtml and is expected to be faster and use less memory.
Benchmarks show Lexbor parsing HTML about three times faster than a libxml2-based alternative: 1000 parse iterations take 4.75 seconds versus 14.20 seconds for Crystagiri on a Ryzen 3800X.
Uses little memory, consuming only 12.3 MiB in tests compared to 398.4 MiB for Nokogiri, which suits memory-sensitive workloads such as large-scale web scraping and constrained targets such as embedded systems.
Provides robust CSS selector capabilities for easy DOM traversal, as demonstrated in examples with queries like '#t2 tr td:first-child' for intuitive data extraction.
Exposes a Crystal API, so calls into the parser benefit from compile-time type checks, reducing errors and improving code reliability in Crystal projects.
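The CSS selector query mentioned above can be sketched as follows. This is a minimal sketch, not a definitive usage guide: the sample HTML is made up for illustration, the calls (`Lexbor::Parser.new`, `css`, `inner_text`) follow the basic examples in the project's README, and it assumes the shard is installed and its C extension has been built.

```crystal
require "lexbor"

# Illustrative input only: two tables, of which only "t2" is queried.
html = <<-HTML
  <html>
    <body>
      <table id="t1">
        <tr><td>Hello</td></tr>
      </table>
      <table id="t2">
        <tr><td>123</td><td>other</td></tr>
        <tr><td>foo</td><td>columns</td></tr>
      </table>
    </body>
  </html>
HTML

# Parse the document once; the parser object is the entry point for queries.
parser = Lexbor::Parser.new(html)

# Select the first cell of every row inside the table with id "t2"
# and collect the text content of each matched node.
first_cells = parser.css("#t2 tr td:first-child").map(&.inner_text).to_a

p first_cells
```

The selector limits matches to `#t2`, so the cell in `#t1` is not included even though it is also a first child.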
Relies on the lexbor C library, requiring compilation steps like running 'crystal src/ext/build_ext.cr', which complicates setup and cross-platform deployment compared to pure Crystal shards.
This shard targets Crystal only, so projects written in other languages such as Ruby or Python cannot use it directly, which limits its adoption and ecosystem support.
The README provides only basic examples and lacks comprehensive API documentation, forcing developers to explore source code or community resources for advanced features.