A jQuery-like HTML manipulation and traversal library for Go, built on the net/html parser and the cascadia CSS selector library.
A scalable Java framework for building web crawlers, covering downloading, URL management, content extraction, and persistence.
A Python library and CLI tool for web crawling, scraping, and extracting main text, metadata, and comments from web pages.
A fast, spec-compliant HTML parsing and serialization toolset for Node.js.
A sensible XML and HTML parsing library for iOS and macOS, offering a modern DOM-style API with XPath and CSS query support.
A Rust library for parsing HTML and querying elements using CSS selectors.
A PHP library that converts HTML to Markdown with configurable options for clean, editable output.
A tidyverse package for web scraping in R, inspired by Beautiful Soup and designed for data extraction workflows.