A jQuery-like HTML manipulation and traversal library for Go, built on net/html and cascadia CSS selectors.
A scalable Java framework for building web crawlers, covering downloading, URL management, content extraction, and persistence.
A Python library and CLI tool for web crawling, scraping, and extracting main text, metadata, and comments from web pages.
A fast, spec-compliant HTML parsing and serialization toolset for Node.js.
A sensible XML and HTML parsing library for iOS and macOS, offering a modern DOM-style API with XPath and CSS query support.
A Rust library for parsing HTML and querying elements using CSS selectors.
A PHP library that converts HTML to Markdown with configurable options for clean, editable output.
A tidyverse package for web scraping in R, inspired by Beautiful Soup and designed for data extraction workflows.
A Swift library to build NSAttributedString from HTML-like text with clickable tags, links, hashtags, and mentions.
A lightweight Ruby web crawler and scraper with an elegant DSL for extracting structured data from web pages.
A fast and lightweight XML/HTML parser for Swift with XPath and CSS query support.
A Ruby gem for web scraping that extracts titles, meta tags, links, images, and structured data from URLs.
A Rust library for extracting structured data from HTML documents, designed for web scraping tasks.
A fast Ruby XML parser, object marshaller, and SAX parser designed as a high-performance alternative to Nokogiri and Marshal.
A monorepo of utilities for importing and exporting DraftJS ContentState to and from HTML and Markdown.
F# type providers and utilities for accessing structured data formats (CSV, HTML, JSON, XML) and WorldBank data.
An F# library providing type providers and helpers for accessing CSV, JSON, XML, HTML, and WorldBank data.
A command-line tool to extract data from HTML/XML pages and JSON APIs using CSS, XPath, XQuery, JSONiq, and pattern matching.
A Go package for querying HTML documents using XPath expressions with built-in caching for performance.
A Go package for querying XML, HTML, and JSON documents using XPath expressions.
A Clojure/ClojureScript library that parses HTML into Clojure data structures for analysis, transformation, and serialization.
A Swift headless browser based on WebKit for functional testing and webpage manipulation via JavaScript.
A Rust library for parsing and generating documents across 13+ formats using a unified Common Document Model.
An Elixir library for extracting and curating the primary readable content from webpages.
An XPath/XQuery 3.1 interpreter for Pascal with HTTP/S, JSON, HTML, and web scraping capabilities.
Automatically inserts live Angular components into dynamic strings or HTML structures using selectors or custom patterns.
A library for writing, parsing, and manipulating HTML markup as first-class Elixir data structures.
A webpack loader that enables seamless integration of web components (Polymer, x-tags) with hot code reload support.