A fast and elegant scraping and crawling framework for Go, designed for extracting structured data from websites.
An advanced XSS detection suite that uses context analysis and intelligent payload generation to find vulnerabilities.
A .NET port of the official Node.js Puppeteer API for headless browser automation.
A Python module to bypass Cloudflare's anti-bot page by solving JavaScript challenges using Node.js.
A preconfigured web crawler for backing up websites, producing WARC files with a live dashboard and dynamic ignore patterns.
A tidyverse package for web scraping in R, inspired by Beautiful Soup and designed for data extraction workflows.
A high-level web crawling and scraping framework for Elixir, designed for data extraction and processing.
A standalone Docker container for high-fidelity, browser-based web archiving crawls using Puppeteer and Brave.
A cross-platform website crawler and analyzer for SEO, security, accessibility, and performance optimization, built in Rust.
A Go web scraping framework that extracts structured data from websites using CSS selectors, including JavaScript-rendered pages.
WarcDB is an SQLite-based file format that makes web crawl data easier to share and query.
A fast, powerful, and extensible web crawling and scraping framework for Go, inspired by Scrapy.
A high-fidelity, user-scriptable archival web crawler using Chrome/Chromium to preserve JavaScript-rendered content.