An Apache Spark framework for efficient data processing, extraction, and derivation from web archives and archival collections.
An open-source toolkit for analyzing web archives at scale using Apache Spark.
A Node.js library for parsing and creating Web ARChive (WARC) files with support for Chrome, Puppeteer, and Electron.