Open-Awesome

Archives Unleashed Notebooks

Apache-2.0 · Jupyter Notebook

Example notebooks for analyzing web archives using the Archives Unleashed Toolkit.

26 stars · 5 forks · 0 contributors

Overview

Example notebooks for working with web archives using the Archives Unleashed Toolkit, and with the derivative files the Toolkit generates.

Quick Stats

Stars: 26
Forks: 5
Contributors: 0
Open Issues: 0
Last commit: 3 years ago
Created: 2019

Tags

#web-archives #spark #research-tools #python3 #data-visualization #historical-data #jupyter-notebooks #big-data #data-analysis #notebooks #digital-humanities

Built With

Jupyter
Python
Apache Spark

Included in

Web Archiving · 2.5k
Auto-fetched 13 hours ago

Related Projects

ArchiveSpark

An Apache Spark framework for easy data processing, extraction, and derivation for web archives and archival collections, developed at the Internet Archive.

Stars: 161
Forks: 19
Last commit: 8 months ago

Archives Unleashed Toolkit

The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.

Stars: 158
Forks: 34
Last commit: 6 months ago

Common Crawl Jupyter notebooks

Various Jupyter notebooks about Common Crawl data

Stars: 66
Forks: 11
Last commit: 7 months ago

Common Crawl Columnar Index

SQL-queryable index, with CDX info plus language classification. (Stable)

Stars: 0
Forks: 0
Last commit