Open-Awesome

Obelisk

MIT · Go · v0.91

A Go package and CLI tool that saves web pages as single HTML files with all assets embedded.

313 stars · 26 forks · 0 contributors

What is Obelisk?

Obelisk is a Go package and command-line tool that archives web pages by downloading all their assets and embedding them into a single HTML file. It solves the problem of preserving web content in a portable, offline-friendly format that is easy to share and store. The tool is designed for speed, using concurrent downloads, and supports archiving pages that require authentication via cookies.

Target Audience

Developers and archivists who need to save web pages for offline use, documentation, or content preservation, especially those working in Go ecosystems or requiring programmatic access to web archival.

Value Proposition

Obelisk offers a fast, self-contained archival solution with concurrent downloads and cookie support, producing cleaner HTML output than some alternatives by inlining scripts and styles instead of relying solely on base64 encoding.

Overview

Go package and CLI tool for saving a web page as a single HTML file.

Use Cases

Best For

  • Archiving web articles or documentation for offline reading
  • Saving authenticated pages (e.g., behind paywalls or logins) using cookie files
  • Generating portable HTML snapshots for legal or compliance records
  • Embedding web archival functionality into Go applications
  • Batch processing multiple URLs from a text file via CLI
  • Creating self-contained HTML files without external dependencies

Not Ideal For

  • Archiving JavaScript-heavy single-page applications that require client-side rendering after load
  • Large-scale distributed web crawling projects needing built-in rate limiting and CAPTCHA handling
  • Compliance archiving requiring standard formats like WARC for interoperability with other tools

Pros & Cons

Pros

Fast Concurrent Downloads

Downloads assets concurrently with a configurable limit. The README highlights this as a significant speedup over sequential downloading, particularly for complex pages.
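The idea of limit-bounded concurrent downloading can be sketched with a buffered channel used as a semaphore. This is a generic illustration, not Obelisk's actual implementation: `fetchAll`, its `limit` parameter, and the stand-in `fetch` callback are hypothetical names chosen for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchAll calls fetch for every URL, running at most limit calls at once.
// Results are returned in the same order as urls.
// Hypothetical helper for illustration; not part of Obelisk's API.
func fetchAll(urls []string, limit int, fetch func(string) string) []string {
	if limit < 1 {
		limit = 1 // an unbuffered semaphore would block forever
	}
	results := make([]string, len(urls))
	sem := make(chan struct{}, limit) // counting semaphore
	var wg sync.WaitGroup
	for i, u := range urls {
		wg.Add(1)
		sem <- struct{}{} // waits while limit fetches are already in flight
		go func(i int, u string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			results[i] = fetch(u)
		}(i, u)
	}
	wg.Wait()
	return results
}

func main() {
	urls := []string{"a.css", "b.png", "c.js"}
	// Stand-in for a real HTTP download.
	fake := func(u string) string { return "fetched " + u }
	for _, r := range fetchAll(urls, 2, fake) {
		fmt.Println(r)
	}
}
```

Each goroutine writes to its own index of `results`, so no extra locking is needed, and the semaphore keeps the number of simultaneous fetches at or below the limit.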

Self-Contained Output

Embeds all external resources like CSS, images, and JavaScript into a single HTML5 file using base64 data URLs or inline tags, ensuring archives are portable and offline-viewable without dependencies.
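As a rough sketch of the two embedding styles mentioned above, the following uses only the Go standard library. The helper names `dataURL` and `inlineStyle` are hypothetical and do not come from Obelisk; the code only shows what a base64 data URL and an inline tag look like.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// dataURL encodes binary content as a base64 data URL that can replace
// an external src or href attribute. Hypothetical helper, for illustration.
func dataURL(mimeType string, content []byte) string {
	return "data:" + mimeType + ";base64," + base64.StdEncoding.EncodeToString(content)
}

// inlineStyle wraps CSS text in a <style> tag so it can replace a
// <link rel="stylesheet"> element. Hypothetical helper, for illustration.
func inlineStyle(css string) string {
	return "<style>" + css + "</style>"
}

func main() {
	fmt.Println(dataURL("text/plain", []byte("hi"))) // data:text/plain;base64,aGk=
	fmt.Println(inlineStyle("body{margin:0}"))       // <style>body{margin:0}</style>
}
```

Text resources such as CSS stay readable when inlined as tags, while binary resources such as images need the data URL form.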

Cookie Authentication Support

Accepts Netscape cookie files via the --load-cookies flag, enabling archiving of pages behind logins or paywalls, a key feature for accessing restricted content.
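For readers unfamiliar with the Netscape cookie file format, it is a plain-text file with seven tab-separated fields per line: domain, subdomain flag, path, secure flag, expiry as a Unix timestamp, name, and value. The parser below is a minimal sketch of reading that format with the standard library. It is an assumption about how such a file could be consumed, not Obelisk's own code, and `parseNetscapeCookies` is a hypothetical name.

```go
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"strconv"
	"strings"
	"time"
)

// parseNetscapeCookies reads Netscape-format cookie text and returns the
// cookies it contains. Malformed lines are skipped.
// Hypothetical helper for illustration; not part of Obelisk's API.
func parseNetscapeCookies(text string) []*http.Cookie {
	var cookies []*http.Cookie
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := sc.Text()
		// Some exporters mark HttpOnly cookies with this prefix.
		httpOnly := strings.HasPrefix(line, "#HttpOnly_")
		line = strings.TrimPrefix(line, "#HttpOnly_")
		if strings.TrimSpace(line) == "" || strings.HasPrefix(line, "#") {
			continue // blank line or comment
		}
		f := strings.Split(line, "\t")
		if len(f) != 7 {
			continue // not a valid cookie line
		}
		c := &http.Cookie{
			Domain:   f[0],
			Path:     f[2],
			Secure:   f[3] == "TRUE",
			Name:     f[5],
			Value:    f[6],
			HttpOnly: httpOnly,
		}
		// An expiry of 0 denotes a session cookie; leave Expires unset.
		if sec, err := strconv.ParseInt(f[4], 10, 64); err == nil && sec > 0 {
			c.Expires = time.Unix(sec, 0)
		}
		cookies = append(cookies, c)
	}
	return cookies
}

func main() {
	sample := "# Netscape HTTP Cookie File\n" +
		".example.com\tTRUE\t/\tTRUE\t0\tsession\tabc123\n"
	for _, c := range parseNetscapeCookies(sample) {
		fmt.Printf("%s=%s for %s\n", c.Name, c.Value, c.Domain)
	}
}
```

Browser extensions and tools such as curl and wget can export cookies in this format, which is why it is a common interchange choice for command-line archivers.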

Security by Default

Disables external requests via Content Security Policy by default, enhancing security and ensuring archives are truly self-contained without needing manual configuration.
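One common way to enforce this kind of restriction in a saved file is a Content Security Policy `<meta>` tag in the document head. The sketch below shows the general mechanism only. The policy string is an illustrative assumption that allows inline and `data:` resources and nothing else; it is not necessarily the policy Obelisk emits, and `withCSP` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// withCSP inserts a Content-Security-Policy meta tag right after the first
// <head> tag. The policy is assumed to be a trusted constant.
// Hypothetical helper for illustration; not part of Obelisk's API.
func withCSP(html, policy string) string {
	meta := `<meta http-equiv="Content-Security-Policy" content="` + policy + `">`
	return strings.Replace(html, "<head>", "<head>"+meta, 1)
}

func main() {
	// Assumed example policy: only inline and data: resources may load.
	policy := "default-src 'unsafe-inline' data:"
	fmt.Println(withCSP("<html><head></head><body></body></html>", policy))
}
```

With a policy like this, a browser opening the archive refuses to fetch any resource over the network, so a missed external reference fails visibly instead of silently loading from the live site.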

Cons

Limited Dynamic Content Handling

By default, JavaScript is disabled and resources are embedded statically, which may break interactive elements or fail to capture content that loads dynamically via client-side rendering.

File Size Bloat for Media-Rich Pages

Embedding large images or videos as base64 data URLs can produce very large HTML files, which may make storage and sharing impractical for media-heavy sites, even though scripts and styles are inlined as plain text.
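The size cost of base64 itself is easy to quantify: every 3 bytes of binary data become 4 ASCII characters. The short calculation below uses Go's standard `encoding/base64` package; the 3 MB image size is a made-up figure for illustration.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// encodedSize reports how many bytes rawBytes of binary data occupy once
// base64-encoded with standard padding.
func encodedSize(rawBytes int) int {
	return base64.StdEncoding.EncodedLen(rawBytes)
}

func main() {
	raw := 3_000_000 // a hypothetical 3 MB image
	enc := encodedSize(raw)
	growth := 100 * float64(enc-raw) / float64(raw)
	fmt.Printf("%d bytes raw -> %d bytes embedded (%.0f%% larger)\n", raw, enc, growth)
}
```

Embedded media therefore takes roughly one third more space than the original files, before counting the rest of the page.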

Go-Centric Ecosystem

Primarily a Go package, so integration into non-Go environments relies on the CLI tool, which may lack the flexibility or advanced features of native solutions in other programming languages.

Quick Stats

Stars: 313
Forks: 26
Contributors: 0
Open Issues: 8
Last commit: 3 months ago
Created: 2020

Tags

#hacktoberfest #go-library #cli-tool #archive #golang #cli #web-archiving #offline-browsing #concurrent-downloads #go

Built With

Go

Included in

Web Archiving (2.5k)
Auto-fetched 1 day ago

Related Projects

ArchiveBox

🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...

Stars: 27,580
Forks: 1,524
Last commit: 1 day ago

monolith

⬛️ CLI tool and library for saving complete web pages as a single HTML file

Stars: 15,130
Forks: 454
Last commit: 6 days ago

DiskerNet

💾 dn - offline full-text search and archiving for your Chromium-based browser.

Stars: 3,904
Forks: 148
Last commit: 2 months ago

Heritrix

Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.

Stars: 3,228
Forks: 789
Last commit: 6 days ago