Question 1

How does Scoop capture videos from web pages?

Accepted Answer

Scoop can optionally capture videos as attachments. It runs yt-dlp outside the browser and saves each video together with its subtitles and metadata. The feature requires python3 and can be disabled.

Question 2

Scoop vs Browsertrix: which is better for web archiving?

Accepted Answer

Scoop excels at high-fidelity single-page captures with provenance and signing, while Browsertrix is geared towards scalable, multi-page crawls. Choose Scoop when per-page detail matters and Browsertrix when breadth matters.

Question 3

How to sign WACZ files with Scoop for authenticity?

Accepted Answer

Use the --signing-url CLI option, or, when using Scoop as a library, pass a signing endpoint and token to Scoop.toWACZ(). Both routes implement the WACZ Signing specification, so the resulting archive can be cryptographically verified.
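
As a rough illustration of the CLI route, the sketch below assembles the argument list for a signed capture. The helper buildSignedCaptureArgs and the example URLs are hypothetical. The --signing-url flag comes from the answer above, while --signing-token is an assumed flag name for the token and should be checked against Scoop's CLI help before use.

```typescript
// Hypothetical helper: builds the argument list for a signed Scoop capture.
interface SigningOptions {
  signingUrl: string;    // endpoint of the signing server
  signingToken?: string; // optional access token (flag name is an assumption)
}

function buildSignedCaptureArgs(pageUrl: string, opts: SigningOptions): string[] {
  const args = ["scoop", pageUrl, "--signing-url", opts.signingUrl];
  if (opts.signingToken !== undefined) {
    // Assumption: the token is passed with --signing-token.
    args.push("--signing-token", opts.signingToken);
  }
  return args;
}

// Example with placeholder URLs; prints the command line that would be run.
const signedArgs = buildSignedCaptureArgs("https://example.com", {
  signingUrl: "https://signer.example.com/sign",
});
console.log(["npx"].concat(signedArgs).join(" "));
```

Building the arguments as an array rather than a single string avoids shell quoting problems if the list is later handed to a process spawner.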

Question 4

Can Scoop handle websites with heavy JavaScript?

Accepted Answer

Yes. Scoop captures pages through a Chromium browser, so dynamic JavaScript content is rendered as it would be for a visitor. Complex pages may need longer timeouts to avoid incomplete captures.
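
One way to picture the timeout adjustment is the sketch below, which appends timeout flags to a capture command. Everything here is an assumption rather than confirmed Scoop behavior: the helper buildTimeoutArgs is hypothetical, and the flag names --capture-timeout and --load-timeout, as well as the millisecond unit, should be verified against Scoop's CLI help.

```typescript
// Hypothetical helper: appends timeout flags to a Scoop command.
// Assumption: the flag names and the millisecond unit match Scoop's CLI.
interface TimeoutOptions {
  captureTimeoutMs?: number; // overall budget for the capture
  loadTimeoutMs?: number;    // budget for the initial page load
}

function buildTimeoutArgs(pageUrl: string, t: TimeoutOptions): string[] {
  const args = ["scoop", pageUrl];
  if (t.captureTimeoutMs !== undefined) {
    args.push("--capture-timeout", String(t.captureTimeoutMs));
  }
  if (t.loadTimeoutMs !== undefined) {
    args.push("--load-timeout", String(t.loadTimeoutMs));
  }
  return args;
}

// Example: give a script-heavy page a larger overall budget.
console.log(buildTimeoutArgs("https://example.com", { captureTimeoutMs: 120000 }).join(" "));
```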

Question 5

What system dependencies does Scoop need to run?

Accepted Answer

Scoop requires Node.js 18+ and Playwright's Chromium dependencies, which can be installed via npx playwright install-deps. It also recommends curl and python3 for features such as video capture.
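
The Node.js requirement can be checked before running a capture. The following is a minimal sketch; meetsNodeRequirement is a hypothetical helper, not part of Scoop, and it only compares the major version.

```typescript
// Hypothetical helper: checks a Node.js version string against a minimum major version.
function meetsNodeRequirement(version: string, minMajor: number = 18): boolean {
  // Accepts both "v18.17.0" and "18.17.0".
  const match = /^v?(\d+)\./.exec(version.trim());
  if (match === null) {
    return false; // unparseable input is treated as unsupported
  }
  return parseInt(match[1], 10) >= minMajor;
}

console.log(meetsNodeRequirement("v18.17.0")); // true
console.log(meetsNodeRequirement("v16.20.2")); // false
```

In a Node.js script, the string to test would typically be process.version.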

Question 6

How to run Scoop in headful mode on a server?

Accepted Answer

Wrap the Scoop command in xvfb-run, which provides a virtual display, and pass --headless false to Scoop, as noted in the FAQ. This adds complexity and may reduce performance compared with headless mode.
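
The command described above could be assembled as in the sketch below. The helper buildHeadfulCommand and the example URL are hypothetical, and invoking Scoop through npx scoop is an assumption about how it is installed. The --auto-servernum option asks xvfb-run to pick a free display number.

```typescript
// Hypothetical helper: wraps a Scoop invocation in xvfb-run for headful capture.
function buildHeadfulCommand(pageUrl: string): string[] {
  return [
    "xvfb-run",
    "--auto-servernum", // let xvfb-run choose a free display number
    "--",               // everything after this is the wrapped command
    "npx", "scoop", pageUrl,
    "--headless", "false",
  ];
}

console.log(buildHeadfulCommand("https://example.com").join(" "));
```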

Question 7

Is Scoop good for archiving social media pages?

Accepted Answer

Partly. Scoop can capture a single page with high fidelity, but it may struggle with infinite scroll or login-walled content, and resource limits might truncate large, media-heavy captures.
