A Ruby gem for web scraping that extracts titles, meta tags, links, images, and structured data from URLs.
MetaInspector is a Ruby gem for web scraping that extracts structured data from web pages. Given a URL, it returns the page's title, meta tags, links, images, and other metadata. It also handles common scraping problems such as timeouts, redirects, and encoding issues.
It is aimed at Ruby developers who need to programmatically extract metadata, links, or images from websites for SEO analysis, content aggregation, or data mining projects.
Developers choose MetaInspector for its clean API, comprehensive feature set, and robust error handling, making it a reliable and easy-to-use alternative to building custom scrapers from scratch.
Ruby gem for web scraping purposes. It scrapes a given URL, and returns you its title, meta description, meta keywords, links, images...
Extracts a wide range of metadata including titles, Open Graph tags, author, and charset, providing a unified interface for SEO and social media analysis without manual parsing.
Encapsulates common scraping errors like timeouts and request failures into specific exceptions such as MetaInspector::TimeoutError, making failure handling more predictable and graceful.
Supports customizable timeouts, retries, redirect handling, and Faraday integration for advanced HTTP settings, allowing fine-tuned control over web requests.
Automatically normalizes URLs using the Addressable gem and can strip known tracking parameters, ensuring clean and consistent URL processing for scraping workflows.
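The request-handling ideas described above (specific exception classes, retries on timeout, and URL cleanup) can be sketched with the Ruby standard library alone. This is not MetaInspector's implementation and does not call the gem: the `ScrapeSketch` module, its error classes, the `fetch` helper, and the list of tracking parameters are all hypothetical names chosen for illustration, and stdlib `URI` stands in for the Addressable gem.

```ruby
require "uri"
require "socket"   # defines SocketError
require "timeout"  # defines Timeout::Error

# Hypothetical module for this sketch; not part of MetaInspector.
module ScrapeSketch
  class Error < StandardError; end
  class TimeoutError < Error; end
  class RequestError < Error; end

  # Assumed list of tracking parameters to drop.
  TRACKING_PARAMS = %w[utm_source utm_medium utm_campaign utm_term utm_content].freeze

  module_function

  # Lowercase the scheme and host, ensure a path, and remove tracking parameters.
  def clean_url(raw)
    uri = URI.parse(raw.strip).normalize
    if uri.query
      kept = URI.decode_www_form(uri.query)
                .reject { |name, _value| TRACKING_PARAMS.include?(name) }
      uri.query = kept.empty? ? nil : URI.encode_www_form(kept)
    end
    uri.to_s
  end

  # Yield the cleaned URL to the block. Timeouts are retried up to `retries`
  # extra times; low-level failures are re-raised as this sketch's own errors.
  def fetch(raw_url, retries: 3)
    url = clean_url(raw_url)
    attempts = 0
    begin
      attempts += 1
      yield url
    rescue Timeout::Error => e
      retry if attempts <= retries
      raise TimeoutError, "gave up on #{url} after #{attempts} attempts: #{e.message}"
    rescue SocketError, SystemCallError => e
      raise RequestError, "request to #{url} failed: #{e.message}"
    end
  end
end

# Usage: the block stands in for a real HTTP call and times out twice.
calls = 0
result = ScrapeSketch.fetch("HTTP://Example.COM/post?id=7&utm_source=news", retries: 2) do |url|
  calls += 1
  raise Timeout::Error, "simulated timeout" if calls < 3
  "fetched #{url}"
end
puts result # prints: fetched http://example.com/post?id=7
```

Keeping the HTTP call in a block means the cleanup and retry logic can be exercised without network access. A caller then rescues `ScrapeSketch::TimeoutError` or `ScrapeSketch::RequestError` rather than a mix of low-level exceptions, which is the same benefit the gem's own exception classes are described as providing.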
Relies solely on static HTML parsing with Nokogiri, so it cannot scrape content generated or modified by JavaScript, limiting effectiveness on modern dynamic websites like SPAs.
Features like image size detection use the fastimage gem to download parts of images, adding network latency and slowing down scraping when enabled, especially for pages with many images.
Depends on multiple external gems like Nokogiri and Faraday, which increases a project's footprint and the risk of compatibility issues, and can be overkill for simple scraping tasks.
Mechanize is a Ruby library that makes automated web interaction easy.
A batteries-included framework for easy web-scraping. Just add CSS! (Or do more.)
Lightweight Ruby web crawler/scraper with an elegant DSL which extracts structured data from pages.
Write web scrapers in Ruby using a clean, AI-assisted DSL. Kimurai uses AI to figure out where the data lives, then caches the selectors and scrapes with pure Ruby. Get the intelligence of an LLM without the per-request latency or token costs.