An open-source web crawler and scraper that converts web content into clean, LLM-ready Markdown for RAG, agents, and data pipelines.
Crawl4AI is an open-source web crawler and scraper that transforms web pages into clean, LLM-ready Markdown. It extracts and structures web data for AI applications such as RAG systems, agents, and data pipelines, offering fast, controllable, and scalable extraction without relying on proprietary APIs.
Developers and data engineers building AI-powered applications that require reliable web data extraction, such as RAG systems, AI agents, research tools, and data pipelines.
Developers choose Crawl4AI for its LLM-optimized output, full control over the crawling process (proxies, sessions, anti-bot evasion), and the ability to self-host without rate limits or vendor lock-in. That combination makes it a cost-effective alternative to commercial scraping services.
🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN
Generates clean, structured Markdown with citations, tables, and code blocks specifically designed for direct AI consumption, reducing post-processing effort.
Integrates Playwright for dynamic content handling, with proxy support, session persistence, and anti-bot detection features, as demonstrated in the browser config examples.
Offers Dockerized setup with FastAPI server and cloud-ready configurations, enabling easy self-hosting and scalable deployments without vendor lock-in.
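A self-hosting sketch, assuming the project's published Docker image and default port; the image tag, port, and health endpoint below are taken as assumptions from the project's Docker docs and should be verified against the current README.

```shell
# Pull and run the Crawl4AI server image (image name, port 11235, and the
# /health endpoint are assumptions; check the project's Docker docs).
docker pull unclecode/crawl4ai:latest
docker run -d -p 11235:11235 --name crawl4ai unclecode/crawl4ai:latest

# The bundled FastAPI server should then answer on the mapped port:
curl http://localhost:11235/health
```

Running the server in a container keeps browser dependencies isolated from the host, which is what makes the cloud-ready deployment story practical.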
Supports both CSS-based and LLM-driven extraction to pull structured JSON, with customizable schemas and chunking strategies for diverse data needs.
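To make the CSS-based path concrete, here is a minimal sketch of an extraction schema in the dict format Crawl4AI's docs use for CSS extraction strategies; the selectors and field names are hypothetical, not taken from any real site.

```python
# Hypothetical schema for CSS-based JSON extraction. The selectors and
# field names are illustrative; the structure (name / baseSelector /
# fields) follows the pattern shown in Crawl4AI's extraction docs.
article_schema = {
    "name": "articles",              # label for the extracted record set
    "baseSelector": "article.post",  # each match yields one JSON record
    "fields": [
        {"name": "title", "selector": "h2.title", "type": "text"},
        {"name": "link", "selector": "a.permalink",
         "type": "attribute", "attribute": "href"},
        {"name": "summary", "selector": "p.excerpt", "type": "text"},
    ],
}

# In Crawl4AI this dict would typically be passed to a CSS extraction
# strategy (e.g. JsonCssExtractionStrategy(article_schema)) attached to a
# crawler run config; the crawl then returns one JSON object per matched
# element, with no LLM call required.
```

The LLM-driven alternative trades this fixed schema for a natural-language instruction, which is slower and costs tokens but handles pages whose structure varies.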
Requires a manual browser installation step (e.g., Playwright's Chromium) and has faced dependency security issues, such as the litellm supply-chain compromise that required an immediate upgrade.
Relies on full browser instances for dynamic content, which consumes significant memory and CPU compared to lightweight HTTP clients, impacting scalability for high-frequency tasks.
The README itself notes that a major documentation overhaul is pending, so the docs may contain gaps or outdated information, steepening the learning curve for new users.