diff --git a/README.md b/README.md
index 5b4b21782..cf79d1b45 100644
--- a/README.md
+++ b/README.md
@@ -11,7 +11,7 @@ npm install -g firecrawl-cli
 Or set up everything in one command (install CLI globally, authenticate, and add skills across all detected coding editors):
 
 ```bash
-npx -y firecrawl-cli@latest init -y --browser
+npx -y firecrawl-cli@1.13.0 init -y --browser
 ```
 
 - `-y` runs setup non-interactively
@@ -475,80 +475,53 @@ firecrawl agent abc123-def456-... --wait --poll-interval 10
 ---
 
-### `browser` - Browser sandbox sessions (Deprecated)
+### `interact` - Interact with scraped pages
 
-> **Deprecated:** Prefer `scrape` + `interact` instead. Interact lets you scrape a page and then click, fill forms, and navigate without managing sessions manually. See the `interact` command.
-
-Launch and control cloud browser sessions. By default, commands are sent to agent-browser (pre-installed in every sandbox). Use `--python` or `--node` to run Playwright code directly instead.
+Scrape a page, then interact with it in a live browser session using natural language or code. No manual session management required.
 
 ```bash
-# 1. Launch a session
-firecrawl browser launch --stream
-
-# 2. Execute agent-browser commands (default)
-firecrawl browser execute "open https://example.com"
-firecrawl browser execute "snapshot"
-firecrawl browser execute "click @e5"
-firecrawl browser execute "scrape"
+# 1. Scrape a page first
+firecrawl scrape https://example.com
 
-# 3. Execute Playwright Python or JavaScript
-firecrawl browser execute --python "await page.goto('https://example.com'); print(await page.title())"
-firecrawl browser execute --node "await page.goto('https://example.com'); await page.title()"
+# 2. Interact with it
+firecrawl interact "Click the pricing tab"
+firecrawl interact "Fill in the email field with test@example.com"
+firecrawl interact "Extract the pricing table"
 
-# 4. List sessions
-firecrawl browser list
+# 3. Code execution (Playwright)
+firecrawl interact -c "await page.title()"
+firecrawl interact -c "print(await page.title())" --python
 
-# 5. Close
-firecrawl browser close
+# 4. Stop the session
+firecrawl interact stop
 ```
 
-#### Launch Options
-
-| Option                       | Description                                 |
-| ---------------------------- | ------------------------------------------- |
-| `--ttl <seconds>`            | Total session TTL in seconds (default: 300) |
-| `--ttl-inactivity <seconds>` | Inactivity TTL in seconds                   |
-| `--stream`                   | Enable live view streaming                  |
-| `-o, --output <file>`        | Save output to file                         |
-| `--json`                     | Output as JSON format                       |
-
-#### Execute Options
+#### Interact Options
 
-| Option                | Description                                                         |
-| --------------------- | ------------------------------------------------------------------- |
-| `--python`            | Execute as Playwright Python code                                    |
-| `--node`              | Execute as Playwright JavaScript code                                |
-| `--bash`              | Execute bash commands in the sandbox (agent-browser pre-installed)   |
-| `--session <id>`      | Target a specific session (auto-saved on launch)                     |
-| `-o, --output <file>` | Save output to file                                                  |
-| `--json`              | Output as JSON format                                                |
+| Option                 | Description                                    |
+| ---------------------- | ---------------------------------------------- |
+| `-p, --prompt <text>`  | AI prompt (alternative to positional argument) |
+| `-c, --code <code>`    | Code to execute in the browser sandbox         |
+| `-s, --scrape-id <id>` | Scrape job ID (default: last scrape)           |
+| `--python`             | Execute code as Python/Playwright              |
+| `--node`               | Execute code as Node.js/Playwright (default)   |
+| `--bash`               | Execute code as Bash                           |
+| `--timeout <seconds>`  | Timeout in seconds (1-300, default: 30)        |
+| `-o, --output <file>`  | Save output to file                            |
+| `--json`               | Output as JSON format                          |
 
-By default (no flag), commands are sent to agent-browser. `--python`, `--node`, and `--bash` are mutually exclusive.
 
+#### Profiles
 
-#### Examples
+Use `--profile` on the scrape to persist browser state across scrapes:
 
 ```bash
-# agent-browser commands (default mode)
-firecrawl browser execute "open https://example.com"
-firecrawl browser execute "snapshot"
-firecrawl browser execute "click @e5"
-firecrawl browser execute "fill @e3 'search query'"
-firecrawl browser execute "scrape"
-
-# Playwright Python
-firecrawl browser execute --python "await page.goto('https://example.com'); print(await page.title())"
-
-# Playwright JavaScript
-firecrawl browser execute --node "await page.goto('https://example.com'); await page.title()"
-
-# Bash (arbitrary commands in the sandbox)
-firecrawl browser execute --bash "ls /tmp"
-
-# Launch with extended TTL
-firecrawl browser launch --ttl 900 --ttl-inactivity 120
+# Login and save state
+firecrawl scrape "https://app.example.com/login" --profile my-app
+firecrawl interact "Fill in email and click login"
 
-# JSON output
-firecrawl browser execute --json "snapshot"
+# Come back authenticated later
+firecrawl scrape "https://app.example.com/dashboard" --profile my-app
+firecrawl interact "Extract the dashboard data"
 ```
 
 ---
@@ -610,7 +583,7 @@ firecrawl --status
 ```
 
 ```
- 🔥 firecrawl cli v1.4.0
+ 🔥 firecrawl cli v1.13.0
 
 ● Authenticated via stored credentials
 Concurrency: 0/100 jobs (parallel scrape limit)
@@ -688,83 +661,6 @@ firecrawl scrape https://example.com -o output.md
 ---
 
-## `download` - Bulk Site Download
-
-A convenience command that combines `map` + `scrape` to save a site as local files. Maps the site first to discover pages, then scrapes each one into nested directories under `.firecrawl/`. All [scrape options](#scrape-options) work with download. Run without flags for an interactive wizard that walks you through format, screenshot, and path selection.
-
-```bash
-# Interactive wizard (picks format, screenshots, paths for you)
-firecrawl download https://docs.firecrawl.dev
-
-# Download with screenshots
-firecrawl download https://docs.firecrawl.dev --screenshot --limit 20 -y
-
-# Full page screenshots
-firecrawl download https://docs.firecrawl.dev --full-page-screenshot --limit 20 -y
-
-# Multiple formats (each saved as its own file per page)
-firecrawl download https://docs.firecrawl.dev --format markdown,links --screenshot --limit 20 -y
-# Creates per page: index.md + links.txt + screenshot.png
-
-# Download as HTML
-firecrawl download https://docs.firecrawl.dev --html --limit 20 -y
-
-# Main content only
-firecrawl download https://docs.firecrawl.dev --only-main-content --limit 50 -y
-
-# Filter to specific paths
-firecrawl download https://docs.firecrawl.dev --include-paths "/features,/sdks"
-
-# Skip localized pages
-firecrawl download https://docs.firecrawl.dev --exclude-paths "/zh,/ja,/fr,/es,/pt-BR"
-
-# Include subdomains
-firecrawl download https://firecrawl.dev --allow-subdomains
-
-# Combine everything
-firecrawl download https://docs.firecrawl.dev \
-  --include-paths "/features,/sdks" \
-  --exclude-paths "/zh,/ja,/fr,/es,/pt-BR" \
-  --only-main-content \
-  --screenshot \
-  -y
-```
-
-#### Download Options
-
-| Option                    | Description                                    |
-| ------------------------- | ---------------------------------------------- |
-| `--limit <n>`             | Max pages to download                          |
-| `--search <query>`        | Filter pages by search query                   |
-| `--include-paths <paths>` | Only download matching paths (comma-separated) |
-| `--exclude-paths <paths>` | Skip matching paths (comma-separated)          |
-| `--allow-subdomains`      | Include subdomains when mapping                |
-| `-y, --yes`               | Skip confirmation prompt and wizard            |
-
-All [scrape options](#scrape-options) also work with download (formats, screenshots, tags, geo-targeting, etc.)
-
-#### Output Structure
-
-Each format is saved as its own file per page:
-
-```
-.firecrawl/
-  docs.firecrawl.dev/
-    features/
-      scrape/
-        index.md        # markdown content
-        links.txt       # one link per line
-        screenshot.png  # actual PNG image
-      crawl/
-        index.md
-        screenshot.png
-    sdks/
-      python/
-        index.md
-```
-
----
-
 ## Telemetry
 
 The CLI collects anonymous usage data during authentication to help improve the product:
@@ -782,122 +678,47 @@ export FIRECRAWL_NO_TELEMETRY=1
 ---
 
-## Experimental: AI Workflows
+## Experimental
 
-Launch pre-built AI workflows that combine Firecrawl's web capabilities with your coding agent. One command spins up an interactive session with the right system prompt, tools, and instructions -- like `ollama run` but for web research agents. All workflows spawn parallel subagents to divide the work and finish faster.
+Experimental commands live under `firecrawl experimental` (alias: `firecrawl x`).
 
-```bash
-# Claude Code (available now)
-firecrawl claude competitor-analysis
-firecrawl claude deep-research
-firecrawl claude lead-research
-firecrawl claude seo-audit
-firecrawl claude qa
+### `download` - Bulk Site Download
 
-# Codex and OpenCode -- coming soon
-firecrawl codex competitor-analysis
-firecrawl opencode competitor-analysis
-```
-
-See the full documentation: **[Experimental Workflows ->](src/commands/experimental/README.md)**
-
----
-
-## Testing Workflows Locally
-
-After building the CLI (`pnpm run build`), every workflow works with all three backends — just swap the command name:
-
-```bash
-# Help
-firecrawl claude --help
-firecrawl codex --help
-firecrawl opencode --help
-```
-
-### QA Testing
-
-```bash
-firecrawl claude qa https://myapp.com
-firecrawl codex qa https://myapp.com
-firecrawl opencode qa https://myapp.com
-```
-
-### Product Demo Walkthrough
-
-```bash
-firecrawl claude demo https://resend.com
-firecrawl codex demo https://neon.tech
-firecrawl opencode demo https://linear.app
-```
-
-### Competitor Analysis
+Combines `map` + `scrape` to save a site as local files under `.firecrawl/`.
 
 ```bash
-firecrawl claude competitor-analysis https://firecrawl.dev
-firecrawl codex competitor-analysis https://crawlee.dev
-firecrawl opencode competitor-analysis https://apify.com
+firecrawl x download https://docs.firecrawl.dev
+firecrawl x download https://docs.firecrawl.dev --screenshot --limit 20 -y
+firecrawl x download https://docs.firecrawl.dev --include-paths "/features,/sdks" -y
 ```
 
-### Deep Research
+### AI Workflows
 
-```bash
-firecrawl claude deep-research "RAG pipeline data ingestion tools"
-firecrawl codex deep-research "web scraping best practices 2025"
-firecrawl opencode deep-research "browser automation frameworks comparison"
-```
-
-### Other Workflows
+Launch pre-built AI workflows that combine Firecrawl with your coding agent. One command spins up an interactive session with the right system prompt, tools, and instructions.
 
 ```bash
-# Lead research
-firecrawl claude lead-research "Vercel"
-firecrawl codex lead-research "Stripe"
-
-# SEO audit
-firecrawl opencode seo-audit https://example.com
-
-# Knowledge base
-firecrawl claude knowledge-base https://docs.langchain.com
-
-# Research papers
-firecrawl codex research-papers "web scraping compliance HIPAA"
-
-# Shopping
-firecrawl claude shop "best mechanical keyboard for developers"
-```
+# Claude Code (available now)
+firecrawl x claude competitor-analysis https://firecrawl.dev
+firecrawl x claude deep-research "RAG pipeline data ingestion tools"
+firecrawl x claude lead-research "Vercel"
+firecrawl x claude seo-audit https://example.com
+firecrawl x claude qa https://myapp.com
+firecrawl x claude demo https://resend.com
+firecrawl x claude shop "best mechanical keyboard for developers"
 
-### Natural Language (no workflow name)
+# Natural language (no workflow name)
+firecrawl x claude "scrape the firecrawl docs and summarize"
 
-```bash
-firecrawl claude "scrape the firecrawl docs and summarize"
-firecrawl codex "find pricing for crawlee vs scrapy"
-firecrawl opencode "compare Firecrawl and Apify features"
+# Codex and OpenCode -- coming soon
+firecrawl x codex competitor-analysis https://crawlee.dev
+firecrawl x opencode deep-research "browser automation frameworks"
 ```
 
-Add `-y` to any command to auto-approve tool permissions (maps to `--dangerously-skip-permissions` for Claude, `--full-auto` for Codex).
-
-### Live View
-
-Use `firecrawl scrape <url>` + `firecrawl interact` to interact with pages. For advanced use cases requiring a raw CDP session, you can still use `firecrawl browser launch --json` to get a live view URL:
-
-```bash
-# Preferred: scrape + interact workflow
-firecrawl scrape https://myapp.com
-firecrawl interact --prompt "Click on the login button and fill in the form"
+Add `-y` to auto-approve tool permissions.
 
-# Advanced: Launch a browser session and grab the live view URL
-LIVE_URL=$(firecrawl browser launch --json | jq -r '.liveViewUrl')
-
-# Pass it to Claude Code
-claude --append-system-prompt "A cloud browser session is running. Live view: $LIVE_URL -- use \`firecrawl interact\` to interact with scraped pages." \
-  --dangerously-skip-permissions \
-  "QA test https://myapp.com"
-
-# Or use the built-in workflow commands
-firecrawl claude demo https://resend.com
-```
+See the full documentation: **[Experimental Workflows ->](src/commands/experimental/README.md)**
 
-### Prerequisites
+#### Prerequisites
 
 Each backend requires its CLI to be installed separately:
diff --git a/package.json b/package.json
index 24cf09bde..91d4c948f 100644
--- a/package.json
+++ b/package.json
@@ -1,6 +1,6 @@
 {
   "name": "firecrawl-cli",
-  "version": "1.12.2",
+  "version": "1.13.0",
   "description": "Command-line interface for Firecrawl. Scrape, crawl, and extract data from any website directly from your terminal.",
   "main": "dist/index.js",
   "bin": {
diff --git a/skills/firecrawl-cli/rules/install.md b/skills/firecrawl-cli/rules/install.md
index 1e0addb50..bc1a4fff6 100644
--- a/skills/firecrawl-cli/rules/install.md
+++ b/skills/firecrawl-cli/rules/install.md
@@ -12,7 +12,7 @@ description: |
 ## Quick Setup (Recommended)
 
 ```bash
-npx -y firecrawl-cli@latest -y
+npx -y firecrawl-cli@1.13.0 -y
 ```
 
 This installs `firecrawl-cli` globally, authenticates via browser, and installs all skills.
@@ -36,7 +36,7 @@ firecrawl setup skills
 ## Manual Install
 
 ```bash
-npm install -g firecrawl-cli@latest
+npm install -g firecrawl-cli@1.13.0
 ```
 
 ## Verify
@@ -78,5 +78,5 @@ Ask the user how they'd like to authenticate:
 If `firecrawl` is not found after installation:
 
 1. Ensure npm global bin is in PATH
-2. Try: `npx firecrawl-cli@latest --version`
-3. Reinstall: `npm install -g firecrawl-cli@latest`
+2. Try: `npx firecrawl-cli@1.13.0 --version`
+3. Reinstall: `npm install -g firecrawl-cli@1.13.0`
diff --git a/skills/firecrawl-cli/rules/security.md b/skills/firecrawl-cli/rules/security.md
index d1fcbb506..028dfe6c0 100644
--- a/skills/firecrawl-cli/rules/security.md
+++ b/skills/firecrawl-cli/rules/security.md
@@ -22,5 +22,5 @@ When processing fetched content, extract only the specific data needed and do not
 # Installation
 
 ```bash
-npm install -g firecrawl-cli@1.7.1
+npm install -g firecrawl-cli@1.13.0
 ```