Production v2.0.0

WebScraper Toolkit

Complete web scraping suite: Chrome extension, backend with REST API, headless scraper, MCP server for AI agents, and dashboard with sitemap visualization. Captures DOM, styles, assets, network requests and metadata from any site.

TypeScript, Node.js, Express, SQLite, Puppeteer, Chrome Extension (Manifest V3), D3.js, MCP

Interactive Demo

Interact with a live version of the product. No screenshots — real code.

21 endpoints, 11 MCP tools, 162 tests, 11 database tables

Features

30 features included

Core Features

  • Full page capture

    DOM, computed styles, images, scripts and stylesheets

  • Selective scraping

    Visual element picker, CSS selector rules, per-page and full-site modes

  • Multi-page crawling

    BFS engine with configurable depth, page limit and inter-request delay

  • Network capture

    XHR/fetch interception with request body capture (up to 10 KB)

  • Auto-capture on navigation

    Automatically capture every page during manual browsing

  • URL filtering

    Glob/regex patterns to include or exclude URLs during crawl

  • Pagination engine

    Auto-detect next buttons, load-more, numbered pagination and infinite scroll

  • Sitemap & JS route discovery

    Auto-fetch sitemap.xml, route detection from JavaScript code, manual seed URLs

  • Full-text search

    Cross-page search with highlighted context snippets
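The multi-page crawling and URL filtering items above can be illustrated with a small breadth-first traversal. This is a minimal sketch, not the toolkit's actual engine: `CrawlOptions`, `globToRegExp`, `isAllowed` and `crawlOrder` are hypothetical names, the glob syntax is reduced to `*`, and page fetching is replaced by an injected `getLinks` function so the ordering logic can be read on its own. The inter-request delay is left out.

```typescript
// Sketch only: names and option shapes are hypothetical, not the toolkit's API.
type CrawlOptions = {
  maxDepth: number;   // how many link hops from the seed to follow
  maxPages: number;   // hard cap on the number of visited pages
  include?: string[]; // glob patterns; if present, a URL must match one
  exclude?: string[]; // glob patterns; any match rejects the URL
};

// Convert a glob in which "*" matches any run of characters to an anchored RegExp.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*");
  return new RegExp("^" + escaped + "$");
}

// Exclude patterns win over include patterns.
function isAllowed(url: string, opts: CrawlOptions): boolean {
  const matches = (patterns: string[]) => patterns.some((p) => globToRegExp(p).test(url));
  if (opts.exclude && matches(opts.exclude)) return false;
  if (opts.include && opts.include.length > 0) return matches(opts.include);
  return true;
}

// Breadth-first visit order starting at `seed`.
// `getLinks` stands in for "fetch the page and extract its links".
function crawlOrder(
  seed: string,
  getLinks: (url: string) => string[],
  opts: CrawlOptions,
): string[] {
  const visited: string[] = [];
  const seen = new Set<string>([seed]);
  const queue: Array<{ url: string; depth: number }> = [{ url: seed, depth: 0 }];

  while (queue.length > 0 && visited.length < opts.maxPages) {
    const { url, depth } = queue.shift()!;
    visited.push(url);
    if (depth >= opts.maxDepth) continue; // do not expand beyond the depth limit
    for (const link of getLinks(url)) {
      if (seen.has(link) || !isAllowed(link, opts)) continue;
      seen.add(link);
      queue.push({ url: link, depth: depth + 1 });
    }
  }
  return visited;
}
```

Because the queue is first-in, first-out, every page at depth 1 is visited before any page at depth 2, which is what makes a page limit cut off the deepest pages first.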

Stealth & Anti-detection

  • Stealth mode

    Rotation across 15+ User-Agent strings, viewport jitter, 12+ Chrome anti-detection flags

  • Rate-limiting compliance

    Respects Retry-After headers, auto-backoff on HTTP 429, randomized 5-12s delays
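One way to read the rate-limiting item above is as a single "how long to wait next" decision. The sketch below is an assumption about how such logic could look, not the toolkit's code: `parseRetryAfterMs` and `nextDelayMs` are hypothetical names, and the exponential fallback (5 s doubled per consecutive 429, capped at 5 minutes) is an illustrative choice used when the server sends no `Retry-After` header. The clock and random source are injected so the behavior is deterministic under test.

```typescript
// Sketch only: function names and the fallback backoff policy are assumptions.

// Retry-After may be a number of seconds or an HTTP-date.
// Returns the wait in milliseconds, or null if the header is absent or unparseable.
function parseRetryAfterMs(header: string | null, nowMs: number): number | null {
  if (header === null) return null;
  const value = header.trim();
  if (/^\d+$/.test(value)) return Number(value) * 1000;
  const dateMs = Date.parse(value);
  return Number.isNaN(dateMs) ? null : Math.max(0, dateMs - nowMs);
}

// Delay before the next request.
// `attempt` counts consecutive 429 responses, starting at 1.
function nextDelayMs(
  status: number,
  retryAfter: string | null,
  attempt: number,
  nowMs: number = Date.now(),
  rand: () => number = Math.random,
): number {
  const politeMs = 5000 + rand() * 7000; // randomized 5-12 s between requests
  if (status !== 429) return politeMs;

  // Prefer the server's own instruction, but never go below the polite delay.
  const serverMs = parseRetryAfterMs(retryAfter, nowMs);
  if (serverMs !== null) return Math.max(serverMs, politeMs);

  // No header: fall back to exponential backoff (assumed policy).
  const backoffMs = Math.min(5000 * 2 ** attempt, 300000);
  return Math.max(backoffMs, politeMs);
}
```

For a normal response the function returns a value between 5 and 12 seconds; for a 429 with `Retry-After: 60` it returns at least 60 seconds.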

Automation

  • Crawl presets

    Quick Scan, Full Site, Stealth and Blitz — preconfigured settings for each scenario

  • Scheduled scraping

    Recurring jobs with cron expressions via REST API

  • Batch scraping

    Multiple URLs in a single job, async queue with concurrent limits

Security

  • Login session handling

    Automatic logout detection, crawl pause for re-authentication

  • Auth profiles

    Cookies and headers encrypted with AES-256-GCM per domain
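As an illustration of the auth profiles item, the following sketch encrypts a cookie-and-header profile with AES-256-GCM using Node's built-in `crypto` module. The names `SealedProfile`, `sealProfile` and `openProfile` are hypothetical, and binding the ciphertext to its domain through GCM's additional authenticated data is an assumption about what "per domain" could mean here; the toolkit may instead use separate keys or another scheme. Key storage and rotation are out of scope.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Sketch only: type and function names are hypothetical.
type SealedProfile = { iv: string; tag: string; data: string }; // base64-encoded fields

// Encrypt a profile with a 32-byte key. The domain is passed as additional
// authenticated data, so the result only decrypts for the same domain.
function sealProfile(key: Buffer, domain: string, profile: object): SealedProfile {
  const iv = randomBytes(12); // 96-bit nonce, fresh for every encryption
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  cipher.setAAD(Buffer.from(domain, "utf8"));
  const data = Buffer.concat([cipher.update(JSON.stringify(profile), "utf8"), cipher.final()]);
  return {
    iv: iv.toString("base64"),
    tag: cipher.getAuthTag().toString("base64"),
    data: data.toString("base64"),
  };
}

// Decrypt and parse a sealed profile. Throws if the key, domain,
// or ciphertext does not match what was sealed.
function openProfile(key: Buffer, domain: string, sealed: SealedProfile): unknown {
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(sealed.iv, "base64"));
  decipher.setAAD(Buffer.from(domain, "utf8"));
  decipher.setAuthTag(Buffer.from(sealed.tag, "base64"));
  const plain = Buffer.concat([
    decipher.update(Buffer.from(sealed.data, "base64")),
    decipher.final(), // authentication is verified here
  ]);
  return JSON.parse(plain.toString("utf8"));
}
```

GCM authenticates as well as encrypts, so a tampered record or a record replayed under a different domain fails at `decipher.final()` instead of yielding corrupted cookies.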

Data Intelligence

  • Contact extraction

    Emails, phones, social profiles (20+ platforms), physical addresses and contact forms

  • SEO analysis

    Per-page and site-wide scoring, 12 categories, actionable recommendations

  • Security scanning

    XSS, mixed content, insecure forms, exposed keys, CVE correlation

  • Technology stack detection

    Identifies frameworks, CMSs and libraries, with detected versions, per page

  • Intelligence dashboard

    KPIs, severity distribution, findings table, interactive navigation

  • Site map visualization

    Force-directed graph (D3.js) showing page link structure
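For the site map visualization, a force-directed layout needs the crawl result as a list of nodes and a list of links. The sketch below shows one plausible way to derive that shape from "page A links to page B" pairs; `GraphNode`, `GraphLink` and `buildLinkGraph` are hypothetical names, and `inDegree` is an illustrative extra that could drive node size. D3's `forceLink` accepts links whose `source` and `target` are node ids once an id accessor is set with `.id()`, which is the shape produced here. The rendering itself is not shown.

```typescript
// Sketch only: type and function names are hypothetical.
type GraphNode = { id: string; inDegree: number }; // id is the page URL
type GraphLink = { source: string; target: string };

// Build de-duplicated nodes and links from [fromUrl, toUrl] pairs.
// Self-links and repeated links are dropped so they do not distort the layout.
function buildLinkGraph(
  edges: Array<[string, string]>,
): { nodes: GraphNode[]; links: GraphLink[] } {
  const nodes = new Map<string, GraphNode>();
  const links = new Map<string, GraphLink>();

  // Fetch the node for a URL, creating it on first sight.
  const nodeFor = (id: string): GraphNode => {
    let node = nodes.get(id);
    if (!node) {
      node = { id, inDegree: 0 };
      nodes.set(id, node);
    }
    return node;
  };

  for (const [from, to] of edges) {
    nodeFor(from);
    const target = nodeFor(to);
    const key = JSON.stringify([from, to]);
    if (from === to || links.has(key)) continue;
    links.set(key, { source: from, target: to });
    target.inDegree += 1;
  }
  return { nodes: Array.from(nodes.values()), links: Array.from(links.values()) };
}
```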

Monitoring

  • Change detection

    SHA-256 hashing, line-by-line diffs, snapshot history

  • RSS/Atom feeds

    Automatic feed discovery and structured item polling
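The change detection item pairs a cheap "did anything change" check with a detailed diff. A minimal sketch of that combination follows, assuming snapshots are stored as plain text: `sha256Hex` and `diffLines` are hypothetical names, and the diff uses a simple longest-common-subsequence table, which is fine for page-sized inputs but not necessarily the algorithm the toolkit uses.

```typescript
import { createHash } from "node:crypto";

// Sketch only: function names are hypothetical.

// Fingerprint a snapshot; equal hashes mean the diff can be skipped.
function sha256Hex(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

type DiffLine = { op: " " | "+" | "-"; line: string }; // kept, added, removed

// Line-by-line diff based on the longest common subsequence (LCS).
function diffLines(before: string, after: string): DiffLine[] {
  const a = before.split("\n");
  const b = after.split("\n");

  // lcs[i][j] = LCS length of a[i..] and b[j..], filled from the end.
  const lcs: number[][] = Array.from({ length: a.length + 1 }, () =>
    new Array<number>(b.length + 1).fill(0),
  );
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] = a[i] === b[j]
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk both files, emitting removals before additions on ties.
  const out: DiffLine[] = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      out.push({ op: " ", line: a[i] });
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      out.push({ op: "-", line: a[i++] });
    } else {
      out.push({ op: "+", line: b[j++] });
    }
  }
  while (i < a.length) out.push({ op: "-", line: a[i++] });
  while (j < b.length) out.push({ op: "+", line: b[j++] });
  return out;
}
```

In use, the hash of each new snapshot would be compared with the stored hash first, and `diffLines` would run only when the two differ.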

Integrations

  • YouTube integration

    Channel video listing, video details and link extraction from descriptions

  • REST API

    21 endpoints with Bearer token auth, OpenAPI 3.0 spec, async jobs and batch scraping

  • MCP server

    11 tools for AI agents (Claude, etc.) via the Model Context Protocol (MCP)

Export

  • WST Explorer

    Interactive data viewer with dark mode, keyboard shortcuts and multiple views

  • PDF executive summary

    Professional report with KPIs, findings, contacts, SEO and recommendations

  • Multi-format export

    .wst.json (for AI tools), CSV, JSON, self-contained static site with URL rewriting
