TechDogs-"Thunderbit Launches High-Fidelity Web Data API, MCP Server, And CLI"


Thunderbit Launches High-Fidelity Web Data API, MCP Server, And CLI

Business Wire

SAN FRANCISCO--(BUSINESS WIRE)--#AIWebScraper--Thunderbit, an AI web data platform with over 100,000 users, today launched its developer API, Model Context Protocol (MCP) server, and CLI, giving developers new ways to turn complex, long-tail websites into clean Markdown or structured data for AI agents, RAG pipelines, and automation workflows.

At the center of the launch is Thunderbit Distill, an adaptive HTML-to-Markdown engine designed for high-fidelity conversion across complex web pages. In internal HTML-to-Markdown evaluations, Distill scored 0.87 ROUGE-L and produced cleaner, more complete Markdown across product pages, pricing tables, directories, search results, reviews, and other page types, without requiring site-specific rules.
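
For readers unfamiliar with the metric, ROUGE-L scores a candidate text against a reference by the length of their longest common subsequence (LCS). The release does not state which ROUGE-L variant, tokenizer, or reference set was used, so the following is only a minimal sketch that assumes whitespace tokenization and the standard F-measure form; the function names and the sample strings are illustrative.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, candidate, beta=1.0):
    """ROUGE-L F-measure over whitespace tokens (tokenization is an assumption)."""
    ref = reference.split()
    cand = candidate.split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    recall = lcs / len(ref)        # share of the reference that was recovered
    precision = lcs / len(cand)    # share of the candidate that is on target
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


# Illustrative strings, not data from the release.
reference_md = "# Pricing Pro plan $20 per month"
candidate_md = "# Pricing Pro plan $20"
score = rouge_l(reference_md, candidate_md)
```

In this sketch a candidate that drops content loses recall, while one that keeps leftover page noise loses precision, so the score reflects both completeness and cleanliness.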

Thunderbit uses AI models rather than fixed parsing rules to identify meaningful page content, then strips out navigation, scripts, ads, and boilerplate so LLMs and databases receive less noisy input.

Thunderbit also introduced Extract, which returns structured JSON or CSV from a URL using a developer-defined schema. Together, Distill and Extract cover two output paths: Markdown for AI agents, RAG, knowledge bases, and content ingestion, and structured data for databases, spreadsheets, enrichment jobs, and internal tools.
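
The release does not document Extract's request format or schema syntax. As a rough illustration of what "a developer-defined schema" producing JSON or CSV can look like, the sketch below uses a hypothetical schema (a plain mapping of field names to Python types) and hypothetical helper names; none of it is Thunderbit's API.

```python
import csv
import io
import json

# Hypothetical schema: field name -> expected Python type.
# This is NOT Thunderbit's schema syntax, only a stand-in for the idea.
PRODUCT_SCHEMA = {"name": str, "price": float, "reviews": int}


def conform(record, schema):
    """Keep only the schema's fields and check each value's type strictly."""
    out = {}
    for field, expected in schema.items():
        if field not in record:
            raise KeyError(f"missing field: {field}")
        value = record[field]
        if not isinstance(value, expected):
            raise TypeError(f"{field} should be {expected.__name__}")
        out[field] = value
    return out


def to_json(rows):
    """Serialize schema-shaped rows as a JSON array."""
    return json.dumps(rows)


def to_csv(rows, schema):
    """Serialize schema-shaped rows as CSV, columns in schema order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(schema))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Illustrative record; the "extra" field is dropped because it is not in the schema.
raw = {"name": "Widget", "price": 19.5, "reviews": 12, "extra": "x"}
rows = [conform(raw, PRODUCT_SCHEMA)]
json_out = to_json(rows)
csv_out = to_csv(rows, PRODUCT_SCHEMA)
```

Fixing the schema up front is what lets the same rows feed a database, a spreadsheet, or an enrichment job without per-consumer reshaping.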

"AI agents are only as useful as the web data they can actually reach," said Shuai Guan, Co-founder and CEO of Thunderbit. "We built Thunderbit to turn changing web pages into data that software can use reliably."

Traditional scraping pipelines often rely on CSS selectors, XPath, or site-specific parsing rules that can break when layouts change. Thunderbit is built to understand page semantics and adapt to changing structure, helping developers get cleaner, more complete output without maintaining custom scrapers for every site.
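
The fragility described above can be seen in a small, self-contained sketch. It uses Python's standard `html.parser` to pull text from an element by class name, then runs the same rule against a page whose class was renamed. The markup, class names, and helper names are invented for illustration, and the code shows only the rule-based approach the release contrasts against, not Thunderbit's method.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ClassTextExtractor(HTMLParser):
    """Collects the text inside the first element carrying a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.depth = 0       # nesting depth inside the matched element
        self.done = False    # True once the matched element has closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        else:
            classes = (dict(attrs).get("class") or "").split()
            if self.class_name in classes:
                self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract_by_class(html, class_name):
    """Return the matched element's text, or None if the class is not found."""
    parser = ClassTextExtractor(class_name)
    parser.feed(html)
    parser.close()
    text = "".join(parser.chunks).strip()
    return text or None


# Invented markup: the same price before and after a site redesign.
OLD_PAGE = '<div class="item"><span class="price">$20</span></div>'
NEW_PAGE = '<div class="item"><span class="product-cost">$20</span></div>'

before = extract_by_class(OLD_PAGE, "price")   # the rule matches
after = extract_by_class(NEW_PAGE, "price")    # the rule silently finds nothing
```

The price is still on the redesigned page, but the hard-coded class no longer points at it, which is the maintenance burden the paragraph above refers to.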

The launch extends Thunderbit beyond its no-code Chrome extension and web app, which sales, ecommerce, research, and operations teams use to extract data from tens of millions of pages every month. Developers can now bring the same adaptive extraction engine into AI applications, automated workflows, and internal systems.

Thunderbit's developer API, MCP server, CLI, and documentation are available today at https://thunderbit.com/docs. Free credits are available for new users.

About Thunderbit

Thunderbit is an AI web data platform used by over 100,000 users to extract structured data from websites, PDFs, and images. Products include a no-code Chrome extension and developer tools for AI workflows, web data extraction, and automation. Learn more at https://thunderbit.com.


Contacts

Media Contact Richard Li
Thunderbit
xin.li@thunderbit.com

Frequently Asked Questions

What new tools has Thunderbit launched for developers?

Thunderbit has launched its developer API, Model Context Protocol (MCP) server, and CLI, enabling developers to convert complex websites into clean Markdown or structured data for AI agents, RAG pipelines, and automation workflows.

How does Thunderbit Distill improve web data extraction?

Thunderbit Distill is an adaptive HTML-to-Markdown engine that uses AI models to identify meaningful content and strip out navigation, scripts, ads, and boilerplate, producing cleaner, more complete Markdown without site-specific rules.

What types of data can Thunderbit extract?

Thunderbit can extract data as clean Markdown for AI agents, RAG, knowledge bases, and content ingestion, or as structured JSON or CSV for databases, spreadsheets, enrichment jobs, and internal tools.

First published on Mon, May 25, 2026
