Crawl API

Automate content extraction from any domain. Simply define the root URL and retrieve the full website content as Markdown, Text, HTML, or JSON files.

No credit card required
  • Map entire site structures in one request
  • Capture both static and dynamic web content
  • Flexible for SEO, AI, and compliance needs
  • Integrates with popular dev frameworks and no-code tools
TRUSTED BY 20,000+ CUSTOMERS WORLDWIDE

// JavaScript: trigger a collection job. Append your API token after "Bearer ".
const options = {
  method: 'POST',
  headers: {Authorization: 'Bearer ', 'Content-Type': 'application/json'},
  // One object per target URL.
  body: JSON.stringify([{url: 'https://il.linkedin.com/company/bright-data'}])
};

fetch('https://api.brightdata.com/datasets/v3/trigger', options)
  .then(response => response.json())
  .then(data => console.log(data))
  .catch(err => console.error(err));

# Python: trigger a collection job with the requests library.
import requests

url = "https://api.brightdata.com/datasets/v3/trigger"

# One object per target URL.
payload = [{"url": "https://il.linkedin.com/company/bright-data"}]
headers = {
    "Authorization": "Bearer ",  # append your API token after "Bearer "
    "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)

print(response.text)

Easy to start, easier to scale

  1. Choose target domain
    Define target URL and connect to the API with a single line of code
  2. Send request
    Edit crawl parameters and insert your custom logic using Python or JavaScript
  3. Get your data
    Retrieve website data as Markdown, Text, HTML, or JSON files
Read documentation

Developer-first experience

Quick Start

Connect to the Crawl API with a single line of code, or run crawls and get results directly in the Control Panel.

Custom Collection

Use request parameters to customize collection and delivery, including pagination, scheduling and log collection.

Data Parsing

Convert raw HTML into structured data files, delivered as Markdown, Text, HTML, or JSON directly to your database.

Crawl API pricing

Pay as you go
$1.5 / 1K records
No commitment
Free trial
Pay per use with no monthly commitment

Growth
25% off: $0.98 / 1K records (regular price $1.3)
$499 billed monthly
Free trial
Use this coupon code: APIS25
Designed for teams looking to scale their operations

Business
25% off: $0.83 / 1K records (regular price $1.1)
$999 billed monthly
Free trial
Use this coupon code: APIS25
Designed for large teams with extensive operational needs

Premium
25% off: $0.75 / 1K records (regular price $1)
$1999 billed monthly
Free trial
Use this coupon code: APIS25
Advanced support and features for critical operations

Enterprise
Elite data services for top-tier business requirements.
Contact us
  • Account manager
  • Custom packages
  • Premium SLA
  • Priority support
  • Tailored onboarding
  • SSO
  • Customizations
  • Audit logs
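
To compare the per-record rates listed above, the sketch below estimates usage cost for a given number of records. The rates are the discounted ones from this page; the plan keys and the `usage_cost` helper are hypothetical, and the calculation assumes simple linear billing per 1K records. It does not model how the monthly plan fee is applied, so treat the result as a rough estimate only.

```python
from decimal import Decimal

# Discounted rates per 1K records, in USD, as listed on this page.
# The dictionary keys are illustrative names, not official plan identifiers.
RATE_PER_1K = {
    "pay_as_you_go": Decimal("1.5"),
    "growth": Decimal("0.98"),
    "business": Decimal("0.83"),
    "premium": Decimal("0.75"),
}


def usage_cost(plan: str, records: int) -> Decimal:
    """Estimate usage cost, assuming linear billing per 1K records."""
    return RATE_PER_1K[plan] * Decimal(records) / Decimal(1000)


# Compare every plan for the same hypothetical volume.
for plan in RATE_PER_1K:
    print(plan, usage_cost(plan, 500_000))
```

Decimal arithmetic is used here so that currency values are not distorted by binary floating-point rounding.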

Leading the way in ethical web data collection

Bright Data sets the gold standard in compliance, effectively self-regulating the industry. With transparent operations validated by top security firms, clear peer consent, and pioneering compliance units, we ensure legitimate and safe data collection. Upholding international privacy laws and utilizing tools like BrightBot, we minimize your legal exposure, making partnership with us a strategic move to curtail legal risks and associated costs.

Start free trial

Every 15 minutes, our customers scrape enough data to train ChatGPT from scratch.

API for Seamless Crawl Data Access

Comprehensive, Scalable, and Compliant Crawl Data Extraction

FLEXIBLE

Tailored to your workflow

Get structured data in JSON, NDJSON, or CSV files through Webhook or API delivery.
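
For readers unfamiliar with the difference between the JSON and NDJSON deliveries mentioned above, here is a minimal sketch of parsing each with the Python standard library. The sample records and their field names are made up for illustration and are not the Crawl API's actual output schema.

```python
import json


def parse_json_array(text: str) -> list:
    """Parse a JSON delivery: a single array holding every record."""
    return json.loads(text)


def parse_ndjson(text: str) -> list:
    """Parse an NDJSON delivery: one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Hypothetical sample payloads; real records will have different fields.
json_payload = (
    '[{"url": "https://example.com/", "title": "Home"},'
    ' {"url": "https://example.com/about", "title": "About"}]'
)
ndjson_payload = (
    '{"url": "https://example.com/", "title": "Home"}\n'
    '{"url": "https://example.com/about", "title": "About"}\n'
)

records_from_json = parse_json_array(json_payload)
records_from_ndjson = parse_ndjson(ndjson_payload)

# Both formats carry the same records; only the framing differs.
print(records_from_json == records_from_ndjson)
```

Because each NDJSON line is a complete JSON object, a large delivery can be processed one line at a time instead of being loaded into memory as a whole.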

SCALABLE

Built-in infrastructure and unblocking

Get maximum control and flexibility without maintaining proxy and unblocking infrastructure. Easily scrape data from any geo-location while avoiding CAPTCHAs and blocks.

STABLE

Battle-proven infrastructure

Bright Data’s platform powers 20,000+ companies worldwide, offering peace of mind with 99.99% uptime and access to 150M+ real user IPs covering 195 countries.

COMPLIANT

Industry-leading compliance

Our privacy practices comply with data protection laws, including the EU data protection regulatory framework, GDPR, and CCPA – respecting requests to exercise privacy rights and more.

Want to learn more?

Talk to one of our experts to discuss your web scraping needs

Crawl API FAQs

Bright Data’s Crawl API is a tool that lets you extract, map, and transform content from any website into structured data in formats like HTML, Markdown, and JSON, making it easy to use for AI training, SEO, compliance audits, and more.

You can crawl any public website, extracting both static and dynamic content such as articles, product listings, reviews, and complete site structures from any domain worldwide.

Crawl API delivers results in multiple formats, including Markdown, HTML, plain text, and structured schemas like ld_json. Choose the format that best fits your workflow.

Simply send an HTTP POST request to the API with your target URLs and preferred output format. You’ll receive a snapshot_id, which you can use to fetch the collected data once it's ready.
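
The trigger-then-fetch flow described above can be sketched as a small polling loop. This is an illustration under stated assumptions, not the official client: `wait_for_snapshot`, `trigger`, and `fetch` are hypothetical names, and this page does not describe how the API signals that a snapshot is not ready yet, so `fetch` is a placeholder that returns `None` in that case. In practice, `trigger` would wrap the POST request shown at the top of this page. The demo uses fake callables so it runs without network access.

```python
import time
from typing import Any, Callable, Optional


def wait_for_snapshot(
    trigger: Callable[[], dict],
    fetch: Callable[[str], Optional[Any]],
    poll_seconds: float = 10.0,
    max_attempts: int = 30,
) -> Any:
    """Trigger a crawl, then poll until the snapshot data is available.

    `trigger` must return a dict containing "snapshot_id".
    `fetch` must return the data, or None while the snapshot is not ready
    (an assumed convention for this sketch).
    """
    snapshot_id = trigger()["snapshot_id"]
    for _ in range(max_attempts):
        data = fetch(snapshot_id)
        if data is not None:
            return data
        time.sleep(poll_seconds)
    raise TimeoutError(
        f"snapshot {snapshot_id} not ready after {max_attempts} attempts"
    )


# Simulated demo: the fake snapshot becomes ready on the third poll.
polls = {"count": 0}


def fake_trigger() -> dict:
    return {"snapshot_id": "s_demo"}


def fake_fetch(snapshot_id: str) -> Optional[list]:
    polls["count"] += 1
    if polls["count"] >= 3:
        return [{"url": "https://example.com/"}]
    return None


result = wait_for_snapshot(fake_trigger, fake_fetch, poll_seconds=0)
print(result)
```

Passing the HTTP calls in as functions keeps the waiting logic independent of any particular endpoint or client library, and makes it easy to test.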

No coding is required: use the no-code option in the Bright Data Control Panel. Just enter your URLs, select an output format, and start crawling.

Results can be delivered via webhook, downloaded through the API or Control Panel, or sent to your preferred external storage (such as AWS S3 or Google Cloud Storage).
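
For the webhook option, a receiving endpoint might look like the minimal sketch below, built only on the Python standard library. It assumes the results arrive as a JSON body in an HTTP POST request; the actual payload shape, headers, and any authentication are not described on this page, and `WebhookHandler`, `received_batches`, and `serve` are hypothetical names.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Parsed webhook bodies are collected here for later processing.
received_batches: list = []


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        # Read exactly the number of bytes announced by the sender.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            received_batches.append(json.loads(body))
        except ValueError:
            # Body was empty or not valid JSON: reject it.
            self.send_response(400)
            self.end_headers()
            return
        self.send_response(200)
        self.end_headers()

    def log_message(self, format, *args) -> None:
        pass  # silence the default per-request logging


def serve(port: int = 8000) -> None:
    """Listen for webhook deliveries until interrupted."""
    HTTPServer(("0.0.0.0", port), WebhookHandler).serve_forever()
```

Calling `serve()` starts the listener. A production endpoint would also need HTTPS and a way to verify that requests really come from the expected sender.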

The Crawl API supports scheduling, so you can automate crawls daily, weekly, or on a custom timetable to keep your datasets up to date.

The API works with Python, Node.js, BeautifulSoup, Cheerio, and many other popular libraries, so you can keep your existing tooling.

Customers use the Crawl API for LLM training dataset creation, SEO site audits, competitive research, compliance/accessibility checks, and website content migration and archiving.

You can include detailed error logs via the include_errors parameter for every crawl. Troubleshoot issues efficiently, or reach out to Bright Data support for further assistance.
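
As a small illustration of the include_errors parameter mentioned above, the sketch below adds it to the trigger URL used in this page's sample request. Treat the details as assumptions: this page does not say whether include_errors is sent as a query parameter or which values it accepts, so the placement and the "true"/"false" format here should be checked against the documentation. `build_trigger_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Trigger endpoint taken from the sample request on this page.
TRIGGER_URL = "https://api.brightdata.com/datasets/v3/trigger"


def build_trigger_url(include_errors: bool = True) -> str:
    """Append include_errors as a query parameter (assumed placement)."""
    query = urlencode({"include_errors": str(include_errors).lower()})
    return f"{TRIGGER_URL}?{query}"


print(build_trigger_url())
```

The resulting URL can replace the plain trigger URL in the sample request; any other parameters your crawl needs would be added to the same dictionary.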