Scrape

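import { Supadata } from '@supadata/js';

// Create a client authenticated with your API key.
const supadata = new Supadata({
  apiKey: 'YOUR_API_KEY',
});

// Scrape a single page; the result includes the Markdown content and page metadata.
const webContent = await supadata.web.scrape('https://supadata.ai');
console.log(webContent);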

{
  "url": "https://supadata.ai",
  "content": "# Supadata\n## What is Supadata?\nSupadata is an API platform for data extraction for LLM training.",
  "name": "Supadata: Web & YouTube to text API for developers",
  "description": "Supadata is one stop-shop API for developers to read web and YouTube content, ready for AI training and retrieval.",
  "ogUrl": "https://supadata.ai/opengraph-image.png",
  "countCharacters": 12300,
  "urls": [
    "https://supadata.ai",
    "https://supadata.ai/documentation"
  ]
}

GET /web/scrape

Authorizations

x-api-key (string, header, required)
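
When calling the endpoint without the SDK, the key goes in the x-api-key request header. A minimal sketch of building that header for any HTTP client follows; `authHeaders` is a hypothetical helper name used for illustration and is not part of @supadata/js.

```typescript
// Hypothetical helper (not part of @supadata/js): returns the header
// object that carries the API key, rejecting an empty key early.
function authHeaders(apiKey: string): Record<string, string> {
  if (apiKey.trim() === '') {
    throw new Error('x-api-key is required');
  }
  return { 'x-api-key': apiKey };
}
```

The returned object can be passed as the `headers` option of `fetch` or merged with other headers.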

Query Parameters

url (string, required)

URL of the webpage

Example: "https://supadata.ai"

noLinks (boolean, default: false)

When true, removes markdown links from the content, leaving only the URL text.

lang (string, default: en)

Preferred language for the scraped content (ISO 639-1 code). Sets the Accept-Language header to influence website language selection.

Example: "en"
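
The three query parameters above can be assembled into a request URL roughly as follows. This is a sketch under assumptions: `ScrapeQuery` and `buildScrapeUrl` are hypothetical names, and the endpoint address is passed in by the caller because this page does not state the API's base URL.

```typescript
// Hypothetical types and helper for illustration; not part of @supadata/js.
interface ScrapeQuery {
  url: string;       // required: URL of the webpage
  noLinks?: boolean; // optional: documented default is false
  lang?: string;     // optional: ISO 639-1 code, documented default is "en"
}

// `endpoint` is the full address of the scrape endpoint (an assumption:
// supply the base URL from your own configuration).
function buildScrapeUrl(endpoint: string, query: ScrapeQuery): string {
  const target = new URL(endpoint);
  target.searchParams.set('url', query.url);
  // Optional parameters are only sent when set, so server defaults apply otherwise.
  if (query.noLinks !== undefined) {
    target.searchParams.set('noLinks', String(query.noLinks));
  }
  if (query.lang !== undefined) {
    target.searchParams.set('lang', query.lang);
  }
  return target.toString();
}
```

`URLSearchParams` percent-encodes the target URL, so a page address that itself contains `?` or `&` is passed through intact.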

Response

Successfully fetched web page content

url (string, required)

The URL that was scraped

Example: "https://supadata.ai"

content (string, required)

The Markdown content extracted from the URL

Example: "# Supadata\n## What is Supadata?\nSupadata is an API platform for data extraction for LLM training."

countCharacters (number, required)

The number of characters in the content

Example: 12300

urls (string[], required)

List of URLs found on the webpage

Example:

[
  "https://supadata.ai",
  "https://supadata.ai/documentation"
]

name (string)

The name of the webpage

Example: "Supadata: Web & YouTube to text API for developers"

description (string)

A description of the webpage

Example: "Supadata is one stop-shop API for developers to read web and YouTube content, ready for AI training and retrieval."

ogUrl (string)

Open Graph URL for the webpage

Example: "https://supadata.ai/opengraph-image.png"
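
The response fields listed above map naturally onto a TypeScript type. The sketch below is written from this field list alone: `ScrapeResponse` and `isScrapeResponse` are hypothetical names, not types exported by @supadata/js, and the guard treats the fields not marked required as optional.

```typescript
// Hypothetical shape derived from the field list on this page.
interface ScrapeResponse {
  url: string;
  content: string;
  countCharacters: number;
  urls: string[];
  name?: string;
  description?: string;
  ogUrl?: string;
}

// Runtime check that a parsed JSON value has the documented shape.
function isScrapeResponse(value: unknown): value is ScrapeResponse {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  const urls = v.urls;
  const optionalString = (x: unknown): boolean =>
    x === undefined || typeof x === 'string';
  return (
    typeof v.url === 'string' &&
    typeof v.content === 'string' &&
    typeof v.countCharacters === 'number' &&
    Array.isArray(urls) &&
    urls.every((u) => typeof u === 'string') &&
    optionalString(v.name) &&
    optionalString(v.description) &&
    optionalString(v.ogUrl)
  );
}
```

A guard like this is useful when the JSON comes from a raw HTTP call rather than the SDK, since it narrows `unknown` to a typed value before any field is read.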
