Web Scraping - Gately AI

The API reference for this endpoint is not yet available in OpenAPI format. Use the examples below to make requests.

Overview

The Web Scraping service extracts content from a web page and returns it in the formats you request. Use this endpoint to scrape page content as Markdown or HTML, collect links, and take screenshots.

Base URL

https://api.gately.ai/v1/web

Request Format

curl --location 'https://api.gately.ai/v1/web' \
--header 'Authorization: Bearer YOUR_API_KEY' \
--header 'Content-Type: application/json' \
--data '{
    "model": "scrape",
    "params": {
        "url": "https://example.com",
        "formats": ["markdown", "links"],
        "onlyMainContent": true
    }
}'
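The same request can be sketched in Python with only the standard library. The function names `build_scrape_request` and `scrape` are illustrative and not part of any Gately client library, and the 60-second client timeout is an arbitrary choice rather than a documented value.

```python
import json
import urllib.request

API_URL = "https://api.gately.ai/v1/web"  # base URL from this page


def build_scrape_request(api_key: str, url: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = {
        "model": "scrape",
        "params": {
            "url": url,
            "formats": ["markdown", "links"],
            "onlyMainContent": True,
        },
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def scrape(api_key: str, url: str, timeout_s: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    req = build_scrape_request(api_key, url)
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `scrape("YOUR_API_KEY", "https://example.com")` sends the request; `build_scrape_request` is kept separate so the payload can be inspected without network access.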

Parameters

model

string

default:"scrape"

required

Model identifier (must be set to scrape)

params

object

required

Fields of the params object:

url

string

required

The URL of the webpage to scrape

formats

array

Types of content to extract:

markdown
html
rawHtml
links
screenshot
screenshot@fullPage
json

onlyMainContent

boolean

default:true

Extract only main content, excluding headers/footers
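As a minimal sketch, the parameters above can be assembled and checked on the client before sending. The helper `build_params` is hypothetical, and this page does not say how the API responds to an unknown format or what it returns when `formats` is omitted, so the check below is a client-side convenience only.

```python
from typing import List, Optional

# Format names listed on this page.
ALLOWED_FORMATS = frozenset({
    "markdown",
    "html",
    "rawHtml",
    "links",
    "screenshot",
    "screenshot@fullPage",
    "json",
})


def build_params(
    url: str,
    formats: Optional[List[str]] = None,
    only_main_content: bool = True,
) -> dict:
    """Return a params object, rejecting format names not listed above."""
    params = {"url": url, "onlyMainContent": only_main_content}
    if formats is not None:
        unknown = [f for f in formats if f not in ALLOWED_FORMATS]
        if unknown:
            raise ValueError(f"unsupported formats: {unknown}")
        params["formats"] = list(formats)
    return params
```

The result goes under the `params` key of the request body, next to `"model": "scrape"`.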

Advanced Options

Content Filtering

params.includeTags

array

Array of HTML tags to include in output

params.excludeTags

array

Array of HTML tags to exclude from output
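A short sketch of a params object that uses both filters. The helper `with_tag_filters` and the tag names are illustrative; this page does not state how the two lists interact when a tag appears in both, or whether CSS selectors are accepted in place of tag names.

```python
import copy
from typing import Iterable, Optional


def with_tag_filters(
    params: dict,
    include: Optional[Iterable[str]] = None,
    exclude: Optional[Iterable[str]] = None,
) -> dict:
    """Return a copy of params with includeTags / excludeTags set when given."""
    out = copy.deepcopy(params)
    if include:
        out["includeTags"] = list(include)
    if exclude:
        out["excludeTags"] = list(exclude)
    return out


# Example: keep article content, drop navigation and footer markup.
params = with_tag_filters(
    {"url": "https://example.com", "formats": ["html"]},
    include=["article", "h1"],
    exclude=["nav", "footer"],
)
```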

Page Actions

params.actions

array

Actions to perform before scraping

params.waitFor

integer

default:0

Delay in milliseconds before scraping

Request Settings

params.headers

object

Custom headers for the request

params.timeout

integer

default:30000

Request timeout in milliseconds
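One practical consequence of these settings is that the HTTP client's own timeout should not expire before the service does. The sketch below derives a client timeout from `timeout` and `waitFor`. Adding the two values together and the 5-second margin are assumptions made here, since this page does not say whether `waitFor` counts toward `timeout`. The `Accept-Language` header is only an example value.

```python
def client_timeout_seconds(params: dict, margin_s: float = 5.0) -> float:
    """Client-side timeout in seconds, derived from params.timeout and params.waitFor."""
    server_timeout_ms = params.get("timeout", 30000)  # documented default
    wait_for_ms = params.get("waitFor", 0)            # documented default
    # Assumption: treat waitFor as extra time on top of timeout, then add a margin.
    return (server_timeout_ms + wait_for_ms) / 1000.0 + margin_s


params = {
    "url": "https://example.com",
    "formats": ["markdown"],
    "headers": {"Accept-Language": "en-US"},  # example custom header
    "timeout": 10000,
    "waitFor": 2000,
}
print(client_timeout_seconds(params))
```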

Response Format

id

string

Unique identifier for the request

object

string

Object type (e.g., “scrape.completion”)

created

integer

Unix timestamp when the request was created

model

string

Model used for the request (scrape)

data

object

Fields of the data object:

success

boolean

Indicates if the scraping was successful

data

object

Fields of the nested data object (scraped content):

markdown

string

Markdown version of the content (if requested)

html

string

Clean HTML version (if requested)

links

array

List of extracted links (if requested)

metadata

object

Page metadata (title, description, etc.)

Example Response

{
  "id": "scrape-aec35e29703d4eafb1c45cda9fd66951",
  "object": "scrape.completion",
  "created": 1732654897,
  "model": "scrape",
  "data": {
    "success": true,
    "data": {
      "markdown": "# Example Domain\n\nThis domain is for use in illustrative examples in documents...",
      "links": [
        "https://www.iana.org/domains/example"
      ],
      "metadata": {
        "title": "Example Domain",
        "description": "",
        "language": "en",
        "sourceURL": "https://example.com",
        "statusCode": 200
      }
    }
  }
}
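The scraped content sits two levels deep, under `data.data`, with the success flag one level above it. A minimal sketch of unpacking a response shaped like the one above follows. The function name `extract_scraped` is illustrative, and because the shape of a failed response is not documented here, the error branch only reports the request id.

```python
def extract_scraped(response: dict) -> dict:
    """Return the inner data object, or raise if data.success is not true."""
    outer = response.get("data") or {}
    if not outer.get("success"):
        raise RuntimeError(f"scrape {response.get('id')} did not succeed")
    return outer.get("data") or {}


# Trimmed copy of the example response on this page.
example = {
    "id": "scrape-aec35e29703d4eafb1c45cda9fd66951",
    "object": "scrape.completion",
    "created": 1732654897,
    "model": "scrape",
    "data": {
        "success": True,
        "data": {
            "links": ["https://www.iana.org/domains/example"],
            "metadata": {"title": "Example Domain", "statusCode": 200},
        },
    },
}

page = extract_scraped(example)
print(page["metadata"]["title"])
for link in page.get("links", []):
    print(link)
```

Fields such as `markdown`, `html`, and `links` are present only when the matching format was requested, so reading them with `.get()` avoids a `KeyError`.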

Example: Screenshot Capture

curl --location 'https://api.gately.ai/v1/web' \
--header 'Authorization: Bearer YOUR_API_KEY' \
--header 'Content-Type: application/json' \
--data '{
    "model": "scrape",
    "params": {
        "url": "https://example.com",
        "formats": ["screenshot@fullPage"],
        "waitFor": 1000
    }
}'

Example: Advanced Page Interaction

curl --location 'https://api.gately.ai/v1/web' \
--header 'Authorization: Bearer YOUR_API_KEY' \
--header 'Content-Type: application/json' \
--data '{
    "model": "scrape",
    "params": {
        "url": "https://example.com",
        "formats": ["markdown"],
        "actions": [
            {"type": "click", "selector": "#consent-button"},
            {"type": "wait", "milliseconds": 1000},
            {"type": "click", "selector": "#more-content"}
        ]
    }
}'
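The action list in this request can also be built programmatically. The sketch below covers only the two action shapes that appear in the example above (`click` and `wait`); the helper names are hypothetical, and any other action types the service may support are not documented on this page.

```python
import json


def click(selector: str) -> dict:
    """Action that clicks the element matching a CSS selector."""
    return {"type": "click", "selector": selector}


def wait(milliseconds: int) -> dict:
    """Action that pauses for the given number of milliseconds."""
    if milliseconds < 0:
        raise ValueError("milliseconds must be non-negative")
    return {"type": "wait", "milliseconds": milliseconds}


# Same sequence as the curl example: dismiss consent, pause, expand content.
actions = [click("#consent-button"), wait(1000), click("#more-content")]

body = {
    "model": "scrape",
    "params": {
        "url": "https://example.com",
        "formats": ["markdown"],
        "actions": actions,
    },
}
print(json.dumps(body, indent=4))
```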
