crawlio-js is a Node.js SDK for interacting with the Crawlio web scraping and crawling API. It provides programmatic access to scraping, crawling, and batch processing endpoints with built-in error handling.
Install via npm:

```bash
npm install crawlio.js
```

```js
import { Crawlio } from 'crawlio.js'

const client = new Crawlio({ apiKey: 'your-api-key' })
const result = await client.scrape({ url: 'https://example.com' })
console.log(result.html)
```

The `Crawlio` constructor creates a new client.
Options:
| Name | Type | Required | Description | 
|---|---|---|---|
| apiKey | string | ✅ | Your Crawlio API key | 
| baseUrl | string | ❌ | API base URL (default: https://crawlio.xyz) | 
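For instance, to point the client at a different endpoint (a sketch; the URL is a placeholder and `CRAWLIO_API_KEY` is a hypothetical environment variable):

```js
import { Crawlio } from 'crawlio.js'

// baseUrl is optional and defaults to https://crawlio.xyz
const client = new Crawlio({
  apiKey: process.env.CRAWLIO_API_KEY,     // hypothetical env var
  baseUrl: 'https://crawlio.example.internal', // placeholder endpoint
})
```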
`scrape()` scrapes a single page.

```js
await client.scrape({ url: 'https://example.com' })
```

ScrapeOptions:
| Name | Type | Required | Description | 
|---|---|---|---|
| url | string | ✅ | Target URL | 
| exclude | string[] | ❌ | CSS selectors to exclude |
| includeOnly | string[] | ❌ | CSS selectors to include | 
| markdown | boolean | ❌ | Convert HTML to Markdown | 
| returnUrls | boolean | ❌ | Return all discovered URLs | 
| workflow | Workflow[] | ❌ | Custom workflow steps to execute | 
| normalizeBase64 | boolean | ❌ | Normalize base64 content | 
| cookies | CookiesInfo[] | ❌ | Cookies to include in the request | 
| userAgent | string | ❌ | Custom User-Agent header for the request | 
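A fuller scrape call might look like the following sketch; every option comes from the table above, and the selector and User-Agent values are illustrative:

```js
const result = await client.scrape({
  url: 'https://example.com/blog',
  exclude: ['nav', 'footer'],  // drop page chrome from the output
  markdown: true,              // also return a Markdown rendering
  returnUrls: true,            // collect URLs discovered on the page
  userAgent: 'my-crawler/1.0', // illustrative User-Agent value
})

console.log(result.markdown)
console.log(result.urls)
```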
Initiates a site-wide crawl.
CrawlOptions:
| Name | Type | Required | Description | 
|---|---|---|---|
| url | string | ✅ | Root URL to crawl | 
| count | number | ✅ | Number of pages to crawl | 
| sameSite | boolean | ❌ | Limit crawl to same domain | 
| patterns | string[] | ❌ | URL patterns to match | 
| exclude | string[] | ❌ | CSS selectors to exclude | 
| includeOnly | string[] | ❌ | CSS selectors to include | 
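A minimal sketch; `client.crawl()` is an assumed method name (the actual identifier isn't shown in this section), and the returned job id is assumed to match the job status shape documented below:

```js
const job = await client.crawl({ // assumed method name
  url: 'https://example.com',
  count: 50,                     // stop after 50 pages
  sameSite: true,                // stay on example.com
  patterns: ['/blog/*'],
})

console.log(job.id)              // assumed: poll this id for status
```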
Checks the status of a crawl job.
Gets results from a completed crawl.
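Together, these two calls support a simple polling loop. This is a sketch: `crawlStatus()` and `crawlResults()` are hypothetical method names, but the status values come from the job status type documented below:

```js
// Hypothetical method names: crawlStatus(id), crawlResults(id)
let status
do {
  await new Promise((r) => setTimeout(r, 2000)) // wait between polls
  status = await client.crawlStatus(job.id)
} while (status.status === 'IN_QUEUE' || status.status === 'RUNNING')

if (status.status === 'SUCCESS') {
  const results = await client.crawlResults(job.id)
  console.log(results)
}
```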
Performs a search on scraped content.
SearchOptions:
| Name | Type | Description | 
|---|---|---|
| site | string | Limit search to a specific domain | 
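A sketch of a search call; the method name and the shape of the query argument are assumptions, since only `SearchOptions` is documented here:

```js
// Assumed signature: search(query, options)
const hits = await client.search('pricing page', { site: 'example.com' })
console.log(hits)
```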
Initiates scraping for multiple URLs in one request.
BatchScrapeOptions:
| Name | Type | Description | 
|---|---|---|
| url | string[] | List of URLs | 
| options | Omit<ScrapeOptions, 'url'> | Common options for all URLs | 
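A sketch, assuming the batch endpoint is exposed as `client.batchScrape()`; the request shape follows the table above:

```js
const batch = await client.batchScrape({ // assumed method name
  url: [
    'https://example.com/a',
    'https://example.com/b',
  ],
  // Applied to every URL; anything from ScrapeOptions except url
  options: { markdown: true, exclude: ['nav', 'footer'] },
})
```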
Checks the status of a batch scrape job.
Fetches results from a completed batch scrape.
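As with crawls, status checks and result fetches pair naturally. A sketch with hypothetical method names `batchScrapeStatus()` and `batchScrapeResults()`, assuming the batch response carries a job id:

```js
const status = await client.batchScrapeStatus(batch.id) // hypothetical
if (status.status === 'SUCCESS') {
  const results = await client.batchScrapeResults(batch.id) // hypothetical
  console.log(`${status.success} succeeded, ${status.error} failed`)
  console.log(results)
}
```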
All Crawlio errors extend `CrawlioError`. You can catch and inspect these for more context.
- `CrawlioError`
- `CrawlioRateLimit`
- `CrawlioLimitExceeded`
- `CrawlioAuthenticationError`
- `CrawlioInternalServerError`
- `CrawlioFailureError`
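For example (a sketch; it assumes the error classes above are exported from `crawlio.js`):

```js
import { Crawlio, CrawlioError, CrawlioRateLimit } from 'crawlio.js'

const client = new Crawlio({ apiKey: 'your-api-key' })

try {
  await client.scrape({ url: 'https://example.com' })
} catch (err) {
  if (err instanceof CrawlioRateLimit) {
    // Rate limited: back off and retry later
  } else if (err instanceof CrawlioError) {
    console.error('Crawlio request failed:', err.message)
  } else {
    throw err // not a Crawlio error
  }
}
```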
Scrape result:

```ts
{
  jobId: string
  html: string
  markdown: string
  meta: Record<string, string>
  urls?: string[]
  url: string
}
```

Job status:

```ts
{
  id: string
  status: 'IN_QUEUE' | 'RUNNING' | 'LIMIT_EXCEEDED' | 'ERROR' | 'SUCCESS'
  error: number
  success: number
  total: number
}
```

CookiesInfo:

```ts
{
  name: string
  value: string
  path: string
  expires?: number
  httpOnly: boolean
  secure: boolean
  domain: string
  sameSite: 'Strict' | 'Lax' | 'None'
}
```