
Firecrawl Java SDK

A Java client library for the Firecrawl API, providing web crawling, scraping, and search capabilities.

Requirements

  • Java 17 or higher
  • Maven 3.8+ (for building from source)

Installation

Maven/Gradle (v2 via JitPack)

v2 of this SDK is distributed via JitPack. Add the JitPack repository and use the JitPack coordinates for this repo.

Maven (pom.xml):

<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>

<dependency>
    <groupId>com.github.firecrawl</groupId>
    <artifactId>firecrawl-java-sdk</artifactId>
    <version>2.0</version>
</dependency>

Gradle (Groovy DSL; Kotlin DSL similar):

repositories {
    maven { url = uri('https://jitpack.io') }
}

dependencies {
    implementation 'com.github.firecrawl:firecrawl-java-sdk:2.0'
}

Legacy: v1 via JitPack

If you still need the legacy v1 package via JitPack, use the coordinates below and include the JitPack repository.

Maven:

<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>

<dependency>
    <groupId>com.github.mendableai</groupId>
    <artifactId>firecrawl-java-sdk</artifactId>
    <version>0.8</version>
</dependency>

Gradle:

repositories {
    maven { url = uri('https://jitpack.io') }
}

dependencies {
    implementation 'com.github.mendableai:firecrawl-java-sdk:0.8'
}

Building from Source

To build and install the SDK locally:

git clone https://github.com/firecrawl/firecrawl-java-sdk.git
cd firecrawl-java-sdk
mvn install
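
Running mvn install publishes the artifact to your local ~/.m2 repository, so another project on the same machine can depend on it without JitPack. The coordinates below mirror the v2 JitPack coordinates as an assumption; a local build uses whatever groupId, artifactId, and version are declared in this repository's pom.xml, so verify against that file:

```xml
<!-- Depend on the locally installed build. Coordinates are an assumption;
     check the <groupId>/<artifactId>/<version> declared in pom.xml. -->
<dependency>
    <groupId>com.github.firecrawl</groupId>
    <artifactId>firecrawl-java-sdk</artifactId>
    <version>2.0</version>
</dependency>
```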

Usage

Creating a Client

import dev.firecrawl.client.FirecrawlClient;
import dev.firecrawl.model.*;
import java.time.Duration;

// Create a client with default endpoint
FirecrawlClient client = new FirecrawlClient(
    "your-api-key",
    null,  // Uses default endpoint: https://api.firecrawl.dev
    Duration.ofSeconds(60)  // Request timeout
);

// Or specify a custom endpoint
FirecrawlClient client2 = new FirecrawlClient(
    "your-api-key",
    "https://custom-api-endpoint.example.com",
    Duration.ofSeconds(120)
);

// You can also set the API key via the FIRECRAWL_API_KEY environment variable
// Optionally set FIRECRAWL_API_URL to override the default endpoint
// Pass null timeout to use the default of 120 seconds
FirecrawlClient client3 = new FirecrawlClient(
    null,  // Will use FIRECRAWL_API_KEY environment variable
    null,  // Will use FIRECRAWL_API_URL if set, otherwise https://api.firecrawl.dev
    null   // Default timeout is 120 seconds when null
);
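
The null-fallback behavior above (explicit argument, else environment variable, else built-in default) can be sketched in plain Java. This is only an illustration of the resolution order, not the SDK's internal code, and resolveSetting is a hypothetical helper:

```java
// Sketch of the "explicit value, else environment variable, else default"
// resolution the client constructor applies to the API key and endpoint.
// resolveSetting is a hypothetical helper, not part of the SDK.
public class ConfigResolution {
    static String resolveSetting(String explicit, String envValue, String defaultValue) {
        if (explicit != null && !explicit.isBlank()) {
            return explicit;  // an explicit constructor argument wins
        }
        if (envValue != null && !envValue.isBlank()) {
            return envValue;  // e.g. FIRECRAWL_API_URL
        }
        return defaultValue;  // e.g. https://api.firecrawl.dev
    }

    public static void main(String[] args) {
        // Explicit argument wins over the environment variable and the default.
        System.out.println(resolveSetting("https://custom.example.com",
                "https://env.example.com", "https://api.firecrawl.dev"));
        // With no explicit value, the environment variable is used.
        System.out.println(resolveSetting(null,
                "https://env.example.com", "https://api.firecrawl.dev"));
        // With neither, the default endpoint applies.
        System.out.println(resolveSetting(null, null, "https://api.firecrawl.dev"));
    }
}
```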

Web Scraping

// Simple scraping
FirecrawlDocument doc = client.scrapeURL("https://example.com", null);
System.out.println(doc.getHtml());
System.out.println(doc.getText());

// Advanced scraping with options
ScrapeParams params = new ScrapeParams();
params.setOnlyMainContent(true);  // Extract only main content
params.setWaitFor(5000);          // Wait 5 seconds after page load
FirecrawlDocument doc2 = client.scrapeURL("https://example.com", params);

Search

Note: In v2, sources currently supports only "web". The SDK enforces this by normalizing sources to ["web"] whenever it is provided.

// Basic search
SearchParams params = new SearchParams("open source java sdk");
params.setLimit(10);
params.setLang("en");
SearchResponse resp = client.search(params);

// Process results
if (resp.isSuccess()) {
    for (SearchResult result : resp.getResults()) {
        System.out.println(result.getTitle() + " - " + result.getUrl());
    }
}

// Check for warnings
if (resp.getWarning() != null) {
    System.err.println("Warning: " + resp.getWarning());
}

Web Crawling

// Asynchronous crawling
String idempotencyKey = java.util.UUID.randomUUID().toString();
CrawlParams params = new CrawlParams();
CrawlResponse resp = client.asyncCrawlURL("https://example.com", params, idempotencyKey);
String jobId = resp.getId();

// Check crawl status
CrawlStatusResponse status = client.checkCrawlStatus(jobId);
System.out.println("Crawl status: " + status.getStatus());

// Synchronous crawling (with polling)
CrawlStatusResponse result = client.crawlURL("https://example.com", params, idempotencyKey, 5);
if ("completed".equals(result.getStatus())) {
    FirecrawlDocument[] documents = result.getData();
    // Process crawled documents
}

// Cancel a crawl job
CancelCrawlJobResponse cancelResp = client.cancelCrawlJob(jobId);
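
The synchronous crawlURL call above wraps the asynchronous flow: it submits the job, then polls the status at the given interval until the job reaches a terminal state. That poll-until-terminal pattern can be sketched in plain Java; the Supplier below stands in for client.checkCrawlStatus(jobId), and this is an illustration of the pattern, not the SDK's internal implementation:

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

// Sketch of poll-until-terminal: repeatedly fetch the job status and stop
// once it is "completed" or "failed", or the attempt budget runs out.
public class CrawlPolling {
    static String pollUntilTerminal(Supplier<String> fetchStatus,
                                    int maxAttempts,
                                    long intervalMillis) throws InterruptedException {
        String status = "unknown";
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            status = fetchStatus.get();  // stands in for client.checkCrawlStatus(jobId)
            if ("completed".equals(status) || "failed".equals(status)) {
                return status;
            }
            Thread.sleep(intervalMillis);  // pollInterval, e.g. 5 seconds in the example above
        }
        return status;  // still non-terminal after maxAttempts polls
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulated status sequence a crawl job might go through.
        Iterator<String> statuses = List.of("scraping", "scraping", "completed").iterator();
        String result = pollUntilTerminal(statuses::next, 10, 0L);
        System.out.println("Final status: " + result);
    }
}
```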

URL Mapping

MapParams params = new MapParams();
MapResponse resp = client.mapURL("https://example.com", params);
if (resp.isSuccess()) {
    String[] links = resp.getLinks();
    // Process links
}

API Documentation

For detailed API documentation, visit https://firecrawl.dev/docs.

License

This SDK is available under the MIT License. See the LICENSE file for more information.