Skip to content

Latest commit

 

History

History
138 lines (99 loc) · 5.27 KB

README.md

File metadata and controls

138 lines (99 loc) · 5.27 KB

How to Solve Cloudflare with Puppeteer

bd5e19839e4f1cdce486977ccbd5b988

Step-by-Step Guide to Using Puppeteer to Solve Cloudflare

Step 1: Setting Up Puppeteer

Puppeteer is a Node.js library that provides a high-level API for controlling Chrome or Chromium. It is commonly used for automation, testing, and web scraping.

First, install Puppeteer using npm:

npm install puppeteer

Next, you can create a script to launch a browser and navigate to a Cloudflare-protected site:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();
  await page.goto('https://example.com'); // Replace with your target URL
  await page.screenshot({ path: 'before-cf.png' });

  // Additional steps to handle Cloudflare's protections will follow

  await browser.close();
})();

This script launches a browser, navigates to a URL, and captures a screenshot. However, visiting a Cloudflare-protected site may trigger security checks. You will need additional steps to handle these.

Step 2: Handling Cloudflare’s JavaScript Challenges

Cloudflare often uses JavaScript challenges to ensure the request comes from a legitimate browser. These challenges typically take a few seconds to complete, and Puppeteer can handle them by waiting for the scripts to execute:

await page.waitForTimeout(10000); // Wait 10 seconds for Cloudflare's verification
await page.screenshot({ path: 'after-cf.png' });

While this works for basic challenges, advanced protections like CAPTCHAs require a more powerful solution. This is where CapSolver comes in.

CapSolver Integration: Enhancing Puppeteer to Bypass Cloudflare

CapSolver is a service that automatically solves CAPTCHAs and other challenges, making it ideal for bypassing advanced Cloudflare defenses. By integrating CapSolver with Puppeteer, you can automate CAPTCHA resolution and continue running your scripts without interruption.

Here’s how to integrate CapSolver with Puppeteer:

const puppeteer = require('puppeteer');
const axios = require('axios');

const clientKey = 'your-client-key-here'; // Replace with your CapSolver client key
const websiteURL = 'https://example.com'; // Replace with your target website URL
const websiteKey = 'your-website-key-here'; // Replace with the website key from CapSolver

async function createTask() {
  const response = await axios.post('https://api.capsolver.com/createTask', {
    clientKey: clientKey,
    task: {
      type: "AntiTurnstileTaskProxyLess",
      websiteURL: websiteURL,
      websiteKey: websiteKey
    }
  }, {
    headers: {
      'Content-Type': 'application/json',
      'Pragma': 'no-cache'
    }
  });

  return response.data.taskId;
}

async function getTaskResult(taskId) {
  let response;

  while (true) {
    response = await axios.post('https://api.capsolver.com/getTaskResult', {
      clientKey: clientKey,
      taskId: taskId
    }, {
      headers: {
        'Content-Type': 'application/json'
      }
    });

    if (response.data.status === 'ready') {
      return response.data.solution;
    }

    console.log('Status not ready, checking again in 5 seconds...');
    await new Promise(resolve => setTimeout(resolve, 5000));
  }
}

(async () => {
  const taskId = await createTask();
  const result = await getTaskResult(taskId);
  let solution = result.token;

  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();
  await page.goto(websiteURL);
  await page.waitForSelector('input[name="cf-turnstile-response"]');
  await page.evaluate(solution => {
    document.querySelector('input[name="cf-turnstile-response"]').value = solution;
  }, solution);
  await page.screenshot({ path: 'example.png' });
})();

How It Works:

  • createTask(): Sends a request to CapSolver to solve the CAPTCHA on the website.
  • getTaskResult(): Continuously checks the status of the task until CapSolver provides a solution.
  • The Puppeteer script then uses this solution to bypass the CAPTCHA and continue the automation.

With CapSolver, Puppeteer can bypass Cloudflare’s advanced protections and allow your script to continue seamlessly.

Conclusion

Navigating Cloudflare's security measures is a common challenge for developers working on automation and web scraping. While Puppeteer can handle basic challenges, integrating CapSolver enables you to bypass advanced defenses like CAPTCHAs.

If you're looking to streamline your automation tasks, get started with CapSolver today and take advantage of our services using the bonus code WEBS for extra value.


Key Improvements:

  • Added hyperlinks for CapSolver, Cloudflare, and Puppeteer for better navigation.
  • Provided clear instructions and code comments.
  • Adapted the format to GitHub with clear section titles and code blocks.