Web Unlocker API

Web Unlocker is a powerful scraping API that lets you access any website while bypassing sophisticated bot protection. A single API call returns a clean HTML/JSON response, with no complex anti-bot infrastructure to manage.

API Endpoint: https://api.brightdata.com/request
Authorization Header: the API token of your Web Unlocker API zone
Payload:
- zone: the name of your Web Unlocker API zone
- url: the target URL to access
- format: the response format (use raw to return the site's response directly)

Example: Python Script

import requests

API_URL = "https://api.brightdata.com/request"
API_TOKEN = "INSERT_YOUR_API_TOKEN"
ZONE_NAME = "INSERT_YOUR_WEB_UNLOCKER_ZONE_NAME"
TARGET_URL = "http://lumtest.com/myip.json"

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_TOKEN}"
}

payload = {
    "zone": ZONE_NAME,
    "url": TARGET_URL,
    "format": "raw"
}

response = requests.post(API_URL, headers=headers, json=payload)

if response.status_code == 200:
    print("Success:", response.text)
else:
    print(f"Error {response.status_code}: {response.text}")

Native Proxy-based Access

An alternative method that uses proxy-based routing.

Example: cURL Command

curl "http://lumtest.com/myip.json" \
--proxy "brd.superproxy.io:33335" \
--proxy-user "brd-customer-<CUSTOMER_ID>-zone-<ZONE_NAME>:<ZONE_PASSWORD>"

Required credentials:

Customer ID: found in Account settings
Web Unlocker API zone name: found in the overview tab
Web Unlocker API password: found in the overview tab

Example: Python Script

import requests

customer_id = "<customer_id>"
zone_name = "<zone_name>"
zone_password = "<zone_password>"

host = "brd.superproxy.io"
port = 33335
proxy_url = f"http://brd-customer-{customer_id}-zone-{zone_name}:{zone_password}@{host}:{port}"

proxies = {"http": proxy_url, "https": proxy_url}

response = requests.get("http://lumtest.com/myip.json", proxies=proxies)

if response.status_code == 200:
    print(response.json())
else:
    print(f"Error: {response.status_code}")

Practical Example: Scraping G2 Reviews

This section shows how to scrape reviews from G2.com, a site that is heavily protected by Cloudflare.

Basic Request (Without Web Unlocker)

Scrape G2 reviews with a simple Python script:

import requests
from bs4 import BeautifulSoup

url = 'https://www.g2.com/products/mongodb/reviews'
response = requests.get(url)

if response.status_code == 200:
    soup = BeautifulSoup(response.text, "lxml")
    headings = soup.find_all('h2')
    
    if headings:
        print("\nHeadings Found:")
        for heading in headings:
            print(f"- {heading.get_text(strip=True)}")
    else:
        print("No headings found")
else:
    print("Request blocked")

Result: The script fails with a 403 error because of Cloudflare's anti-bot measures.

Enhanced Request (With Web Unlocker)

To get past such restrictions, use Web Unlocker. The Python implementations below show both access methods:

Direct API Access

import requests
from bs4 import BeautifulSoup

API_URL = "https://api.brightdata.com/request"
API_TOKEN = "INSERT_YOUR_API_TOKEN"
ZONE_NAME = "INSERT_YOUR_ZONE"
TARGET_URL = "https://www.g2.com/products/mongodb/reviews"

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_TOKEN}"
}
payload = {"zone": ZONE_NAME, "url": TARGET_URL, "format": "raw"}

response = requests.post(API_URL, headers=headers, json=payload)

if response.status_code == 200:
    soup = BeautifulSoup(response.text, "lxml")
    headings = [h.get_text(strip=True) for h in soup.find_all('h2')]
    print("\nExtracted Headings:", headings)
else:
    print(f"Error {response.status_code}: {response.text}")

Result: The protection is bypassed and the content is returned with status 200.
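
The direct API call can also be wrapped in a small reusable function. The following is a minimal sketch, not an official client: the function name fetch_via_unlocker, the 60-second default timeout, and the optional session parameter are assumptions made for illustration. The endpoint, payload fields, and Bearer header are the ones used in this README.

```python
import requests

API_URL = "https://api.brightdata.com/request"


def fetch_via_unlocker(target_url, api_token, zone_name, session=None, timeout=60):
    """Return the raw body of target_url fetched through the Web Unlocker API.

    Hypothetical helper: the name, the timeout default, and the session
    parameter are illustrative assumptions, not part of the Bright Data API.
    """
    http = session or requests  # a requests.Session, or the module itself
    response = http.post(
        API_URL,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_token}",
        },
        json={"zone": zone_name, "url": target_url, "format": "raw"},
        timeout=timeout,  # avoid hanging forever on a slow target
    )
    response.raise_for_status()  # raises requests.HTTPError on 4xx/5xx
    return response.text
```

Calling fetch_via_unlocker("https://www.g2.com/products/mongodb/reviews", API_TOKEN, ZONE_NAME) would return the HTML that the script above passes to BeautifulSoup. Raising on error instead of printing lets the caller decide how to retry or log.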

Proxy-Based Access

Alternatively, use the proxy-based method:

import requests
from bs4 import BeautifulSoup

proxy_url = "http://brd-customer-<customer_id>-zone-<zone_name>:<zone_password>@brd.superproxy.io:33335"
proxies = {"http": proxy_url, "https": proxy_url}

url = "https://www.g2.com/products/mongodb/reviews"
response = requests.get(url, proxies=proxies, verify=False)

if response.status_code == 200:
    soup = BeautifulSoup(response.text, "lxml")
    headings = [h.get_text(strip=True) for h in soup.find_all('h2')]
    print("\nExtracted Headings:", headings)
else:
    print(f"Error {response.status_code}: {response.text}")

Note: Because this example passes verify=False, requests emits an InsecureRequestWarning on each call. Add the following to suppress it:

import urllib3
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

Waiting for Specific Elements

Use the x-unblock-expect header to wait for a specific element or text:

headers["x-unblock-expect"] = '{"element": ".star-wrapper__desc"}'
# or
headers["x-unblock-expect"] = '{"text": "reviews"}'
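
Hand-written JSON strings are easy to get wrong once a selector contains quotes, so the header value can be built with json.dumps instead. This is a minimal sketch: build_expect_header is a hypothetical helper name, and it assumes one condition per request, as in the two forms shown above.

```python
import json


def build_expect_header(element=None, text=None):
    """Return the JSON string for the x-unblock-expect header.

    Hypothetical helper. Exactly one of element (a CSS selector) or
    text must be given, mirroring the two forms shown in this README.
    """
    if (element is None) == (text is None):
        raise ValueError("pass exactly one of element= or text=")
    key, value = ("element", element) if element is not None else ("text", text)
    return json.dumps({key: value})


headers = {"x-unblock-expect": build_expect_header(element=".star-wrapper__desc")}
print(headers["x-unblock-expect"])  # {"element": ".star-wrapper__desc"}
```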

👉 The full code is available in g2_wait.py

Mobile User-Agent Targeting

To use a mobile user agent instead of a desktop one, append -ua-mobile to the username:

username = f"brd-customer-{customer_id}-zone-{zone_name}-ua-mobile"
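
To show where this username ends up, here is a minimal sketch that turns it into the proxies dict used by the proxy-based examples. build_proxies is a hypothetical helper, and percent-encoding the password is an added precaution rather than something this README requires.

```python
from urllib.parse import quote

HOST = "brd.superproxy.io"
PORT = 33335


def build_proxies(customer_id, zone_name, zone_password, mobile=False):
    """Return a requests-style proxies dict for the Web Unlocker proxy.

    Hypothetical helper; mobile=True appends the -ua-mobile flag.
    """
    username = f"brd-customer-{customer_id}-zone-{zone_name}"
    if mobile:
        username += "-ua-mobile"
    # Percent-encode the password so characters such as '@' or ':'
    # cannot break the proxy URL.
    password = quote(zone_password, safe="")
    proxy_url = f"http://{username}:{password}@{HOST}:{PORT}"
    return {"http": proxy_url, "https": proxy_url}


proxies = build_proxies("<customer_id>", "<zone_name>", "<zone_password>", mobile=True)
```

The resulting dict can be passed as proxies=proxies to requests.get, exactly as in the Proxy-Based Access example.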

👉 The full code is available in g2_mobile.py

Geolocation Targeting

Web Unlocker automatically selects the best IP location, but you can also specify a target location:

username = f"brd-customer-{customer_id}-zone-{zone_name}-country-us"
username = f"brd-customer-{customer_id}-zone-{zone_name}-country-us-city-sanfrancisco"
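
The two username patterns above can be generated by one function. This is a sketch under stated assumptions: geo_username is a hypothetical helper, the rule that a city needs a country is inferred from the example above, and the city normalization (lowercase, spaces removed) is inferred from "sanfrancisco".

```python
def geo_username(customer_id, zone_name, country=None, city=None):
    """Build a proxy username with optional -country and -city flags.

    Hypothetical helper; the validation and normalization rules are
    assumptions based on the examples in this README.
    """
    if city and not country:
        raise ValueError("city targeting requires a country code")
    username = f"brd-customer-{customer_id}-zone-{zone_name}"
    if country:
        username += f"-country-{country.lower()}"
    if city:
        # Assumption: lowercase with whitespace removed, as in "sanfrancisco".
        username += f"-city-{''.join(city.lower().split())}"
    return username


print(geo_username("<customer_id>", "<zone_name>", country="US", city="San Francisco"))
# brd-customer-<customer_id>-zone-<zone_name>-country-us-city-sanfrancisco
```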

👉 More details are available here.

Debugging Requests

Append the -debug-full flag to the username to enable detailed debugging information:

username = f"brd-customer-{customer_id}-zone-{zone_name}-debug-full"

👉 The full code is available in g2_debug.py

Success Rate Statistics

Monitor the API success rate for a specific domain:

import requests

API_TOKEN = "INSERT_YOUR_API_TOKEN"

def get_success_rate(domain):
    url = f"https://api.brightdata.com/unblocker/success_rate/{domain}"
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_TOKEN}"
    }
    response = requests.get(url, headers=headers)
    print(response.json() if response.status_code == 200 else response.text)

get_success_rate("g2.com") # Get statistics for specific domain
get_success_rate("g2.*") # Get statistics for all top-level domains

Final Notes

With Web Unlocker, even heavily protected websites become straightforward to scrape. Keep the following key points in mind:

Not Compatible With:
- Browsers (Chrome, Firefox, Edge)
- Anti-detect browsers (Adspower, Multilogin)
- Automation tools (Puppeteer, Playwright, Selenium)
Use Scraping Browser:
For browser-based automation, use Bright Data's Scraping Browser.
Premium Domains:
The premium domain feature gives you access to especially difficult sites.
CAPTCHA Solving:
CAPTCHAs are solved automatically, but this can be disabled. See also Bright Data's CAPTCHA Solver.
Custom Headers & Cookies:
You can send your own headers and cookies to target a specific version of a site. Learn more.

See the official documentation for more details.
