Yandex Search Scraper

This repository provides two reliable solutions for extracting data from Yandex search engine results pages (SERPs):

Free Yandex Scraper: a basic tool for scraping Yandex search results at small scale
Enterprise-grade Yandex SERP API: a scalable, production-ready solution for high-volume, real-time data extraction (part of Bright Data's SERP Scraper API)

Free Yandex SERP Scraper

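The free scraper offers a straightforward way to collect Yandex SERP data at small scale. It is best suited to developers who need a limited amount of data for personal projects, research, or testing.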

Setup Requirements

Python 3.9+
Required packages:
- playwright for browser automation
- BeautifulSoup for HTML parsing

pip install playwright beautifulsoup4
playwright install

New to web scraping? See our Beginner's Guide to Web Scraping with Python.

Quick Start Guide

Open yandex-search-results-scraper.py
Customize the search terms and the number of pages to scrape for each term:

PAGES_PER_TERM = {
    "ergonomic office chair": 2,
}

Run the script
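
The script itself is not reproduced in this README, so the following is only a sketch of how a mapping like `PAGES_PER_TERM` could be expanded into one search URL per page. The `build_search_urls` helper and `BASE_URL` constant are hypothetical names, not taken from yandex-search-results-scraper.py, and the sketch assumes the zero-based `p` parameter described under Pagination.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.yandex.com/search/"

PAGES_PER_TERM = {
    "ergonomic office chair": 2,
}


def build_search_urls(pages_per_term):
    """Yield one Yandex search URL per requested page of each term."""
    for term, page_count in pages_per_term.items():
        for page in range(page_count):
            # p is zero-based: p=0 is the first results page.
            yield f"{BASE_URL}?{urlencode({'text': term, 'p': page})}"


for url in build_search_urls(PAGES_PER_TERM):
    print(url)
```

Each generated URL would then be loaded by the browser automation layer and parsed for results.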

Sample Output

Limitations

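One of the biggest challenges when scraping Yandex is its aggressive CAPTCHA protection:

Yandex uses a strict, continuously evolving anti-bot system to prevent automated data extraction. Frequent CAPTCHAs quickly lead to IP blocks, which makes it difficult to maintain a stable, long-running scraper.

The free scraper handles basic tasks, but it has several important limitations:

High risk of IP blocks
Limited request volume
Constant interruptions from CAPTCHAs
Not suitable for production environments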
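
To make these limitations concrete, here is a minimal sketch of how a small scraper might notice a CAPTCHA interstitial and slow down before retrying. It rests on an assumption that is not documented in this README: that a blocked request ends up on a URL containing `showcaptcha`. Both helper names are hypothetical, and backing off does not prevent IP blocks; it only reduces how quickly they are triggered.

```python
def looks_like_captcha(final_url, markers=("showcaptcha",)):
    """Heuristically detect a CAPTCHA redirect from the final page URL.

    Assumption: blocked requests land on a URL containing one of the
    markers. The default marker is a guess and may need updating.
    """
    lowered = final_url.lower()
    return any(marker in lowered for marker in markers)


def backoff_delays(max_retries, base_seconds=5.0):
    """Return exponentially growing wait times (in seconds) for retries."""
    return [base_seconds * (2 ** attempt) for attempt in range(max_retries)]


if looks_like_captcha("https://yandex.com/showcaptcha?retpath=example"):
    print("CAPTCHA detected; suggested waits:", backoff_delays(3))
```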

For a scalable and stable solution, consider Bright Data's dedicated API, described in detail below. 👇

Yandex SERP Scraper API

The Yandex Search API is part of Bright Data's SERP Scraping API suite. It uses Bright Data's proxy infrastructure to deliver real-time Yandex search results with a single API call.

Key Benefits

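Global accuracy: get results tailored to specific locations worldwide
Pay-per-success: you are charged only for successful requests
Real-time data: access up-to-date search results within seconds
Unlimited scalability: handle high-volume scraping with ease
Cost efficiency: no need for expensive infrastructure
Reliable performance: built-in anti-blocking technology
24/7 expert support: technical assistance whenever you need it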

📌 Try Before You Buy: Test it for free in our SERP API Live Demo

Getting Started

Create a Bright Data account (new users receive a $5 credit)
Generate your API key
Follow the step-by-step guide to configure the SERP API

Implementation Methods

Direct API Access

The simplest way to use the API is to send requests directly to Bright Data's API endpoint.

cURL Example:

curl https://api.brightdata.com/request \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer API_TOKEN" \
  -d '{
        "zone": "ZONE_NAME",
        "url": "https://www.yandex.com/search/?text=apple+watch+series+10+review&lr=95&lang=en",
        "format": "raw"
      }'

Python Example:

import requests

url = "https://api.brightdata.com/request"

# Replace API_TOKEN and ZONE_NAME with your own credentials.
headers = {"Content-Type": "application/json", "Authorization": "Bearer API_TOKEN"}

payload = {
    "zone": "ZONE_NAME",
    "url": "https://www.yandex.com/search/?text=apple+watch+series+10+review&lr=95&lang=en",
    "format": "raw",
}

response = requests.post(url, headers=headers, json=payload)
response.raise_for_status()  # fail early instead of saving an error page

# Save the raw SERP HTML returned by the API.
with open("yandex-scraper-api-result.html", "w", encoding="utf-8") as file:
    file.write(response.text)

print("Response saved!")

Native Proxy-Based Access

This alternative method uses proxy routing to access search results directly.

cURL Example:

curl -i \
  --proxy brd.superproxy.io:33335 \
  --proxy-user brd-customer-<CUSTOMER_ID>-zone-<ZONE_NAME>:<ZONE_PASSWORD> \
  -k \
  "https://www.yandex.com/search/?text=apple+watch+series+10+review&lr=95&lang=en"

Python Example:

import requests
import urllib3

# Suppress the warning that verify=False would otherwise trigger.
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

# Replace the placeholders with your own customer ID, zone name, and zone password.
host = "brd.superproxy.io"
port = 33335
username = "brd-customer-<customer_id>-zone-<zone_name>"
password = "<zone_password>"
proxy_url = f"http://{username}:{password}@{host}:{port}"

proxies = {"http": proxy_url, "https": proxy_url}

url = "https://www.yandex.com/search/?text=apple+watch+series+10+review&lr=95&lang=en"
response = requests.get(url, proxies=proxies, verify=False)
response.raise_for_status()  # fail early instead of saving an error page

# Save the raw SERP HTML returned through the proxy.
with open("yandex-scraper-api-result.html", "w", encoding="utf-8") as file:
    file.write(response.text)

print("Response saved!")

Note: When using the native proxy method, we recommend installing Bright Data's SSL certificate for production use. See the SSL Certificate Guide for details.
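
As a sketch of what that could look like with `requests`, the `verify` argument accepts a path to a CA certificate file in place of `verify=False`. The certificate path below is a placeholder and `proxy_request_kwargs` is a hypothetical helper; the SSL Certificate Guide remains the reference for obtaining and installing the actual certificate.

```python
def proxy_request_kwargs(proxy_url, ca_cert_path=None):
    """Build keyword arguments for requests.get() when routing through a proxy.

    ca_cert_path is a placeholder for the location of the downloaded CA
    certificate. When it is omitted, certificate verification is disabled,
    which is only reasonable for quick tests.
    """
    return {
        "proxies": {"http": proxy_url, "https": proxy_url},
        "verify": ca_cert_path if ca_cert_path else False,
    }


kwargs = proxy_request_kwargs(
    "http://<username>:<password>@brd.superproxy.io:33335",
    ca_cert_path="/path/to/ca.crt",  # hypothetical path
)
# The result can be passed straight through: requests.get(url, **kwargs)
print(kwargs["verify"])
```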

👉 See the full HTML output

Query parameters such as lr and lang are explained in the next section.

Yandex Search Query Parameters

Localization

Region (`lr`)

This parameter defines the geographic region or country that the search results target.

Region	Code
Moscow	1
Saint-Petersburg	2
USA	84
Canada	95
China	134

Example - See how "best wireless earbuds" ranks in the USA:

curl --proxy brd.superproxy.io:33335 \
     --proxy-user brd-customer-<id>-zone-<zone>:<password> \
     "https://www.yandex.com/search/?text=best+wireless+earbuds&lr=84"

Language (`lang`)

Specify the language preference with a two-letter language code:

lang=en - English
lang=es - Spanish
lang=fr - French

Example - Retrieve sports news in Spanish:

https://www.yandex.com/search/?text=local+sports+news&lang=es

Pagination

Page Number (`p`)

Controls which results page is returned. The value is zero-based:

p=0 - first page (default)
p=1 - second page
p=4 - fifth page

Each Yandex SERP page typically returns 10 results.

Example - Scrape the third page (results 21-30) for "nike running shoes":

https://www.yandex.com/search/?text=nike+running+shoes&p=2
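
Because `p` is zero-based, it is easy to be off by one when converting between result positions and page numbers. The sketch below shows the arithmetic, assuming the typical 10 results per page mentioned above (actual pages may return more or fewer). Both function names are hypothetical.

```python
RESULTS_PER_PAGE = 10  # typical value; not guaranteed for every query


def page_param_for_rank(rank):
    """Return the zero-based p value of the page containing a 1-based rank."""
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    return (rank - 1) // RESULTS_PER_PAGE


def rank_range_for_page(p):
    """Return the (first, last) 1-based result positions shown on page p."""
    if p < 0:
        raise ValueError("p must be 0 or greater")
    first = p * RESULTS_PER_PAGE + 1
    return first, first + RESULTS_PER_PAGE - 1


print(page_param_for_rank(21))  # 2
print(rank_range_for_page(2))   # (21, 30)
```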

Time Range

Time Period (`within`)

Restricts results to a specific time period:

within=77 - results from the past 24 hours
within=1 - results from the past 2 weeks
within=[%pm] - results from the past month

Example - Get "iPhone 15 review" results from the past 24 hours:

https://www.yandex.com/search/?text=iphone+15+review&within=77

Device Targeting

Device Type (`brd_mobile`)

Specifies the device type to simulate:

brd_mobile=0 or omitted - random desktop user-agent
brd_mobile=1 - random mobile user-agent
brd_mobile=ios or brd_mobile=iphone - iPhone user-agent
brd_mobile=ipad or brd_mobile=ios_tablet - iPad user-agent
brd_mobile=android - Android phone user-agent
brd_mobile=android_tablet - Android tablet user-agent

Example - Simulate a search for responsive website testing from an iPhone:

https://www.yandex.com/search/?text=responsive+website+testing&brd_mobile=ios

Browser Type (`brd_browser`)

Defines the browser to simulate:

Default (omitted) - random browser
brd_browser=chrome - Google Chrome
brd_browser=safari - Safari
brd_browser=firefox - Mozilla Firefox

Example - Simulate a Safari browser searching for Python tutorials:

https://www.yandex.com/search/?text=how+to+learn+python&brd_browser=safari

Note: brd_browser=firefox and brd_mobile=1 are incompatible; do not combine them.
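
One way to avoid sending an unsupported combination is to validate the two parameters before building the URL. The following is a sketch with a hypothetical `device_params` helper. It accepts only the values listed in this section and rejects only the one combination that the note above calls incompatible.

```python
MOBILE_VALUES = {
    "0", "1", "ios", "iphone", "ipad", "ios_tablet", "android", "android_tablet",
}
BROWSER_VALUES = {"chrome", "safari", "firefox"}


def device_params(mobile=None, browser=None):
    """Return validated brd_mobile / brd_browser query parameters."""
    params = {}
    if mobile is not None:
        mobile = str(mobile)
        if mobile not in MOBILE_VALUES:
            raise ValueError(f"Unsupported brd_mobile value: {mobile!r}")
        params["brd_mobile"] = mobile
    if browser is not None:
        if browser not in BROWSER_VALUES:
            raise ValueError(f"Unsupported brd_browser value: {browser!r}")
        params["brd_browser"] = browser
    # The one combination documented as incompatible.
    if params.get("brd_mobile") == "1" and params.get("brd_browser") == "firefox":
        raise ValueError("brd_browser=firefox cannot be combined with brd_mobile=1")
    return params


print(device_params(mobile="ios", browser="safari"))
```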

Practical Example

You can combine multiple parameters for precise targeting:

https://www.yandex.com/search/?text=organic+skincare+products
&lr=95
&lang=en
&p=2
&within=1
&brd_mobile=ios
&brd_browser=safari

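This search does the following:

Targets users in Canada (lr=95)
Returns results in English (lang=en)
Returns the third page of results (p=2, since p is zero-based)
Limits results to the past 2 weeks (within=1)
Simulates an iPhone user (brd_mobile=ios)
Uses the Safari browser (brd_browser=safari)

This combination suits a skincare company that wants to research recent organic skincare product trends in the Canadian market from the perspective of iOS mobile users.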
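
The combined URL above does not have to be assembled by hand. As a sketch, the standard library's `urllib.parse.urlencode` can build it from keyword arguments and handles the escaping of the search text. `build_yandex_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode


def build_yandex_url(text, **params):
    """Build a Yandex search URL, skipping parameters set to None."""
    query = {"text": text}
    query.update({key: value for key, value in params.items() if value is not None})
    return "https://www.yandex.com/search/?" + urlencode(query)


url = build_yandex_url(
    "organic skincare products",
    lr=95,
    lang="en",
    p=2,
    within=1,
    brd_mobile="ios",
    brd_browser="safari",
)
print(url)
```

The resulting string can be used as the `url` field of the Direct API payload or requested through the proxy.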

Support & Resources

Documentation: SERP API Documentation
Related APIs:
Use Cases:
- SEO & SERP Tracking
- Travel Industry Data
Additional Reading: Best SERP APIs
Contact Support: support@brightdata.com
