fb-ad-archive-scraper

Scraper for Facebook's Archive of Ads with Political Content ... until Facebook provides an API.

fb-ad-archive-scraper will produce:

- A CSV file containing the text and metadata of the ads.
- Screenshots of each ad.
- A README file.

Like any scraper, fb-ad-archive-scraper is fragile. It will break if Facebook changes the structure / code of the Archive. If fb-ad-archive-scraper breaks, let me know.

Tickets / PRs are welcome.

Install

Clone the repo:

    git clone https://github.com/justinlittman/fb-ad-archive-scraper.git

Change to the directory:
```
 cd fb-ad-archive-scraper
```

Optionally, create a virtual environment:

    virtualenv -p python3 ENV
    source ENV/bin/activate

Install requirements:
```
 pip install -r requirements.txt
```
Install Chromedriver. On a Mac, this is:
```
 brew install --cask chromedriver
```
If already installed, upgrade Chromedriver with:
```
 brew upgrade --cask chromedriver
```

Usage

    usage: scraper.py [-h] [--limit LIMIT] [--headed]
                      email password query [query ...]
    
    Scrape Facebook's Archive of Ads with Political Content
    
    positional arguments:
      email          Email address for FB account
      password       Password for FB account
      query          Query
    
    optional arguments:
      -h, --help     show this help message and exit
      --limit LIMIT  Limit on number of ads to scrape
      --headed       Use a headed chrome browser

For example:

    python scraper.py fbuser@gmail.com password pelosi
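
The command-line interface described in the usage text can be sketched with `argparse`. This is a reconstruction from the help output only, not the code in `scraper.py`; the function name `build_parser` is hypothetical, and treating `--limit` as an integer is an assumption.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the usage text (a reconstruction, not the project's code)."""
    parser = argparse.ArgumentParser(
        prog="scraper.py",
        description="Scrape Facebook's Archive of Ads with Political Content",
    )
    # Three positional arguments; `query` accepts one or more terms.
    parser.add_argument("email", help="Email address for FB account")
    parser.add_argument("password", help="Password for FB account")
    parser.add_argument("query", nargs="+", help="Query")
    # Assumption: the limit is parsed as an integer and defaults to no limit.
    parser.add_argument("--limit", type=int, help="Limit on number of ads to scrape")
    # Without this flag the browser runs headless.
    parser.add_argument("--headed", action="store_true", help="Use a headed chrome browser")
    return parser


if __name__ == "__main__":
    # Parse the example command line shown above, plus a limit.
    args = build_parser().parse_args(
        ["fbuser@gmail.com", "password", "pelosi", "--limit", "10"]
    )
    print(args.query, args.limit, args.headed)
```

Because `query` takes one or more values, several search terms can be passed in a single run, for example `python scraper.py fbuser@gmail.com password pelosi mcconnell`.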

Notes:

- fb-ad-archive-scraper uses a headless Chrome browser. This means that you will not see the browser at work.
- The output of each run will be placed in a separate directory and include a README, CSV file, and PNG images.
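
As a quick check after a run, the output directory can be inspected with a few lines of Python. This is a minimal sketch: the helper name `summarize_run` is hypothetical, and it assumes only that the directory holds files ending in `.csv` and `.png` plus a file whose name starts with `README`. It makes no assumption about the CSV column names.

```python
import csv
from pathlib import Path


def summarize_run(run_dir):
    """Count CSV data rows and PNG screenshots in one run's output directory."""
    summary = {"csv_rows": 0, "screenshots": 0, "has_readme": False}
    for path in sorted(Path(run_dir).iterdir()):
        suffix = path.suffix.lower()
        if suffix == ".csv":
            # DictReader consumes the header row, so only data rows are counted.
            with path.open(newline="", encoding="utf-8") as handle:
                summary["csv_rows"] += sum(1 for _ in csv.DictReader(handle))
        elif suffix == ".png":
            summary["screenshots"] += 1
        elif path.name.upper().startswith("README"):
            summary["has_readme"] = True
    return summary
```

Calling `summarize_run("path/to/run-directory")` returns a small dictionary, which makes it easy to compare the number of CSV rows against the number of screenshots.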

The approach of extracting data from XHRs came from Ranjit Hatnagar.
