the_watchdogs

Critical to a functioning democracy, the job of the free press is to hold the government accountable to those it governs, a role commonly described as that of a watchdog. However, as modes of media consumption evolve and American citizens become as polarized as ever, trust in national news media is declining. Coverage of the January 6th insurrection at the Capitol and the events that followed made this issue glaring; there is disagreement even over the use of the word “insurrection” itself. Through web scraping, we gather articles that discuss the attack on the Capitol, the January 6th House Committee, and the trials of rioters from two of the most visited national news websites: CNN and FOX News. We then use token analysis to inspect the language used to describe this polarizing topic and compare it across media sources and over time. Finally, a data visualization component lets users further examine our data by isolating variables, time periods, and topics.

Getting Started with the Virtual Environment

  1. Clone this repository.
  2. From the root directory, the_watchdogs, run poetry install.
  3. Run poetry shell.
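
A typical setup session might look like the following (the clone URL assumes the rok12003/30122-project-the_watchdogs GitHub path; the cloned folder name may differ from the the_watchdogs root referenced above):

$ git clone https://github.com/rok12003/30122-project-the_watchdogs.git

$ cd 30122-project-the_watchdogs

$ poetry install

$ poetry shell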

Part 1: Gathering the Data

In order to analyze coverage of the January 6th insurrection at the Capitol, article data from CNN and FOX must be gathered through web scraping and/or an API. This process can take several minutes to run, so we have saved the resulting JSON files in the data directory.

If you would like to run the scraper yourself, the code can be found in each source's respective directory: the_watchdogs/cnn/scrape_cnn.py and the_watchdogs/fox/scrape_fox.py. Each source can be scraped individually from the command line by running the following:

$ python3 -m the_watchdogs.cnn.scrape_cnn

$ python3 -m the_watchdogs.fox.scrape_fox

or all at once:

$ python3 -m the_watchdogs.scrape_sources
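
Either way, the scraped output lands in the_watchdogs/data/ as JSON. A minimal sketch for sanity-checking those files (it only assumes they are valid JSON; the exact record fields are whatever the scrapers write):

```python
import json

# Paths from the commands in Part 2; adjust if the scrapers write elsewhere.
for path in ("the_watchdogs/data/cnn_articles.json",
             "the_watchdogs/data/fox_articles.json"):
    with open(path) as f:
        articles = json.load(f)
    print(f"{path}: {len(articles)} top-level records")
```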

Part 2: Token and Sentiment Analysis

To transform the raw data scraped from articles on FOX and CNN into a usable, cleaned format, run the following:

$ python3 the_watchdogs/preprocess.py the_watchdogs/data/fox_articles.json

$ python3 the_watchdogs/preprocess.py the_watchdogs/data/cnn_articles.json

This creates a dataframe of cleaned data for each news source, saved in the data folder inside the the_watchdogs directory.
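
For reference, the kind of token cleaning this step performs typically looks like the sketch below (illustrative only; the actual preprocess.py pipeline, e.g. stopword removal or lemmatization, may differ):

```python
import re

def clean_tokens(text: str) -> list[str]:
    """Lowercase article text, strip punctuation/digits, and split into tokens.

    Illustrative sketch only -- not the repo's preprocess.py.
    """
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # keep only letters and whitespace
    return text.split()

print(clean_tokens("Rioters breached the Capitol on Jan. 6, 2021."))
# -> ['rioters', 'breached', 'the', 'capitol', 'on', 'jan']
```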

Part 3: Data Visualization

To visualize the analyzed data, please run the following command:

$ python3 -m the_watchdogs.data_viz.plot

This will start the Flask app on port 7997, where you will be able to see three plots:

  1. Two word clouds, one with CNN data, and one with FOX data.
  2. A line graph showing the number of articles by source, with a toggle for the year.
  3. A bar graph showing the sentiments (5 categories) by news source.
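
The serving pattern behind this command is a standard Flask app bound to port 7997. A minimal sketch of that pattern (not the repo's actual data_viz/plot.py, which also renders the plots listed above):

```python
from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    # The real module renders the word clouds, line graph, and bar graph here.
    return "<h1>the_watchdogs visualizations</h1>"

if __name__ == "__main__":
    app.run(port=7997)  # port noted in the README
```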
