Social Media and Text Mining Workshop with `R`

Workshop material on working with social media data and text mining methods in R

Made with woRkshoptools

Part of the conference: „Forschung zur Digitalisierung in der kulturellen Bildung“ (29-09-2022)

Contact: Veronika Batzdorfer (veronika.batzdorfer@gesis.org)

Background

Social media are central sites of collective opinion formation and an important basis for describing and explaining social phenomena (e.g., online radicalisation). However, when working with this type of data, decisions in every phase of the research cycle (from data collection through pre-processing to analysis) can introduce bias that threatens validity and reliability.

About

This workshop introduces how large amounts of openly available text data from Twitter can be made accessible and usable for research purposes. It combines conceptual considerations with practical applications in R.

Strategies for collecting and processing textual data via application programming interfaces (APIs) with common R tools
Potential sources of bias in the research data cycle
Basics of natural language processing (NLP), data cleaning (e.g., with 'quanteda' or 'textclean') and application of common NLP tools for automated text analysis
Outlook on topic modelling (or word embeddings)
Bias and ethics in NLP
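As an illustration of the data-cleaning topic above, the following is a minimal sketch of a 'quanteda' pre-processing pipeline. The two example texts are invented and the chosen cleaning steps (removing punctuation, symbols, numbers, URLs and English stopwords) are assumptions for illustration, not necessarily the exact steps used in the workshop sessions.

```r
library(quanteda)

# Invented example texts standing in for tweets (not workshop data)
tweets <- c(
  t1 = "Feeling low again today... https://example.com #mentalhealth",
  t2 = "Great day at the workshop! Learning text mining in R :)"
)

# Build a corpus, then tokenise and strip noise
corp <- corpus(tweets)
toks <- tokens(corp,
               remove_punct   = TRUE,
               remove_symbols = TRUE,
               remove_numbers = TRUE,
               remove_url     = TRUE)

# Lowercase and drop English stopwords
toks <- tokens_tolower(toks)
toks <- tokens_remove(toks, stopwords("en"))

# Document-feature matrix: one row per text, one column per remaining token
dfmat <- dfm(toks)
topfeatures(dfmat, n = 5)
```

Which steps are appropriate depends on the research question; removing stopwords or hashtags, for example, is itself a decision that can introduce bias.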

Requirements

Twitter data: Kaggle data dump, Depression Tweets
Download and install R from: https://cran.r-project.org/
Download and install RStudio from: https://www.rstudio.com/
Dependencies

# Packages used in the workshop sessions
pkgs <- c("here", "lubridate", "quanteda", "quanteda.textstats", "tidyverse",
          "academictwitteR", "tibble", "kableExtra", "tidytext",
          "textclean")

install.packages(pkgs)
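
Calling install.packages() on the full vector reinstalls packages that are already present. A small sketch of how to check first which ones are missing is given below; the helper name missing_packages() is hypothetical and not part of the workshop material.

```r
# Packages listed in the Dependencies section of this README
pkgs <- c("here", "lubridate", "quanteda", "quanteda.textstats", "tidyverse",
          "academictwitteR", "tibble", "kableExtra", "tidytext", "textclean")

# Hypothetical helper: return the packages that are not yet installed
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

to_install <- missing_packages(pkgs)

if (length(to_install) > 0) {
  message("Not yet installed: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install (needs internet access)
}
```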

Contents

Time	Content	Materials
09:00 - 10:30	Concepts & challenges when analysing social web data	https://github.com/nika-akin/-Social-Media-and-Text-Mining-Workshop-2022/blob/main/content/sessions/1_1_analyse_social_web_data.pdf
10:30 - 11:00	Coffee break
11:30 - 12:30	Getting started with Twitter data: (i) sampling, (ii) pre-processing/data wrangling & (iii) basics of textual analyses (frequencies/co-occurrences/networks)	https://htmlpreview.github.io/?https://github.com/nika-akin/-Social-Media-and-Text-Mining-Workshop-2022/blob/main/content/sessions/2_1.nb.html
12:30 - 13:30	Lunch
13:30 - 15:00	Twitter demo & crawling social web data	https://github.com/nika-akin/-Social-Media-and-Text-Mining-Workshop-2022/blob/main/content/sessions/3_1_analyse_social_web_data.pdf
15:00 - 15:30	Coffee break
15:30 - 17:00	Outlook: advanced NLP techniques (e.g., topic modelling) & social web data collection; bias and ethics in NLP	https://github.com/nika-akin/-Social-Media-and-Text-Mining-Workshop-2022/blob/main/content/sessions/4_1_Ausblick.pdf
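
The "frequencies/co-occurrences" part of the textual-analysis basics can be sketched as follows with 'quanteda' and 'quanteda.textstats'. The three short texts are invented, and counting co-occurrences at the document level is an assumption made for this example; a sliding window is a common alternative.

```r
library(quanteda)
library(quanteda.textstats)

# Invented mini-corpus (not workshop data)
texts <- c(d1 = "sad tired lonely",
           d2 = "sad tired",
           d3 = "happy park")

toks  <- tokens(texts)
dfmat <- dfm(toks)

# Term frequencies across all documents
freq <- textstat_frequency(dfmat)

# Feature co-occurrence matrix: how often two terms appear in the same document
fc <- fcm(toks, context = "document", tri = FALSE)
cooc <- as.matrix(fc)

freq[, c("feature", "frequency", "docfreq")]
cooc["sad", "tired"]
```

A co-occurrence matrix of this kind is also the usual starting point for the network view mentioned in the same session, with terms as nodes and co-occurrence counts as edge weights.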

Data

Twitter Features

Feature ID	Type	Description
post_id	Numeric	identifier of the tweet
followers	Numeric	number of followers of the profile
friends	Numeric	number of friends of the profile
post_created	Character	date the tweet was posted
post_text	Character	text of the original tweet
user_id	Numeric	identifier of the user
label	Numeric	depression categorisation: 1 = depression tweet, 2 = non-depression tweet
favourites	Numeric	number of external favourites of the tweet
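
To show how these features fit together in R, here is a small sketch that builds a toy data set with the same column names and turns it into a 'quanteda' corpus. All values are invented, and the timestamp format of post_created is an assumption; the real Kaggle file may use a different format and need a different lubridate parser.

```r
library(tibble)
library(lubridate)
library(quanteda)

# Two invented rows following the feature table above (values are made up)
tweets <- tibble(
  post_id      = c(1, 2),
  post_created = c("2015-08-01 10:15:00", "2015-08-02 18:30:00"),
  post_text    = c("cannot sleep again", "lovely walk in the park"),
  user_id      = c(101, 102),
  followers    = c(50, 320),
  friends      = c(80, 150),
  favourites   = c(3, 12),
  label        = c(1, 2)
)

# Parse the character timestamp (assumed format: year-month-day hour:minute:second)
tweets$post_created <- ymd_hms(tweets$post_created)

# Recode the numeric label into a readable factor
tweets$label <- factor(tweets$label, levels = c(1, 2),
                       labels = c("depression", "non-depression"))

# post_text becomes the document text; all other columns become document variables
corp <- corpus(tweets, text_field = "post_text")
table(docvars(corp, "label"))
```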

