SECTION 1: terms used in file names
- N: NLTK
- B: BERT
- Neu: neutral sentiment posts
- dw: data wrangling
- sth + ed: the file after that action, e.g., Ned refers to files after NLTK scoring; Neucleaned refers to the dataset after neutral tweets have been cleaned out
- skl: sklearn
- sns: seaborn
- relvscore: relevant scoring
- comp: company related data
- gov: government related data
- iden: tweets for which NLTK and BERT agree in sentiment
- ALL: data generated for all 24 months in ONE table
- valley: data from valley sentiment month
- peak: data from peak sentiment month
- FOUR: data including 1) all tweets with NLTK score; 2) all tweets with BERT score; 3) all tweets for which NLTK and BERT agree in sentiment; 4) all tweets with two scores (one for NLTK, one for BERT)
- ** temp_ AND ALL_: all final versions of the data visualizations can be found in files with these two name prefixes
SECTION 2: names of all coding files
NOTIFICATION: for data wrangling, each step produces 24 files, one per month from Jan 2018 to Dec 2019 (1801 - 1912)
- T_mined.py // this file mines data from Twitter. Output: CSV file with columns = ['author_id', 'created_at', 'id', 'lang', 'text']. Output file name: out_year+month.csv, e.g., "out_1801.csv". See the sketch below.
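A minimal sketch of what this mining step might look like, assuming Tweepy and a Twitter API v2 bearer token; the query string and credential below are placeholders, not the project's actual ones:

```python
import csv
import tweepy

# Hypothetical credential; the real script's query and auth are not shown here.
client = tweepy.Client(bearer_token="YOUR_BEARER_TOKEN")

# Fetch recent tweets matching a placeholder query, requesting the extra
# fields needed for the output columns.
response = client.search_recent_tweets(
    query="example topic lang:en",
    tweet_fields=["author_id", "created_at", "lang"],
    max_results=100,
)

# Write the five output columns used throughout the pipeline.
with open("out_1801.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["author_id", "created_at", "id", "lang", "text"])
    for t in response.data or []:
        writer.writerow([t.author_id, t.created_at, t.id, t.lang, t.text])
```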
- D_clean.py // this file cleans unrelated information out of tweets (e.g., hashtags, URL links) with regular expressions. Output: structured tabular data with columns = ['author_id', 'created_at', 'id', 'lang', 'text']. Output file name: dwed_year+month.csv, e.g., "dwed_1801.csv". See the sketch below.
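A minimal sketch of the cleaning step, assuming the regular expressions target URLs, hashtags, and @-mentions; the exact patterns in D_clean.py may differ:

```python
import re
import pandas as pd

def clean_text(text: str) -> str:
    text = re.sub(r"https?://\S+", "", text)  # remove URL links
    text = re.sub(r"#\w+", "", text)          # remove hashtags
    text = re.sub(r"@\w+", "", text)          # remove @-mentions
    return re.sub(r"\s+", " ", text).strip()  # collapse leftover whitespace

df = pd.read_csv("out_1801.csv")
df["text"] = df["text"].astype(str).map(clean_text)
df.to_csv("dwed_1801.csv", index=False)
```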
- NLTK_score.py // this file computes the NLTK score of each tweet. Output: tabular data with the original columns + NLTK score. Output file name: Ned_year+month.csv, e.g., "Ned_1801.csv". See the sketch below.
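A minimal sketch of the scoring step, assuming NLTK's VADER analyzer and that the 'score' column (used downstream in outcome_ana.py) holds the compound score:

```python
import nltk
import pandas as pd
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon")  # one-time download of the VADER lexicon
sia = SentimentIntensityAnalyzer()

df = pd.read_csv("dwed_1801.csv")
# polarity_scores returns neg/neu/pos/compound; keep the compound summary.
df["score"] = df["text"].astype(str).map(lambda t: sia.polarity_scores(t)["compound"])
df.to_csv("Ned_1801.csv", index=False)
```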
- NLTK_noneu.py // this file removes rows of tweets that have a neutral sentiment outcome. Output: tabular data with non-neutral sentiment tweets only. Output file name: Neucleaned_year+month.csv, e.g., 'Neucleaned_1801.csv'. See the sketch below.
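A minimal sketch of the neutral-removal step, assuming the common VADER convention that a compound score within ±0.05 counts as neutral; the actual threshold in NLTK_noneu.py may differ:

```python
import pandas as pd

df = pd.read_csv("Ned_1801.csv")
df = df[df["score"].abs() >= 0.05]  # drop tweets scored as neutral (assumed cutoff)
df.to_csv("Neucleaned_1801.csv", index=False)
```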
- (file not in this package) // this is the BERT model run in Google Colab. Output: tabular data with the original columns + BERT score. Output file name: bertout_year+month.csv. See the sketch below.
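Since the Colab notebook is not in this package, the following is only a hedged sketch of what the BERT step might look like, assuming the Hugging Face transformers sentiment pipeline and a signed-score encoding:

```python
import pandas as pd
from transformers import pipeline

# Default English sentiment model (DistilBERT fine-tuned on SST-2).
classifier = pipeline("sentiment-analysis")

df = pd.read_csv("Neucleaned_1801.csv")
results = classifier(df["text"].astype(str).tolist(), truncation=True)

# Hypothetical encoding: positive labels keep their confidence as the score,
# negative labels flip the sign.
df["BERTscore"] = [
    r["score"] if r["label"] == "POSITIVE" else -r["score"] for r in results
]
df.to_csv("bertout_1801.csv", index=False)
```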
- outcome_ana.py // this file combines all 24 monthly files into one table. Output: one overall table with all outcomes, columns = ['author_id', 'created_at', 'text', 'score', 'BERTscore']. Output file names: multiple files, including a) files with tweets where NLTK and BERT have the same sentiment and b) files with tweets where NLTK and BERT have different sentiments. See the sketch below.
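A minimal sketch of the combining step, assuming pandas, the monthly bertout_*.csv files as input, and sign agreement between 'score' and 'BERTscore' as the agreement test; the output file names below are illustrative, not the script's actual ones:

```python
import pandas as pd

# 1801 .. 1912: all 24 months of 2018-2019.
months = [f"{y}{m:02d}" for y in (18, 19) for m in range(1, 13)]
frames = [pd.read_csv(f"bertout_{ym}.csv") for ym in months]
ALL = pd.concat(frames, ignore_index=True)[
    ["author_id", "created_at", "text", "score", "BERTscore"]
]

# Agreement = both scores have the same sentiment sign.
agree = (ALL["score"] > 0) == (ALL["BERTscore"] > 0)
ALL[agree].to_csv("iden_ALL.csv", index=False)   # illustrative name
ALL[~agree].to_csv("diff_ALL.csv", index=False)  # illustrative name
```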
- outcome_ana_combination.py; outcome_ana_temp.py; outcome_ana_v2.py; outcome_ana_v3.py // each contains a different way of sorting the data. Output file names: ALL_for_compare.csv, ALL_for_combine.csv, etc. See the sketch below.
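A minimal sketch of one such sorting variant, assuming pandas and the illustrative file names from the previous sketch; the actual sort keys vary across the four scripts:

```python
import pandas as pd

ALL = pd.read_csv("iden_ALL.csv")  # illustrative input name
ALL = ALL.sort_values(["created_at", "score"], ascending=[True, False])
ALL.to_csv("ALL_for_compare.csv", index=False)
```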