Restoring Sentiments: Understanding Citizens’ Response to Social Activities on Twitter in U.S. Metropolises During the COVID-19 Pandemic Using Fine-tuned Large Language Model
This repository contains the source codes for the research article entitled Restoring Sentiments: Understanding Citizens’ Response to Social Activities on Twitter in U.S. Metropolises During the COVID-19 Pandemic Using Fine-tuned Large Lan-guage Model by Ryuichi Saito and Sho Tsugawa. This work is currently a preprint and under peer-review.
This repository include the codes as follows:
- Data Collection
- full_archive_search_nyc.ipynb
- full_archive_search_la.ipynb
- full_archive_search_chicago.ipynb
- unique_users.ipynb
- Create Training Data
- join_tweets_separated_by_restriction_type.ipynb
- create_csv_for_amazonmturk.ipynb
- prepare_training_and_test_data.ipynb
- convert_tsv_to_jsonl_for_gpt3_5_finetuning.ipynb
- Create Models
- roberta_large_fine_tuning.ipynb
- Evaluatate Models
- roberta_large_fine_tuning_accuracy.ipynb
- gpt_3_5_turbo_accuracy.ipynb
- Sentiment Classification
- sentiment_classifier_gpt_3_5_turbo.ipynb
- TF-IDF for Sentiment Classification Results
- tf_idf.ipynb
Note
GPT-3.5 Turbo with fine-tuning is trained on the Open AI console, so the code is not included in this repository.