Evaluating Historical Language Models for literary research

This repository contains the research done as part of my thesis.

Structure

- The data folder contains all the data used:
  - files for the token-level experiments: eng-danc.json, eng-fina.json, eng-houn.json, eng-nava.json, eng-redh.json, eng-scan.json, eng-spec.json
  - file for the sentence-level experiments: speckled-band-chunk-sentiment.json
  - the modern benchmark: test_benchmark.csv, train_benchmark.csv
- The token-level experiment is contained in the folder of the same name. Inside there are:
  - notebooks that explore the data: BERT-explore, BERT-explore2, BERT-token_positive, neutral_interval, token_level_exploration
  - notebooks with BERT doing sentiment analysis: BERT-token_system and BERT-token_system-seeds
  - notebooks with TuringBERT doing sentiment analysis: TuringBERT-token_system and TuringBERT-token_system-seeds
  - files with results (the csv files)
- The sentence-level experiment is contained in the folder of the same name. Inside there are:
  - notebooks with BERT doing sentiment analysis on the modern benchmark: BERT-benchmark and BERT-benchmark-seeds
  - notebooks with BERT doing sentiment analysis on the historical benchmark: 10cross_BERT_sentence_level and 10cross_BERT_sentence_level-seeds
  - notebooks with TuringBERT doing sentiment analysis on the modern benchmark: TuringBERT-benchmark and TuringBERT-benchmark-seeds
  - notebooks with TuringBERT doing sentiment analysis on the historical benchmark: 10cross_TuringBERT_sentence_level and 10cross_TuringBERT_sentence_level-seeds
  - files with results (the tsv files)
- The thesis report: Evaluating_Historical_Language_Models_for_literary_research.pdf
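
Before opening the notebooks, it can help to confirm that the data files named above are in place. The following is a minimal sketch, not part of the repository: the helper name `missing_data_files` is hypothetical, and it assumes the files sit directly inside a folder called `data` relative to where the script is run.

```python
from pathlib import Path

# File names copied from the description of the data folder.
EXPECTED_DATA_FILES = [
    "eng-danc.json",
    "eng-fina.json",
    "eng-houn.json",
    "eng-nava.json",
    "eng-redh.json",
    "eng-scan.json",
    "eng-spec.json",
    "speckled-band-chunk-sentiment.json",
    "test_benchmark.csv",
    "train_benchmark.csv",
]


def missing_data_files(data_dir="data"):
    """Return the expected file names that are not present in data_dir.

    Assumption: all files live directly in data_dir, not in subfolders.
    """
    root = Path(data_dir)
    return [name for name in EXPECTED_DATA_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_data_files()
    if missing:
        print("Missing data files:", ", ".join(missing))
    else:
        print("All expected data files are present.")
```

If the data is stored elsewhere, pass the folder path explicitly, for example `missing_data_files("path/to/data")`.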

Running the code might require installing some common packages; they are all listed at the beginning of each notebook.

You must have the BERT model and a TuringBERT model downloaded.
