lykoerber/cs-vojna-i-mir


cs-vojna-i-mir

A (computational) linguistic analysis of code-switching Russian-French in Lev Tolstoi's Война и мир (War and Peace).

Installation

To create a virtual environment and install all required packages, run bash install.sh. Then download the dataset from Wikisource and place the .txt files in a directory named corpus.
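Before running the scripts, it can help to confirm that the corpus directory is in place. The following is a minimal sketch, not part of the repository: the helper name list_corpus_files is hypothetical, and it assumes the .txt files sit directly inside corpus rather than in subdirectories.

```python
from pathlib import Path


def list_corpus_files(corpus_dir="corpus"):
    """Return the .txt files found directly inside corpus_dir, sorted by name.

    Hypothetical helper: the directory name comes from the README, the
    flat layout (no subdirectories) is an assumption.
    """
    root = Path(corpus_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"corpus directory not found: {root}")
    files = sorted(root.glob("*.txt"))
    if not files:
        raise FileNotFoundError(f"no .txt files found in: {root}")
    return files


if __name__ == "__main__":
    try:
        for path in list_corpus_files():
            print(path.name)
    except FileNotFoundError as error:
        print(error)
```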

Usage

scripts:

  • preprocess.py: preprocesses the data and computes the code-switching (CS) types -> creates cs_*.csv
  • codeswitch.py: annotates intra-sentential CS instances with PoS tags, lemmata, dependency and morphological information -> creates features.csv
  • analysis.ipynb: analyses the outputs
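To illustrate what distinguishing CS types can look like, here is a minimal sketch that labels a sentence by the scripts of its words. It is not taken from preprocess.py. It assumes that Cyrillic words are Russian and Latin-script words are French, which ignores other Latin-script languages that may occur in the novel (for example German). The function names and labels are hypothetical.

```python
import re

# Assumption: script is used as a proxy for language
# (Cyrillic -> Russian, Latin -> French).
CYRILLIC = re.compile(r"[\u0400-\u04FF]")
LATIN = re.compile(r"[A-Za-zÀ-ÖØ-öø-ÿŒœ]")
WORD = re.compile(r"[^\W\d_]+")  # runs of letters only


def word_language(word):
    """Guess the language of a single word from its script."""
    if CYRILLIC.search(word):
        return "ru"
    if LATIN.search(word):
        return "fr"
    return None


def classify_sentence(sentence):
    """Label a sentence as russian, french, intra-sentential or other."""
    langs = {word_language(word) for word in WORD.findall(sentence)}
    langs.discard(None)
    if langs == {"ru", "fr"}:
        return "intra-sentential"
    if langs == {"fr"}:
        return "french"
    if langs == {"ru"}:
        return "russian"
    return "other"


if __name__ == "__main__":
    for text in ("Eh bien, mon prince.", "Он был très aimable."):
        print(classify_sentence(text), "|", text)
```

A script-based heuristic is cheap and needs no language model, but it cannot separate French from other Latin-script material, so a real pipeline would likely need an additional language-identification step.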

outputs:

  • cs_*.csv: overview of the CS instances in each volume
  • features.csv: linguistic features of intra-sentential CS instances
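As a quick sanity check on the outputs, the per-volume files can be counted with the standard library alone. This is a sketch under assumptions: it supposes that each cs_*.csv file starts with a header row and holds one CS instance per following row. The column layout is not documented here, so the code does not rely on any column names. The function name count_rows is hypothetical.

```python
import csv
import glob
import os


def count_rows(pattern="cs_*.csv"):
    """Map each matching file name to its number of data rows.

    Assumption: the first row of every file is a header and each
    following row describes one CS instance.
    """
    counts = {}
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="", encoding="utf-8") as handle:
            reader = csv.DictReader(handle)
            counts[os.path.basename(path)] = sum(1 for _ in reader)
    return counts


if __name__ == "__main__":
    for name, total in count_rows().items():
        print(f"{name}: {total} rows")
```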
