lucianli123/booklab-project

Exploring shared text in elocution manuals

Data Files

  • hathi_ia_texts.csv : combined metadata and text for all books
  • all_counts.csv : n-gram occurrences per book
  • top1000_w_cluster.csv : n-grams annotated with potential cluster/speech origin
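
The column layout of these files is not documented here, so a reasonable first step is to look at the header and a few rows before loading a whole file. The following is a minimal sketch, assuming the files are ordinary comma-separated files with a header row and UTF-8 encoding. `peek_csv` is a hypothetical helper, not part of this repository.

```python
import csv

def peek_csv(path, n_rows=3):
    """Return (header, first n_rows data rows) without reading the whole file.

    Assumption: the file is comma-separated, UTF-8, with a header row.
    """
    # Full book texts in a single cell can exceed the csv module's default
    # field size limit, so raise it to a large fixed value.
    csv.field_size_limit(2**31 - 1)
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, [])
        # zip stops after n_rows, so the rest of the file is never read.
        rows = [row for _, row in zip(range(n_rows), reader)]
    return header, rows

# Example (file name taken from the list above):
# header, rows = peek_csv("all_counts.csv")
# print(header)
```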

Code

  • proc_into.ipynb : cleaning and downloading files
  • NgramExperiments.ipynb : n-gram generation
  • visualization.ipynb : clustering and visualization
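
The notebooks themselves are not reproduced here. As a rough illustration of the general idea behind finding shared text, the sketch below counts word n-grams per book and keeps those that appear in more than one book. It is not the notebooks' actual method: the tokenizer, the n-gram length, and the names `tokenize`, `ngram_counts`, and `shared_ngrams` are all assumptions made for this example.

```python
import re
from collections import Counter

def tokenize(text):
    # Assumption: lowercase word tokens, punctuation dropped.
    return re.findall(r"[a-z']+", text.lower())

def ngram_counts(text, n=5):
    """Count every run of n consecutive word tokens in one text."""
    tokens = tokenize(text)
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def shared_ngrams(books, n=5, min_books=2):
    """Map each n-gram to the sorted ids of the books containing it.

    `books` is a dict of book id -> full text. Only n-grams found in at
    least `min_books` different books are returned.
    """
    seen = {}
    for book_id, text in books.items():
        for gram in ngram_counts(text, n):
            seen.setdefault(gram, set()).add(book_id)
    return {g: sorted(ids) for g, ids in seen.items() if len(ids) >= min_books}
```

For example, calling `shared_ngrams({"a": text_a, "b": text_b}, n=5)` returns the five-word sequences that occur in both texts, which could then be grouped by the speech they likely come from.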
