By: Dastan Abdulla

Goal: Predicting the country of origin and language of non-native English speakers

Data source: Speech Accent Archive
- data_samples/: subfolder containing sample datasets used in the project.
- docs/: folder contains the documentation and write-ups for the project.
  - progress_report.md: contains progress logs throughout this semester.
  - project_plan.md: was my initial project plan.
- notebooks/: folder contains Jupyter notebooks used for data processing and analysis.
  - audio_processing.ipynb: focuses on processing audio data.
  - biographical_analysis.ipynb: performs preliminary analysis based on biographical information.
  - full_data_cleanup.ipynb: handles the complete data cleanup process.
  - geographic_analysis.ipynb: conducts analysis based on geographic data.
  - language_analysis.ipynb: performs analysis related to language features.
  - phonetic_processing.ipynb: focuses on processing phonetic data.
- old_notebooks/: contains older or deprecated versions of the notebooks.
  - initial_data_processing.ipynb: this notebook analyzed the Kaggle data that I no longer use.
- plots/: folder stores plot graphs and visualizations generated during the analysis.
- .gitignore: specifies files and directories to be ignored by Git version control.
- LICENSE.md: contains licensing information for the project.
- README.md: you are here, yay!
This is Dastan Abdulla's repository for the DSLing term project, which aims to examine and predict the language and country of origin (both sociolinguistic ethnic markers) of non-native English speakers based on phonetic and audio features.
My guestbook can be found here.