Greyhouse-Consulting/LAB4

Preconditions

  • Python 3.7 installed
  • PySpark installed and correctly configured
  • Familiarity with running Jupyter notebooks
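
The first two preconditions can be checked from Python before opening the notebook. The following is a minimal sketch, not part of the repository: the helper name `check_preconditions` is hypothetical, it treats "Python 3.7" as "3.7 or newer", and it assumes Jupyter is importable as either the `notebook` or the `jupyterlab` package. It only tests whether the packages can be found; it does not verify that PySpark is correctly configured.

```python
import importlib.util
import sys


def check_preconditions(min_version=(3, 7)):
    """Return a dict mapping each precondition to True or False.

    Hypothetical helper: checks importability only, not configuration.
    """
    def has_module(name):
        # find_spec returns None when a top-level module cannot be found
        return importlib.util.find_spec(name) is not None

    return {
        "python": sys.version_info[:2] >= min_version,
        "pyspark": has_module("pyspark"),
        "jupyter": has_module("notebook") or has_module("jupyterlab"),
    }


if __name__ == "__main__":
    for name, ok in check_preconditions().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

A "missing" entry for `pyspark` or `jupyter` usually means the package is not installed in the interpreter that runs the check, which may differ from the one Jupyter uses.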

Instructions

The code for this exercise is written as a Jupyter notebook. To run it, follow these steps.

  1. Download the dataset from
    http://stat-computing.org/dataexpo/2009/the-data.html
    or
    https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/HG7NV7
  2. Extract the files to a directory
  3. Rename the files so that each name begins with the year, e.g. 2009_some_name.csv.
  4. Open lab4.ipynb and set the variable fileLocation to the directory created in step 2.
  5. Run lab4.ipynb from Jupyter Notebook.
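
Step 3 can be tedious by hand when there are many files. The sketch below shows one way it might be automated, under stated assumptions: the function names `year_first_name` and `rename_year_first` are hypothetical, the files are assumed to be `.csv` files whose names already contain a four-digit year somewhere, and the year is simply prepended with an underscore. The notebook itself may expect a different naming pattern, so review the dry-run output before renaming anything.

```python
import re
from pathlib import Path

# A four-digit year (19xx or 20xx) not embedded in a longer run of digits.
YEAR = re.compile(r"(?<!\d)(?:19|20)\d{2}(?!\d)")


def year_first_name(filename):
    """Return a name that begins with the year found in filename.

    Names that already start with a year, or contain no year,
    are returned unchanged.
    """
    match = YEAR.search(filename)
    if match is None or match.start() == 0:
        return filename
    return f"{match.group()}_{filename}"


def rename_year_first(directory, dry_run=True):
    """Rename CSV files in directory so each name begins with its year.

    Returns a list of (old_name, new_name) pairs. With dry_run=True
    (the default) nothing on disk is changed.
    """
    changes = []
    for path in sorted(Path(directory).glob("*.csv")):
        new_name = year_first_name(path.name)
        if new_name != path.name:
            changes.append((path.name, new_name))
            if not dry_run:
                path.rename(path.with_name(new_name))
    return changes
```

Calling `rename_year_first(fileLocation)` first lists the planned renames; passing `dry_run=False` applies them.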

About

Machine Learning With Big Data - Exercise 1
