Welcome to DATA/MSML602 - Introduction to Data Science! This is our course repository for all programming-related assignments and materials.
Over the course of the semester, you will work with a variety of software packages, including Python, pandas, Jupyter Notebook, and others. Installing those packages and getting started can be a hassle because of software dependencies, so we instead provide the assignments as Google Colab notebooks.
For those unfamiliar with GitHub, below is some basic information about the platform.
Git is one of the most widely used version control systems today, and it is invaluable when working in a team. GitHub is a web-based hosting service built around Git: it hosts Git repositories, handles user management, and so on. There are other similar services, e.g., Bitbucket.
We will use GitHub to distribute the assignments and other class materials. Our use of Git and GitHub for the class will be minimal; however, we encourage you to use them for collaboration on your class project or for other classes.
Just Cloning the Class Repository
You don't need a GitHub account just to clone the class repository. From the command line, run:
git clone https://github.com/mss423/data602-fall2023.git
Afterwards, run git pull from within the data602-fall2023 directory to fetch newly added material.
NOTE: If you are having trouble installing git, you can download the files as a ZIP file instead, although keeping them up to date may become tedious.
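If you would rather not remember which of the two commands applies, the clone and pull steps above can be combined into one small helper. This is only a sketch: the function name sync_repo is hypothetical and not part of the course materials, and it assumes git is installed and on your PATH.

```shell
#!/bin/sh
# Hypothetical helper (not part of the course materials):
# clone the repository on first use, update it with `git pull` afterwards.
sync_repo() {
    url="$1"   # repository URL to clone from
    dir="$2"   # local directory holding the working copy

    if [ -d "$dir/.git" ]; then
        # A clone already exists here: fetch and merge newly added material.
        git -C "$dir" pull
    else
        # No clone yet: create a local copy of the repository.
        git clone "$url" "$dir"
    fi
}

# Example usage with the class repository URL given above:
# sync_repo https://github.com/mss423/data602-fall2023.git data602-fall2023
```

Running the same call again later pulls any new material instead of cloning a second time.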
The programming assignments will be distributed as notebooks to be run on Google Colab. Colab gives you access to the computational resources needed to complete the assignments without having to install any dependencies or packages on your local machine.