
piyushpatel2005/python-data-science

Python for Data Science

This is a tutorial about Python for Data Science.

What will you find here?

This repo covers Python for data science. It collects tutorials on the Python toolkit, including NumPy, SciPy, Pandas, Matplotlib, IPython, and Jupyter Notebook.

Requirements

  • Python v2.7, or v3.6 or higher

  • pip (or easy_install) to install the tools

You can install packages using:

pip install <pkg>

or

easy_install <pkg>

You can update packages using:

pip install -U numpy==1.9.1 # update to version 1.9.1

or

pip install -U numpy # update to the latest version

or

easy_install --upgrade numpy==1.9.1 # update to version 1.9.1
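After installing or updating, it can help to confirm which versions actually ended up importable. The following is a minimal sketch, not part of the original tutorials: the helper name `installed_version` is hypothetical, and it assumes the package exposes a `__version__` attribute (NumPy, SciPy, Pandas, and Matplotlib do, but not every package does).

```python
import importlib


def installed_version(package_name):
    """Return the package's __version__ string.

    Returns None if the package cannot be imported, and "unknown" if it
    imports but does not define __version__.
    """
    try:
        module = importlib.import_module(package_name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")


# Report on the toolkits this repo uses; missing ones show up as None.
for name in ("numpy", "scipy", "pandas", "matplotlib"):
    print("{}: {}".format(name, installed_version(name)))
```

A result of `None` suggests the package is not installed in the interpreter you are running, which is a common symptom of installing into a different environment.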

  • You can also use a Python distribution that ships with many of these packages built in, such as Anaconda, Enthought Canopy, PythonXY, or WinPython.

If you need a Python refresher, check out my other Python repository

Python for Data science

List of Anaconda commands

conda install <package_name> # install a package
conda remove <package_name> # remove a package
conda install <pkg1> <pkg2> # install multiple packages
conda search "*beautiful*" # search for packages whose names contain a word
conda create -n <env_name> [list of packages] # create a new virtual environment with the listed packages
conda create -n <env_name> python=2 [list of packages] # create a virtual environment with Python 2
conda env export > environment.yaml # export the environment to a file (similar to requirements.txt)
conda env create -f environment.yaml # create a virtual environment from an environment file
conda env list # list all virtual environments
conda env remove -n <env_name> # remove a virtual environment
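When working with several environments, it is easy to lose track of which one a notebook or script is running in. The sketch below is an illustration under stated assumptions rather than part of the original tutorials: the function name `describe_environment` is hypothetical, and it relies on the `CONDA_DEFAULT_ENV` environment variable, which conda sets when an environment is activated and which is absent otherwise.

```python
import os
import sys


def describe_environment():
    """Collect a few facts about the interpreter that is currently running."""
    return {
        # Path to the Python binary executing this code.
        "executable": sys.executable,
        # Root directory of the active installation or environment.
        "prefix": sys.prefix,
        "python_version": "{}.{}.{}".format(*sys.version_info[:3]),
        # Set by `conda activate`; None when no conda environment is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


for key, value in sorted(describe_environment().items()):
    print("{}: {}".format(key, value))
```

Running this inside a Jupyter cell shows the environment of the kernel, which may differ from the environment of the shell that launched Jupyter.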

Table of Contents:

Theory

Problem Solving Approach to Data Science problems

Data Requirements and Collection

Data cleaning

Data Modeling and Evaluation

Unix basics

Python Basics

Python Basics

String operations

Tuples

Lists

Dictionaries

Sets

Conditionals

Loops

Functions

Objects and Classes

Reading Files

Writing Files

Loading and Viewing Data using Pandas

SQL

Connecting to IBM DB2 in Jupyter

Querying Database

SQL Magic functions for SQL

Analyzing Data

Assignment

Data Science Libraries

NumPy 1D arrays

NumPy 2D arrays

Data Analysis intro using Pandas

Intro to Matplotlib

Area Plots, Histograms and Bar charts

Pie Charts, Scatter plots and Bubble Plots

Waffle Charts, Word clouds and Regression Plots

Generating Maps

  1. NumPy Library

    Practical NumPy

    NumPy Practicals 2

    NumPy Arrays

    NumPy notebook

    Satellite Image Analysis

  2. Pandas Library

    Practical Pandas

    Pandas Practicals 2

    Pandas Intro

  3. Splunk

Harvard Data Science Notebooks

Basics of Python

Pandas and Matplotlib plotting basics

Web scraping: this lab uses Pandas to read CSV files from the MovieLens datasets.

Scraping web using BeautifulSoup

Accessing SQL with Pandas

Probability

Statistics

Frequency distributions
