From f6a35843c5d7a9741fffc28444bf94ef3ef45960 Mon Sep 17 00:00:00 2001
From: Eleonore9
Date: Sun, 19 Jul 2015 21:27:19 +0100
Subject: [PATCH 1/4] More (data) science + virtualenv

---
 .gitignore          |  3 +++
 python-ecosystem.md | 18 +++++++++++++-----
 2 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/.gitignore b/.gitignore
index bee8a64..97d3235 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1 +1,4 @@
 __pycache__
+
+*~
+#*
\ No newline at end of file
diff --git a/python-ecosystem.md b/python-ecosystem.md
index e940e12..dbfbb18 100644
--- a/python-ecosystem.md
+++ b/python-ecosystem.md
@@ -111,6 +111,8 @@ Web Dev
 
 **pip**: the standard Python command-line package installer. "pip install package-name" is the basic command. There used to be one called `easy_install`, essentially the same, now is less used. You can also manually download the source code for a package and install it by running the setup.py file, `python setup.py install`.
 
+**vitural environments** or [virtualenvs](http://docs.python-guide.org/en/latest/dev/virtualenvs/) constitute a very useful python tool to keeps dependencies in separate places and avoid conflicts. Packages can be installed globally or inside a virtualenv.
+
 
 ## Web dev:
 
@@ -148,17 +150,23 @@ Web Dev
 
 
 
-## Science stuff:
+## (Data) Science stuff:
+
+**numpy & scipy** are two of the most popular scientific python libraries. Numpy has lots of tools for doing maths on big arrays of numbers, and scipy has lots of stats and analysis functions. scipy depends on numpy. If you're on windows, they may not "pip install" cleanly, so go to the scipy website and download installers, or look into Conda.
+
+**IPyton and the IPython Notebook** (now [Jupyter project](http://jupyter.org/)) is an HTML notebook environment based on iPython shell. As we saw earlier, Ipython is an enhanced Python interpreter. The Ipython Notebook allows to combine code execution, text and plots and has become the go-to coding environment for the scientific Python world.
 
-**numpy & scipy** are two of the most popular scientific python libraries. Numpy has lots of tools for doing maths on big arrays of numbers, and scipy has lots of stats and ananalysis functions. scipy depends on numpy. If you're on windows, they may not "pip install" cleanly, so go to the scipy website and download installers, or look into Conda.
+**pandas** is popular data analysis library that reproduces a lot of functionalities found in R and contains plotting functionalities. It is used a lot for dealing with tables of data (CSV, Excel) but also handles other formats like text files, HTML, json, SQL...
 
-**IPyton and the IPython Notebook** - as we saw earlier, Ipython is an enhanced Python interpreter. The Ipython Notebook is the real darling of the scientific Python world though, it gives you a sort of interactive notebook interface, a bit like matlab I'm told.
+**scikit-learn** is machine learning library built on top of SciPy/Numpy. It integrates a wide range of classification, clustering and regression algorithms.
 
-**pandas** is popular for dealing with tables of data. Lets you do excel-style pivottables, for example
+**nltk** is a natural language processing library. It provides tools for the classification, tokenization, stemming, tagging, parsing... of human language data.
 
-**matplotlib** is a well-revered tool for drawing graphs
+**matplotlib** is a well-revered plotting library for Python.
 
+**seaborn** is a data visualisation library based on matplotlib (but much nicer visually).
 
+**bokeh** is an interactive data visualisation tool in the browser (like D3.js).
 
 
 
From 598263c36700b8c615d0e68b249592a105b1de11 Mon Sep 17 00:00:00 2001
From: Eleonore9
Date: Sun, 19 Jul 2015 21:33:01 +0100
Subject: [PATCH 2/4] Links to tutorials for the scientific libraries

---
 challenges/Readme.md | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/challenges/Readme.md b/challenges/Readme.md
index bf6cb13..b2214dd 100644
--- a/challenges/Readme.md
+++ b/challenges/Readme.md
@@ -25,3 +25,16 @@ Check out the great tutorials at [newcoder.io](http://newcoder.io/) and learn ho
 * Tk for graphics and GUI stuff
 
 and lots more...
+
+
+_____
+
+###Extras resources for Python Scientific libraries:
+* scientific Python lectures: https://github.com/jrjohansson/scientific-python-lectures
+* data science Python: https://github.com/donnemartin/data-science-ipython-notebooks
+
+* official Pandas tutorial: http://pandas.pydata.org/pandas-docs/stable/tutorials.html
+* Pandas lessons & tutorials: https://bitbucket.org/hrojas/learn-pandas
+* Scikit-learn tutorial by Gael Varoquaux at EP 2014: https://github.com/GaelVaroquaux/sklearn_europython_2014
+* official Seabrn tutorial: https://web.stanford.edu/~mwaskom/software/seaborn/tutorial.html#tutorial
+* official Bokeh tutorial: http://bokeh.pydata.org/en/latest/docs/tutorials.html

From b9954c5cb54a823fb6b3f919666bc5b1f7b69421 Mon Sep 17 00:00:00 2001
From: Eleonore9
Date: Sun, 19 Jul 2015 21:36:06 +0100
Subject: [PATCH 3/4] scientific libraries resources

---
 challenges/Readme.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/challenges/Readme.md b/challenges/Readme.md
index b2214dd..ca41454 100644
--- a/challenges/Readme.md
+++ b/challenges/Readme.md
@@ -30,9 +30,11 @@ and lots more...
 _____
 
 ###Extras resources for Python Scientific libraries:
+1- (Data) Science resources
 * scientific Python lectures: https://github.com/jrjohansson/scientific-python-lectures
 * data science Python: https://github.com/donnemartin/data-science-ipython-notebooks
 
+2- Library specific
 * official Pandas tutorial: http://pandas.pydata.org/pandas-docs/stable/tutorials.html
 * Pandas lessons & tutorials: https://bitbucket.org/hrojas/learn-pandas
 * Scikit-learn tutorial by Gael Varoquaux at EP 2014: https://github.com/GaelVaroquaux/sklearn_europython_2014

From 4e35a6dc0e766124a4220f378a327c1b39cf32b4 Mon Sep 17 00:00:00 2001
From: Eleonore9
Date: Sun, 19 Jul 2015 21:51:13 +0100
Subject: [PATCH 4/4] Added to the resources

---
 challenges/Readme.md | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/challenges/Readme.md b/challenges/Readme.md
index ca41454..4631242 100644
--- a/challenges/Readme.md
+++ b/challenges/Readme.md
@@ -31,12 +31,13 @@ _____
 
 ###Extras resources for Python Scientific libraries:
 1- (Data) Science resources
-* scientific Python lectures: https://github.com/jrjohansson/scientific-python-lectures
-* data science Python: https://github.com/donnemartin/data-science-ipython-notebooks
+* scientific Python lectures: http://github.com/jrjohansson/scientific-python-lectures
+* data science Python: http://github.com/donnemartin/data-science-ipython-notebooks
 
 2- Library specific
 * official Pandas tutorial: http://pandas.pydata.org/pandas-docs/stable/tutorials.html
-* Pandas lessons & tutorials: https://bitbucket.org/hrojas/learn-pandas
+* Pandas lessons & tutorials: http://bitbucket.org/hrojas/learn-pandas
 * Scikit-learn tutorial by Gael Varoquaux at EP 2014: https://github.com/GaelVaroquaux/sklearn_europython_2014
-* official Seabrn tutorial: https://web.stanford.edu/~mwaskom/software/seaborn/tutorial.html#tutorial
+* simplified text processing with textBlob: https://textblob.readthedocs.org/en/dev/
+* official Seaborn tutorial: http://web.stanford.edu/~mwaskom/software/seaborn/tutorial.html#tutorial
 * official Bokeh tutorial: http://bokeh.pydata.org/en/latest/docs/tutorials.html