This repository has been archived by the owner on Aug 5, 2020. It is now read-only.

📚 List of readings that are useful for getting started with any data set.

                                                .ⁿ─
                  `:-.                        Γ     ¼
                 -yyyyys+:`                  ╛       ╕
                /y╔▓▓yyyyy`  +o+/:-.        Ñ         ╕
              `o▄▓▓▓▓▓▓▄y-  :hhhhhhhs      ╒           -
             `s╙▀▀░▓▓░▀▀y+   yhhhhhhhy    ╒                ,.
         `-/oyyyyyy▓▓yyo   :hhhhhhhhs                    ⌂╞    ½
      -/oyyyyyyyyyy▓▓yo`   shhhhhhhhs    ┘              /  k     ½
    +yyyyyyyyyyyyyy▓▓+    `hhhhhhhhho   ╛              ⌂     ½     ï
    +yyyyyyyyyyyyyy▓▓     .hhhhhhhhho  ┘              ;        ½     -
    `yyyyyyyyys+/. ▓▓      yhhhhhhhhhs`              ;           Y     ╚
     -yso+/-.`     ▓▓      .yhhhhhhhhhy.            ⌂              ╘      \,
             ```...▓▓`      `shhhhhhhhhy.         .                   -      ⁿ.
       -://++++++++▓▓++/-`    /yhhhhhhhhy.      .                        ^-     ~,
       /+++++++++++▓▓+++++/.   `ohhhhhhy+`   .⌐                              ⁿ,   ▓▄
      ░▓▓░▓▓░▓▓░▓▓░▓▓░▓▓░▓▓░▓▓░▓▓░▓▓░▓▓░▓▓░▓▓░▓▓░▓▓░▓▓░▓▓░▓▓░▓▓░▓▓░▓▓░▓▓░▓▓░▓▓░▓▓░▓▓▓▓
       `+++++++++++▓▓++++++++++-`  -`                                             ▓▀
        `......-...▓▓+++++++++++/`
                   ▓▓`-/+++++++:`
                   ▓▓    `.:/+:`

Computer, compute to the last digit the value of pi.

Mobify Data Guide

Welcome to Mobify's data guide! We have put together a list of readings that are useful for getting started with any data set.

🤔 Why this guide?

This is an open-source guide intended to gather feedback from people who have worked with data teams. At Mobify, we work closely with talented people from a wide variety of backgrounds.

🔥 🤔 😎 🍕 🚀 💭 🍾 😈 ⚖ 💕

We hope that opening up some of our onboarding materials gives you a taste of our style of work and helps candidates prepare for interviews or data hackathons.


🔖 Legend

We mark each type of reading with an emoji: 📜 🏛 📚

  • 📜 Articles - expect around 10-15 minutes of reading time
  • 🏛 Tutorials - expect at least a half-day exercise
  • 📚 Advanced reference (optional readings) - reading time varies

What happens if I am preparing for an interview/hackathon tomorrow?

We recommend you at least go through the articles and take the:


🍕 Content of this guide

This is a list of selected resources that we consider the minimal set needed to start working on data challenges.

See CONTRIBUTING.md for contribution guidelines.


💭 Getting started

So you would like to work on data, eh? There are many great resources to get you started on the path to working with data. We recommend a few of these articles:


🚀 Data Science 101

If you do not come from a statistics or machine learning background, this is a good starting point.


🚅 Engineering tools 101

Learning to code is an important step in becoming data literate. There are four main engineering tools we use.

Python + Pandas

Mobify is a Python shop, so we focus our analysis on Python + Pandas. Below are some of our favourite tutorials to get started:

SQL

SQL is used everywhere.
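
To give a feel for what a typical analytical query looks like, here is a minimal sketch that runs SQL from Python using the standard-library `sqlite3` module. The `orders` table, its columns, and the sample rows are hypothetical and exist only for illustration; they are not taken from any Mobify data set.

```python
import sqlite3

# Hypothetical data: table name, columns, and rows are made up for this sketch.
conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("alice", 20.0), ("bob", 5.0), ("alice", 12.5)],
)

# Total spend per customer, largest first.
rows = conn.execute(
    "SELECT customer, SUM(amount) AS total "
    "FROM orders GROUP BY customer ORDER BY total DESC"
).fetchall()
conn.close()

for customer, total in rows:
    print(customer, total)
```

The same `GROUP BY` and `ORDER BY` pattern carries over to most SQL databases, although connection details and some syntax differ between engines.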

Command line

Being comfortable with the command line will help a great deal in your work. We recommend taking the 🏛 Codecademy command line course for this.

Git

Git solves two big communication challenges of working as a team:

The 🏛 Codecademy Git course is our recommended way to learn Git.


🚀 Setting up your data dojo

So are you ready to get started? One thing we have found to correlate with how well interview candidates do is how comfortable they are with the environment they will use during the interview. Here are a few tips.

Also, see the Disclaimer: Mobify is a Python shop, so our data dojo is likely to be Python-focused! Our tool of choice is Jupyter Notebook.

Hosted version

📚 Data Science workbench is a great way to get started. It gives you a hosted version of the notebook, and we found its onboarding useful.

Local setup (Advanced)

If you want to set up a self-hosted version of Jupyter, you might want to check out 🏛 this tutorial.

Getting familiar with Jupyter notebook

📜 Shortcut keys for Jupyter will make you a Jupyter pro.


😎 Think about the problem

As most of us being proud of diving into our problems, and present our solutions. Over time, we learn a few tools to align colleagues/fellow hackers with our thoughts. Here are a few:

Focus on the right problem to work on

If I had an hour to solve a problem I'd spend 55 minutes thinking about the problem and 5 minutes thinking about solutions ― Albert Einstein

It is a surprisingly difficult skill to learn how to work on the right problem. Here are a few tips:

Communicating the results

I'm not a great programmer; I'm just a good programmer with great habits - Kent Beck

Writing a readable notebook and explaining the result is a great habit. 📜 Clean code in Jupyter notebooks is our go-to guide on how to create a clean notebook.

Keep your analysis reproducible

Reproducibility is important because it is the only thing that an investigator can guarantee about a study. -- Roger Peng
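
One small, concrete habit that supports reproducibility is seeding any randomness in your analysis. The sketch below uses only Python's standard `random` module; the `sample_rows` helper, the default seed, and the stand-in data are hypothetical names chosen for illustration, not part of any Mobify tooling.

```python
import random


def sample_rows(rows, k, seed=42):
    """Draw k rows with a private, seeded generator so reruns give the same sample."""
    rng = random.Random(seed)  # local generator; leaves the global random state alone
    return rng.sample(rows, k)


data = list(range(100))  # stand-in for real records

first = sample_rows(data, 5)
second = sample_rows(data, 5)

print(first == second)  # same seed and same input, so the samples match
```

Passing the seed explicitly, rather than relying on hidden global state, means a notebook cell can be re-run out of order and still produce the same sample.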


😈 Disclaimer

We are a data shop with engineering focus shop and is opinionated towards selecting easy to get started tools that work with our well with our stack (e.g. Python, Jupyter Notebook) - this is a way that we found it works well for us.

We have no affiliation to any of the companies mentioned in this list.
