Data Analysis on US Bike Share Data. This project contains three CSV files as datasets: chicago.csv, new_york_city.csv and washington.csv. The project reads these datasets and analyses the data.
This project needs two major libraries, NumPy and pandas:
- Enter
conda install numpy
- Enter
conda install pandas
Randomly selected data for the first six months of 2017 are provided for all three cities. All three of the data files contain the same core six (6) columns:
- Start Time (e.g., 2017-01-01 00:07:57)
- End Time (e.g., 2017-01-01 00:20:53)
- Trip Duration (in seconds - e.g., 776)
- Start Station (e.g., Broadway & Barry Ave)
- End Station (e.g., Sedgwick St & North Ave)
- User Type (Subscriber or Customer)
The Chicago and New York City files also have the following two columns:
- Gender
- Birth Year
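As a minimal sketch of how one of these files could be read with pandas, the snippet below parses the two timestamp columns and derives month, weekday and hour from Start Time. The names `CITY_FILES` and `load_city` are hypothetical helpers introduced for illustration, not necessarily what the project code uses, and the one-row sample is made up in the documented column layout so that the example runs without the real files.

```python
import io

import pandas as pd

# File names as given in this README; the dictionary itself is a hypothetical helper.
CITY_FILES = {
    "chicago": "chicago.csv",
    "new york city": "new_york_city.csv",
    "washington": "washington.csv",
}


def load_city(source):
    """Read one city file (path or file-like object) and add time-based columns."""
    df = pd.read_csv(source, parse_dates=["Start Time", "End Time"])
    # Derived columns used later for filtering and for the travel-time statistics.
    df["month"] = df["Start Time"].dt.month
    df["day_of_week"] = df["Start Time"].dt.day_name()
    df["hour"] = df["Start Time"].dt.hour
    return df


# Made-up one-row sample using the example values from the column list above.
sample_csv = io.StringIO(
    "Start Time,End Time,Trip Duration,Start Station,End Station,User Type\n"
    "2017-01-01 00:07:57,2017-01-01 00:20:53,776,"
    "Broadway & Barry Ave,Sedgwick St & North Ave,Subscriber\n"
)
sample = load_city(sample_csv)
print(sample[["month", "day_of_week", "hour"]])
```

With the real files in the working directory, `load_city(CITY_FILES["chicago"])` would read the Chicago data in the same way.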
- In this project, the three datasets are the input files, and their data is filtered according to the user's requirements.
- The program has an interactive interface in which the user chooses the city whose data is to be analysed.
- Once the city is chosen, the user chooses how the data is filtered. Three choices are offered: filter by month, by day, or neither.
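Selecting trips by month or day, as described above, amounts to filtering rows on the Start Time column. The following is a sketch under assumptions: the function name `filter_trips`, the `"all"` sentinel for "no filter", and the three sample rows are all hypothetical and only illustrate the idea.

```python
import pandas as pd


def filter_trips(df, month="all", day="all"):
    """Keep trips in the given month (1-6) and/or on the given weekday name.

    Passing "all" (the default) for either argument skips that filter.
    """
    start = pd.to_datetime(df["Start Time"])
    mask = pd.Series(True, index=df.index)
    if month != "all":
        mask &= (start.dt.month == month)
    if day != "all":
        # day_name() returns capitalised names such as "Sunday".
        mask &= (start.dt.day_name() == day.title())
    return df[mask]


# Made-up sample rows: two Sundays (one in January, one in February) and one Monday.
trips = pd.DataFrame(
    {
        "Start Time": [
            "2017-01-01 00:07:57",
            "2017-01-02 09:00:00",
            "2017-02-05 12:30:00",
        ],
        "Trip Duration": [776, 300, 1200],
    }
)

january = filter_trips(trips, month=1)
sundays = filter_trips(trips, day="sunday")
january_sundays = filter_trips(trips, month=1, day="sunday")
everything = filter_trips(trips)
print(len(january), len(sundays), len(january_sundays), len(everything))
```

Building a single boolean mask keeps the month and day filters independent, so the "neither" choice simply leaves the data unchanged.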
The data is read and filtered according to the user's specifications. Once the required data has been selected, the following statistics are computed:
- Popular travel time
  - Most common month
  - Most common day of the week
  - Most common hour of the day
- Popular stations and trip
  - Most common start station
  - Most common end station
  - Most common trip (i.e. the most frequent combination of start station and end station)
- Travel duration
  - Total travel time
  - Average travel time
- User information
  - Count of each user type
  - Count of each gender (New York City and Chicago only)
  - Oldest and youngest customers, i.e. earliest and most recent year of birth (New York City and Chicago only)
  - Most common year of birth (New York City and Chicago only)
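The statistics listed above map fairly directly onto pandas operations such as `mode()`, `sum()`, `mean()` and `value_counts()`. The sketch below is one possible way to compute them and is not necessarily how the project does it: the function name `summarize`, the dictionary keys, and the three sample rows are assumptions made for illustration. When several values tie for most common, `mode()[0]` returns the first of the sorted candidates.

```python
import pandas as pd


def summarize(df):
    """Compute the listed statistics for an already filtered DataFrame."""
    start = pd.to_datetime(df["Start Time"])
    # A trip is the combination of start station and end station.
    trip = df["Start Station"] + " -> " + df["End Station"]
    stats = {
        "common_month": start.dt.month.mode()[0],
        "common_day": start.dt.day_name().mode()[0],
        "common_hour": start.dt.hour.mode()[0],
        "common_start": df["Start Station"].mode()[0],
        "common_end": df["End Station"].mode()[0],
        "common_trip": trip.mode()[0],
        "total_duration": df["Trip Duration"].sum(),
        "mean_duration": df["Trip Duration"].mean(),
        "user_types": df["User Type"].value_counts().to_dict(),
    }
    # Gender and Birth Year exist only in the Chicago and New York City files.
    if "Gender" in df.columns:
        stats["genders"] = df["Gender"].value_counts().to_dict()
    if "Birth Year" in df.columns:
        stats["earliest_birth_year"] = int(df["Birth Year"].min())
        stats["latest_birth_year"] = int(df["Birth Year"].max())
        stats["common_birth_year"] = int(df["Birth Year"].mode()[0])
    return stats


# Made-up sample rows in the documented column layout.
rides = pd.DataFrame(
    {
        "Start Time": ["2017-01-01 00:07:57", "2017-01-08 00:30:00", "2017-02-06 09:00:00"],
        "Trip Duration": [776, 300, 1200],
        "Start Station": ["Broadway & Barry Ave", "Broadway & Barry Ave", "Sedgwick St & North Ave"],
        "End Station": ["Sedgwick St & North Ave", "Sedgwick St & North Ave", "Broadway & Barry Ave"],
        "User Type": ["Subscriber", "Subscriber", "Customer"],
        "Gender": ["Male", "Female", "Female"],
        "Birth Year": [1985, 1990, 1990],
    }
)

result = summarize(rides)
# Washington has no Gender or Birth Year columns, so those entries are skipped.
washington_like = summarize(rides.drop(columns=["Gender", "Birth Year"]))
for name, value in result.items():
    print(f"{name}: {value}")
```

Checking for the column before using it lets the same function handle all three cities without a city-specific branch.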
- Stack Overflow
- Udacity Python Foundation Nanodegree Program
- GitHub