This project is part of the Udacity data analysis nanodegree.
In this project, I explore bike share data for three major cities in the United States: Chicago, New York City, and Washington. After taking input from the user, the script answers questions about the data by computing descriptive statistics with the pandas library.
Language: Python 3.7 or above
Supported OS: Linux
To install the project requirements, navigate to the project directory in a terminal and run one of the following commands.
conda env create -f environment.yml
or
pip install -r requirements.txt
To start the script, run the commands below in a terminal from the project directory.
conda activate bikeshare
python bikeshare.py
The script uses the bullet library to take input from the user. The user first chooses one of the three cities listed above, then chooses the filter to apply before the statistics are computed.
Available filters:
- Month: filter by a specific month only
- Day: filter by a specific day of the week only
- Both: filter by a specific month and day of the week
- None: no filters
Depending on the filter chosen, the user is then prompted for the month, the day of the week, or both.
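The filters above could be applied with pandas roughly as follows. This is a minimal sketch, not the script's actual code: the function name `filter_trips`, the column name `Start Time`, and the sample rows are assumptions made for illustration.

```python
import pandas as pd


def filter_trips(df, month=None, day=None):
    """Return the trips that match an optional month name and/or weekday name.

    Passing neither argument corresponds to the "None" filter.
    """
    # "Start Time" is an assumed column name holding the trip start timestamp.
    start = pd.to_datetime(df["Start Time"])
    mask = pd.Series(True, index=df.index)
    if month is not None:
        mask &= start.dt.month_name().str.lower() == month.lower()
    if day is not None:
        mask &= start.dt.day_name().str.lower() == day.lower()
    return df[mask]


# Made-up sample rows, only to exercise the function.
trips = pd.DataFrame({
    "Start Time": [
        "2017-01-02 09:00:00",  # a Monday in January
        "2017-01-03 10:30:00",  # a Tuesday in January
        "2017-02-06 08:15:00",  # a Monday in February
    ],
})

january = filter_trips(trips, month="January")
mondays = filter_trips(trips, day="Monday")
january_mondays = filter_trips(trips, month="January", day="Monday")
unfiltered = filter_trips(trips)
```

Building a single boolean mask keeps the four filter choices in one code path, since each choice only decides which conditions are added.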
Station statistics:
- Most used start station
- Most used end station
- Most used combination of start and end stations
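As an illustration, the three station statistics can be computed with `mode` and `groupby`. The sketch below assumes the columns are named `Start Station` and `End Station`; the function name `station_stats` and the station names are hypothetical.

```python
import pandas as pd


def station_stats(df):
    """Return the most used start station, end station, and (start, end) pair."""
    # Column names are assumptions about the data files.
    top_start = df["Start Station"].mode()[0]
    top_end = df["End Station"].mode()[0]
    # Count each (start, end) combination and take the most frequent one.
    top_pair = df.groupby(["Start Station", "End Station"]).size().idxmax()
    return top_start, top_end, top_pair


# Made-up sample trips.
trips = pd.DataFrame({
    "Start Station": ["Canal St", "Canal St", "State St", "Canal St"],
    "End Station": ["State St", "State St", "Lake Ave", "State St"],
})

top_start, top_end, top_pair = station_stats(trips)
```

When several stations tie, `mode()[0]` returns the first one in sorted order, so a real script may want to report ties explicitly.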
Trip duration statistics:
- Total trip duration
- Average trip duration
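A possible way to compute these two values is shown below. It assumes a numeric `Trip Duration` column measured in seconds, which the text above does not confirm; `duration_stats` is a hypothetical helper name.

```python
import pandas as pd


def duration_stats(df):
    """Return total and average trip duration as pandas Timedelta objects."""
    # Assumption: "Trip Duration" holds the trip length in seconds.
    total_seconds = float(df["Trip Duration"].sum())
    mean_seconds = float(df["Trip Duration"].mean())
    return (
        pd.to_timedelta(total_seconds, unit="s"),
        pd.to_timedelta(mean_seconds, unit="s"),
    )


# Made-up durations: 10, 15, and 20 minutes.
trips = pd.DataFrame({"Trip Duration": [600, 900, 1200]})
total, average = duration_stats(trips)
```

Converting to `Timedelta` makes the result readable as days, hours, and minutes rather than as a raw count of seconds.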
User statistics:
- Subscribers vs. customers distribution
- Gender distribution
- Earliest year of birth, most recent year of birth and most common year of birth
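The user statistics can be sketched with `value_counts`. The column names `User Type`, `Gender`, and `Birth Year` are assumptions, as is the helper name `user_stats`. The sketch also checks whether the gender and birth year columns are present, on the assumption that not every city's data includes them.

```python
import pandas as pd


def user_stats(df):
    """Return user type counts and, where available, gender and birth year stats."""
    stats = {"user_types": df["User Type"].value_counts().to_dict()}
    # Assumption: these columns may be missing for some cities.
    if "Gender" in df.columns:
        stats["gender"] = df["Gender"].value_counts().to_dict()
    if "Birth Year" in df.columns:
        years = df["Birth Year"].dropna()
        stats["birth_year"] = {
            "earliest": int(years.min()),
            "most_recent": int(years.max()),
            "most_common": int(years.mode()[0]),
        }
    return stats


# Made-up sample users; one row has no gender or birth year recorded.
trips = pd.DataFrame({
    "User Type": ["Subscriber", "Subscriber", "Customer", "Subscriber"],
    "Gender": ["Male", "Female", None, "Female"],
    "Birth Year": [1985.0, 1990.0, None, 1990.0],
})

stats = user_stats(trips)
```

Missing values are left out of the counts because `value_counts` drops them by default.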
The user is asked whether they wish to view individual raw trip data. If the user inputs "yes", the data of 5 trips is shown in raw format.
The same prompt is repeated until the user inputs "no". Finally, the user is asked whether they wish to restart the exploration.
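The raw data prompt can be sketched as a loop over chunks of 5 rows. The names `iter_raw_chunks` and `show_raw_data` are hypothetical, and the prompt uses the built-in `input` rather than the bullet library to keep the example self-contained. In this sketch any answer other than "yes" ends the loop.

```python
import pandas as pd


def iter_raw_chunks(df, size=5):
    """Yield consecutive slices of `size` rows until the data runs out."""
    for start in range(0, len(df), size):
        yield df.iloc[start:start + size]


def show_raw_data(df, ask=input):
    """Print 5 trips at a time for as long as the user answers "yes".

    Returns the number of rows shown.
    """
    shown = 0
    for chunk in iter_raw_chunks(df):
        answer = ask("View 5 rows of raw trip data? (yes/no): ")
        if answer.strip().lower() != "yes":
            break
        print(chunk.to_string())
        shown += len(chunk)
    return shown


# Made-up data with 12 rows, so the last chunk holds only 2 rows.
trips = pd.DataFrame({"Trip Duration": range(12)})
chunk_sizes = [len(chunk) for chunk in iter_raw_chunks(trips)]
```

Passing the prompt function as the `ask` parameter lets the loop be exercised without a terminal, for example with a function that returns scripted answers.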
- To do: add visualizations using plotly or termplotlib