Over the past decade, bicycle-sharing systems have been growing in number and popularity in cities across the world. Bicycle-sharing systems allow users to rent bicycles for short trips, typically 30 minutes or less. Thanks to the rise in information technologies, it is easy for a user of the system to access a dock within the system to unlock or return bicycles. These technologies also provide a wealth of data that can be used to explore how these bike-sharing systems are used.
In this project, we focus on the records of individual trips taken in 2016 in three selected cities: New York City, Chicago, and Washington, DC. Each of these cities has a page where the trip data can be downloaded freely.
If you visit these pages, you will notice that each city has a different way of delivering its data. Chicago publishes new data twice a year, Washington, DC does so quarterly, and New York City monthly.
Table of Contents |
---|
Prerequisites 🔍📜 |
Design 📐 |
Conclusions 📌 |
License 🔖 |
- Python 3.6.3
- Jupyter Notebook
- Anaconda-Navigator
Data is provided by Motivate, a bike-share system provider for many major cities in the United States. The project has two goals:
- Compare the system usage between three large cities: New York City, Chicago, and Washington, DC.
- Examine whether there are any differences within each system between registered, regular users and short-term, casual users.
In this project, Python is the main tool. It is used to explore data from three major bikeshare systems in the United States, to perform data wrangling that unifies the format of the data from the three systems, and to compute descriptive statistics. External packages beyond the Python standard library are introduced to help visualize the data.
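The wrangling step described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the project's actual code. The column names, timestamp formats, and user-type labels in `CITY_FORMATS` and `USER_TYPES` are assumptions made for the example; the real files may use different headers, so check them before reusing any of this.

```python
import csv
import io
from datetime import datetime
from statistics import mean

# ASSUMPTION: hypothetical per-city column layouts; real CSV headers may differ.
CITY_FORMATS = {
    "NYC": {"duration": "tripduration", "units_per_minute": 60.0,
            "start": "starttime", "start_format": "%m/%d/%Y %H:%M:%S",
            "user": "usertype"},
    "Chicago": {"duration": "tripduration", "units_per_minute": 60.0,
                "start": "starttime", "start_format": "%m/%d/%Y %H:%M",
                "user": "usertype"},
    "Washington": {"duration": "Duration (ms)", "units_per_minute": 60000.0,
                   "start": "Start date", "start_format": "%m/%d/%Y %H:%M",
                   "user": "Member Type"},
}

# ASSUMPTION: one city labels its users differently; map to shared labels.
USER_TYPES = {"Registered": "Subscriber", "Casual": "Customer"}

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def condense_row(row, city):
    """Convert one raw trip record into the unified format."""
    fmt = CITY_FORMATS[city]
    start = datetime.strptime(row[fmt["start"]], fmt["start_format"])
    user = row[fmt["user"]]
    return {
        "city": city,
        # Normalize every duration to minutes.
        "duration_min": float(row[fmt["duration"]]) / fmt["units_per_minute"],
        "month": start.month,
        "hour": start.hour,
        "day_of_week": DAYS[start.weekday()],
        "user_type": USER_TYPES.get(user, user),
    }


def condense_file(lines, city):
    """Read CSV lines (an open file or any iterable of str) for one city."""
    return [condense_row(row, city) for row in csv.DictReader(lines)]


def mean_duration_by_user_type(trips):
    """Descriptive statistic: average trip length in minutes per user type."""
    groups = {}
    for trip in trips:
        groups.setdefault(trip["user_type"], []).append(trip["duration_min"])
    return {user_type: mean(values) for user_type, values in groups.items()}


# Tiny in-memory sample standing in for a real download.
sample = io.StringIO(
    "Duration (ms),Start date,Member Type\n"
    "600000,3/31/2016 22:57,Registered\n"
    "1200000,3/31/2016 23:10,Casual\n"
)
trips = condense_file(sample, "Washington")
print(mean_duration_by_user_type(trips))
```

Keeping the city-specific details in one dictionary means the same `condense_row` function can handle all three systems, and the descriptive statistics only ever see the unified fields.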
We have carried out a fair amount of analysis on a limited set of data. However, many other analyses could be performed that are not possible with only the data provided. For example, detailed location data has not been investigated. Where are the most commonly used docks? What are the most common routes? As another example, weather has the potential to have a large impact on daily ridership. How much is ridership affected when there is rain or snow? Are subscribers or customers affected more by changes in weather?
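As a rough illustration of how the dock and route questions could be approached, the sketch below counts start stations and start-to-end pairs with `collections.Counter`. It was not part of this analysis. The `start_station` and `end_station` keys are hypothetical field names, and the sketch assumes each trip record carries station names.

```python
from collections import Counter


def most_common_stations(trips, n=3):
    """Return the n most frequent start stations with their trip counts."""
    return Counter(trip["start_station"] for trip in trips).most_common(n)


def most_common_routes(trips, n=3):
    """Return the n most frequent (start, end) station pairs with counts."""
    routes = Counter(
        (trip["start_station"], trip["end_station"]) for trip in trips
    )
    return routes.most_common(n)


# Made-up records; the station names are placeholders, not real docks.
example_trips = [
    {"start_station": "A", "end_station": "B"},
    {"start_station": "A", "end_station": "B"},
    {"start_station": "A", "end_station": "C"},
    {"start_station": "B", "end_station": "A"},
]
print(most_common_stations(example_trips, n=1))
print(most_common_routes(example_trips, n=1))
```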
These skills and techniques can also be applied to other open data found on the internet, covering topics such as social media sentiment, sports, and finance.