Analysis of ride-sharing data for PyBer, graphing data for different region types with Matplotlib, pandas, and Python. Creates line, bar, scatter, bubble, pie, and box-and-whisker plots with Matplotlib, and computes the mean, median, and mode with pandas, NumPy, and SciPy statistics.

tjavaheripour/PyBer_Analysis

PyBer_Analysis

Overview

In this project, we used pandas and Python to create a summary DataFrame of the ride-sharing data by city type (Urban, Suburban, or Rural). Then, using pandas and Matplotlib, we created a multiple-line graph showing the total fares for each week by city type from January to early May 2019. The analysis is intended to help PyBer teams compare ride demand, driver supply, and fares across city types when making decisions.
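
A summary of this kind can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook: the toy data and the column names (`city`, `type`, `driver_count`, `fare`) are assumptions made for the example.

```python
import pandas as pd

# Toy data invented for illustration; this is NOT the PyBer dataset.
# Column names are assumptions, not confirmed by the project files.
city_df = pd.DataFrame({
    "city": ["A", "B", "C"],
    "type": ["Urban", "Suburban", "Rural"],
    "driver_count": [10, 4, 1],
})
ride_df = pd.DataFrame({
    "city": ["A", "A", "A", "B", "B", "C"],
    "fare": [10.0, 20.0, 30.0, 30.0, 50.0, 45.0],
})

# Attach each ride's city type by joining on the city name.
rides = ride_df.merge(city_df, on="city", how="left")

# Per-type totals. Drivers are summed from the city table so that
# a city's driver count is not repeated once per ride.
total_rides = rides.groupby("type")["fare"].count()
total_fares = rides.groupby("type")["fare"].sum()
total_drivers = city_df.groupby("type")["driver_count"].sum()

# Series with the same index align automatically in the constructor.
summary_df = pd.DataFrame({
    "Total Rides": total_rides,
    "Total Drivers": total_drivers,
    "Total Fares": total_fares,
    "Average Fare per Ride": total_fares / total_rides,
    "Average Fare per Driver": total_fares / total_drivers,
})
print(summary_df)
```

Summing drivers from the city-level table rather than the merged ride table is a design choice for this sketch; it avoids counting the same drivers once for every ride in a city.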

Results

The summary shows that cities with larger populations have a higher demand for rides, so they need more drivers, and total fares are higher as a result. It also shows greater demand for PyBer among riders in urban cities than in suburban and rural cities. On the other hand, the average fare per ride and the average fare per driver are lower in the more populous areas.

PyBer Summary DataFrame

summary dataframe.png

As a result, rural cities are a better market for drivers: on average, drivers there earn about 3.5 times as much as those in urban areas and about 1.5 times as much as those in suburban areas. On the other hand, because less populated cities have fewer rides on average, rides there are fairly expensive for travelers.

Multiple Line Chart

As shown in the multiple-line graph, urban cities have the highest total weekly fares and rural cities the lowest throughout this period. Moreover, all three city types show a peak around the third week of February.

PyBer_fare_summary.png
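
One way to build a weekly fares-by-city-type chart with pandas and Matplotlib is sketched below. The toy data, the column names (`date`, `type`, `fare`), and the date range are assumptions for illustration only, and the steps are not necessarily the ones used in the project.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Toy data invented for illustration; this is NOT the PyBer dataset.
rides = pd.DataFrame({
    "date": ["2019-01-01", "2019-01-02", "2019-01-03",
             "2019-01-08", "2019-01-09"],
    "type": ["Urban", "Urban", "Rural", "Urban", "Suburban"],
    "fare": [10.0, 15.0, 40.0, 20.0, 30.0],
})
rides["date"] = pd.to_datetime(rides["date"])

# One row per date, one column per city type, summed fares as values.
fares_by_date = rides.pivot_table(
    index="date", columns="type", values="fare", aggfunc="sum"
)

# Restrict to the period of interest (an arbitrary range for this sketch).
fares_by_date = fares_by_date.loc["2019-01-01":"2019-01-13"]

# Resample into weekly bins (weeks ending on Sunday) and sum the fares.
weekly_fares = fares_by_date.resample("W").sum()

# Draw one line per city type.
fig, ax = plt.subplots(figsize=(10, 4))
weekly_fares.plot(ax=ax)
ax.set_title("Total Fare by City Type")
ax.set_ylabel("Fare ($USD)")
ax.set_xlabel("")
```

With real data, `fig.savefig(...)` could then write the chart to an image file; the output path is left to the reader.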

Summary

Based on the results from both the summary DataFrame and the multiple-line plot of weekly total fares by city type, I would recommend the following to the CEO:

  1. The ratio of total rides to total drivers differs across the three city types: it is 1.6, 1.3, and 0.7 for rural, suburban, and urban areas respectively, which means there is more need for drivers in less populated areas. My recommendation is to stop hiring drivers in urban cities and to recruit them in rural areas instead.
  2. As shown above, ride fares in rural areas are higher, so the company should direct its marketing programs toward incentivizing drivers to work in those areas.
  3. The company should focus on attracting more customers in rural and suburban areas, as demand there seems to be higher relative to driver supply.
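
The rides-per-driver ratio behind the first recommendation is a simple division. In the sketch below the ride and driver counts are invented, chosen only so that they reproduce the quoted ratios of 1.6, 1.3, and 0.7; they are not the project's actual totals. Treating a ratio above 1 as a sign of driver shortage is likewise an assumption of this example.

```python
import pandas as pd

# Hypothetical counts, chosen only to reproduce the quoted ratios.
counts = pd.DataFrame(
    {"total_rides": [16, 13, 7], "total_drivers": [10, 10, 10]},
    index=["Rural", "Suburban", "Urban"],
)

# Rides per driver for each city type.
counts["rides_per_driver"] = counts["total_rides"] / counts["total_drivers"]

# City types with more rides than drivers (threshold of 1 is an assumption).
undersupplied = counts.index[counts["rides_per_driver"] > 1].tolist()

print(counts["rides_per_driver"].round(1))
print(undersupplied)
```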
