This project predicts how long a taxi trip between two points in a city will take, given the pickup date and time and the pickup and dropoff coordinates. Cab aggregators such as Uber and Lyft need to solve this problem to price trips efficiently and to estimate when a cab will be available to pick up the next customer.
The dataset consisted of 5 independent variables and 1 response variable.
- Independent variables:
  - pickup_datetime: Timestamp of pickup
  - pickup_x, pickup_y, dropoff_x, dropoff_y: Pickup and dropoff coordinates
- Response variable:
  - duration: Duration of the trip
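
As a rough illustration of the schema above, one record could be parsed and turned into a few candidate features as sketched below. The `Trip` class, the `parse_trip` and `trip_features` helpers, and the timestamp format are assumptions made for this example, not part of the project's code. The straight-line distance is likewise just one plausible derived feature.

```python
import math
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Trip:
    """One row of the dataset; field names follow the column list above."""
    pickup_datetime: datetime
    pickup_x: float
    pickup_y: float
    dropoff_x: float
    dropoff_y: float
    duration: float  # response variable; its unit is not stated in the source


def parse_trip(row: dict) -> Trip:
    """Build a Trip from a dict of strings (e.g. a csv.DictReader row)."""
    # Assumption: timestamps look like "2015-03-14 08:30:00".
    return Trip(
        pickup_datetime=datetime.strptime(row["pickup_datetime"], "%Y-%m-%d %H:%M:%S"),
        pickup_x=float(row["pickup_x"]),
        pickup_y=float(row["pickup_y"]),
        dropoff_x=float(row["dropoff_x"]),
        dropoff_y=float(row["dropoff_y"]),
        duration=float(row["duration"]),
    )


def trip_features(trip: Trip) -> dict:
    """Derive simple date, time, and distance features from one trip."""
    dx = trip.dropoff_x - trip.pickup_x
    dy = trip.dropoff_y - trip.pickup_y
    return {
        "weekday": trip.pickup_datetime.weekday(),  # 0 = Monday, 6 = Sunday
        "hour": trip.pickup_datetime.hour,
        "straight_line_distance": math.hypot(dx, dy),
    }
```

The distance is expressed in whatever units the coordinates use, so it is only meaningful relative to other trips in the same dataset.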
The data exploration investigated the concentration of trips across the region, their distribution across weekdays and times of day, and any influence of holidays or seasons.
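
The kind of aggregation behind such an exploration can be sketched with the standard library alone. This is a minimal example, not the analysis actually performed. The helper names and the grid `cell_size` are hypothetical, and holiday or seasonal effects would additionally need a calendar that is not part of the dataset.

```python
import math
from collections import Counter
from datetime import datetime


def count_by_weekday_and_hour(pickup_times):
    """Count trips per (weekday, hour) pair; weekday 0 is Monday."""
    return Counter((t.weekday(), t.hour) for t in pickup_times)


def count_by_grid_cell(points, cell_size):
    """Count points per square grid cell to show where trips concentrate.

    `points` is an iterable of (x, y) pairs; `cell_size` is the side length
    of a cell in the same units as the coordinates (an assumed parameter).
    """
    return Counter(
        (math.floor(x / cell_size), math.floor(y / cell_size))
        for x, y in points
    )
```

For example, `count_by_grid_cell(pickups, 1.0).most_common(10)` would list the ten busiest pickup cells, where `pickups` is a list of `(pickup_x, pickup_y)` pairs.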