Delhivery, India's leading integrated logistics and supply chain services provider, seeks to revolutionize commerce by leveraging world-class infrastructure, top-tier logistics operations, and cutting-edge technology. The data team plays a pivotal role in this mission, using advanced analytics to enhance the quality, efficiency, and profitability of Delhivery's operations, setting them apart from competitors.
Delhivery's data team is tasked with understanding and optimizing the vast amounts of data generated by their logistics operations. This includes cleaning, sanitizing, and manipulating data to extract valuable insights and features. Additionally, they are responsible for assisting the data science team in building forecasting models that can accurately predict various aspects of the logistics process, ultimately improving overall efficiency and customer satisfaction.
In this project, we dive deep into Delhivery's data engineering pipelines, working with a dataset that provides insights into trip creation times, route schedules, transportation types, trip details, and more. Our goal is to not only clean and process this data but also to derive meaningful insights and recommendations that can drive strategic decision-making and propel Delhivery towards even greater success in the dynamic logistics landscape.
- Python: Utilized for overall data analysis and manipulation.
- Numpy and Pandas: Employed for efficient data handling, cleaning, and preprocessing.
- Seaborn and Matplotlib.pyplot: Used for data visualization to identify trends and patterns.
- Statistical/Hypothesis Testing: Applied t-tests and chi-2 tests to validate insights and identify significant differences.
- Data Cleaning and Preprocessing: Ensured data quality by handling missing values, outliers, and transforming raw data into a usable format.
- Feature Engineering: Created new features to enhance model accuracy and provide deeper insights.
- Encoding: Converted categorical data into numerical formats for analysis.
- Outliers Handling using IQR Method: Identified and managed outliers to maintain data integrity and improve analysis accuracy.
By combining these technologies and strategies, I was able to extract valuable insights and provide actionable recommendations for operational improvement in a top logistics and supply chain company.
- Delivery Methods: Carting is used for 60% of deliveries, while FTL is used for 40%.
- Trip Types: 80% of trips are intra-state, and 20% are inter-state.
- Data Availability: Only 22 days of data are available.
- Trip Initiation: Slightly fewer trips (585) are initiated on Sundays compared to weekdays (over 650 trips).
- Busiest States: Maharashtra, Karnataka, Haryana, and Tamil Nadu contribute over 50% of the business.
- Feature Correlation: Numeric features in the dataset are highly and positively correlated.
- Time Differences: Significant differences exist between average trip time and average scanning time from start to end.
- Actual vs. Estimated Time: Significant differences are observed between actual trip completion time and estimated time.
- Segment Time Consistency: No significant difference between average actual time and the average sum of each segment's actual time.
- Distance Discrepancies: Significant differences exist between actual trip distances and estimated distances.
- Segment Time Estimates: Significant differences between average estimated trip time and the sum of each segment's estimated time.
- Route Type Dependency: Route type (Carting or FTL) significantly depends on whether the trip is intra-state or inter-state.
- Optimize Route Types:
- For Intra-State Trips: Since Carting is predominantly used, ensure that the carting process is optimized for shorter distances. Implement regular maintenance schedules for the carts and provide driver training to enhance efficiency.
- For Inter-State Trips: FTL should be optimized by planning routes that minimize travel time and maximize truckload utilization. Use data-driven insights to predict and avoid high-traffic periods.
- Enhance Forecasting Models:
- Utilize the highly correlated numeric features to build robust forecasting models that can predict delivery times and distances more accurately. This can help in better planning and resource allocation.
- Improve Sunday Operations:
- Since fewer trips are initiated on Sundays, consider offering incentives for drivers to work on Sundays or implementing dynamic pricing to attract more customers.
- Focus on Busiest States:
- Allocate more resources to Maharashtra, Karnataka, Haryana, and Tamil Nadu to handle the high volume of business. This could include additional vehicles, drivers, and support staff to maintain service quality.
- Address Time and Distance Discrepancies:
- Investigate the reasons behind the significant differences between actual times/distances and the estimates provided by the open-source routing tool. This could involve updating the routing algorithms with more accurate real-time traffic data and local road conditions.
- Data Collection and Analysis:
- Expand the data collection period beyond 22 days to get a more comprehensive view of operations. This will help in identifying long-term trends and making more informed decisions.
- Segment Analysis:
- Since there is no significant difference between average actual time and the average sum of each segment's actual time, ensure that the segment data is accurately recorded and used for micro-optimizations within each trip.
- Regular Reviews and Updates:
- Continuously review and update the routing algorithms and models to reflect the latest data and trends. This will help in maintaining the accuracy and efficiency of the delivery process.
- Training and Development:
- Provide regular training to drivers and logistics personnel on using new tools and technologies, as well as best practices in logistics and route management.
- Customer Feedback:
- Collect and analyze customer feedback regarding delivery times and route efficiency. Use this data to make customer-centric improvements in the logistics operations.
These recommendations, when implemented, can help improve logistics operations, enhance customer satisfaction, and maintain a competitive edge in the market.