This project predicts taxi tip amounts with Decision Tree Regressors implemented in both Scikit-Learn and Snap ML. The dataset describes taxi trips in New York City, and the goal is a regression model that predicts the tip amount of each trip accurately.

Technologies used:
- Python
- Jupyter Notebooks
- NumPy
- Pandas
- Matplotlib
- Scikit-Learn
- Snap ML

Key features:
- Data analysis and cleaning to prepare the dataset for machine learning.
- Implementation of Decision Tree Regressors with both Scikit-Learn and Snap ML.
- Comparison of training speed and model performance between Scikit-Learn and Snap ML.

Skills demonstrated:
- Data preprocessing techniques.
- Implementation of machine learning models for regression tasks.
- Evaluation and comparison of model performance.
- Efficient utilization of Snap ML for accelerated model training.

Results:
- Successfully built and trained regression models to predict taxi tip amounts.
- Demonstrated the speedup achieved using Snap ML compared to Scikit-Learn.

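
The cleaning, training, and evaluation steps listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the data is synthetic, and the column names (`trip_distance`, `fare_amount`, `passenger_count`, `tip_amount`), the cleaning rule, and the hyperparameters are assumptions chosen for the example.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the taxi data (column names are assumed, not from the repo).
rng = np.random.default_rng(0)
n = 2_000
raw = pd.DataFrame({
    "trip_distance": rng.uniform(0.5, 20.0, n),
    "fare_amount": rng.uniform(3.0, 80.0, n),
    "passenger_count": rng.integers(1, 5, n),
})
raw["tip_amount"] = 0.15 * raw["fare_amount"] + rng.normal(0.0, 0.5, n)

# Example cleaning rule (an assumption): keep positive tips no larger than the fare.
clean = raw[(raw["tip_amount"] > 0) & (raw["tip_amount"] <= raw["fare_amount"])]

# Separate features from the regression target.
X = clean.drop(columns="tip_amount").to_numpy(dtype=np.float32)
y = clean["tip_amount"].to_numpy(dtype=np.float32)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Fit a depth-limited tree and score it with mean squared error on held-out data.
model = DecisionTreeRegressor(max_depth=8, random_state=35)
model.fit(X_train, y_train)
mse = mean_squared_error(y_test, model.predict(X_test))
print(f"test MSE: {mse:.3f}")
```

Limiting `max_depth` keeps the tree small and the fit quick; the right value for the real dataset would need to be tuned.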
This repository is a practical example of regression modeling applied to taxi tip prediction. It trains the same kind of model with a traditional machine learning library (Scikit-Learn) and an accelerated one (Snap ML), and the comparison shows what a high-performance library can offer in training time.
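
One way to run such a training-time comparison is sketched below. It assumes the `snapml` package is installed and exposes `DecisionTreeRegressor` with `max_depth`, `random_state`, and `n_jobs` parameters; if the import fails, the sketch times only Scikit-Learn. The helper `time_fit`, the synthetic data, and the hyperparameters are hypothetical and are not taken from the repository.

```python
import time

import numpy as np
from sklearn.tree import DecisionTreeRegressor as SklearnDTR

try:
    # Snap ML is optional here; the comparison is skipped if it is not installed.
    from snapml import DecisionTreeRegressor as SnapDTR
except ImportError:
    SnapDTR = None


def time_fit(model, X, y):
    """Return the wall-clock seconds spent in model.fit(X, y)."""
    start = time.perf_counter()
    model.fit(X, y)
    return time.perf_counter() - start


# Synthetic regression data standing in for the prepared taxi features.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, (5000, 8)).astype(np.float32)
y = (X @ rng.uniform(0.0, 1.0, 8)).astype(np.float32)

timings = {"sklearn": time_fit(SklearnDTR(max_depth=8, random_state=35), X, y)}

if SnapDTR is not None:
    # n_jobs sets the number of CPU threads Snap ML uses for training.
    timings["snapml"] = time_fit(
        SnapDTR(max_depth=8, random_state=45, n_jobs=4), X, y
    )
    print(f"speedup: {timings['sklearn'] / timings['snapml']:.2f}x")

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
```

A single timed fit is noisy, so a fair comparison would repeat each measurement several times and also compare test error, since a faster fit is only useful if accuracy is comparable.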
Feel free to explore, contribute, and use this project as a reference for similar regression problems!