The objective of this project is to predict the fuel efficiency of vehicles, measured in miles per gallon (MPG), from other information about each vehicle. To do this, I used historical data on vehicles from the 1970s and 1980s, in which MPG is a continuous target value.
To accomplish this, I need to build an end-to-end supervised machine learning pipeline. Once the pipeline is designed and implemented, it will be submitted to the company's lead data scientist for prediction purposes.
Here are the steps I will take to build my pipeline:
- Data Collection: I will use the Auto MPG dataset obtained from the UCI ML Repository.
- Data Exploration: Identify the most important features and try combining them in new ways.
- Data Preprocessing: Lay out a pipeline of tasks that transform the data for use in my machine learning model.
- Model Selection & Hyperparameter Tuning: Cross-validate a few models and fine-tune the hyperparameters of those that show promising predictions.
- Model Assessment: Determine the performance of the final trained model.
- Feature Importance Analysis
- Conclusion & Recommendations
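The preprocessing, cross-validation, and tuning steps above can be sketched roughly as follows. This is a minimal illustration assuming scikit-learn and NumPy, not the exact code from my notebook: the function names, the choice of `LinearRegression` and `RandomForestRegressor`, the grid values, and the column layout are all assumptions, and `make_toy_data` produces synthetic stand-in rows rather than the real Auto MPG data.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def make_toy_data(n=60, seed=0):
    """Synthetic stand-in data (NOT Auto MPG): weight, horsepower, origin."""
    rng = np.random.default_rng(seed)
    weight = rng.uniform(1500, 5000, n)
    horsepower = rng.uniform(50, 230, n)
    horsepower[::10] = np.nan  # mimic a few missing values
    origin = rng.integers(1, 4, n).astype(float)  # categorical code 1..3
    X = np.column_stack([weight, horsepower, origin])
    y = 45 - 0.007 * weight + rng.normal(0, 1, n)  # made-up relationship
    return X, y


def build_pipeline(model, numeric_idx, categorical_idx):
    """Chain preprocessing and a regressor into one estimator."""
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),  # fill missing values
        ("scale", StandardScaler()),                   # standardize features
    ])
    preprocess = ColumnTransformer([
        ("num", numeric, numeric_idx),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_idx),
    ])
    return Pipeline([("preprocess", preprocess), ("model", model)])


def cross_validate_rmse(pipeline, X, y, folds=5):
    """Mean root-mean-squared error across the cross-validation folds."""
    scores = cross_val_score(
        pipeline, X, y, cv=folds, scoring="neg_mean_squared_error"
    )
    return float(np.sqrt(-scores).mean())


def tune_random_forest(X, y, numeric_idx, categorical_idx, folds=3):
    """Grid-search a small, illustrative hyperparameter grid."""
    pipe = build_pipeline(
        RandomForestRegressor(random_state=0), numeric_idx, categorical_idx
    )
    grid = {"model__n_estimators": [10, 30], "model__max_depth": [3, None]}
    search = GridSearchCV(pipe, grid, cv=folds, scoring="neg_mean_squared_error")
    search.fit(X, y)
    return search
```

Calling `cross_validate_rmse(build_pipeline(LinearRegression(), [0, 1], [2]), X, y)` on each candidate model gives a comparable error figure, and `tune_random_forest(...).best_params_` reports the best combination found in the grid. Keeping the preprocessing inside the pipeline means it is refit on each training fold, so no information leaks from the validation fold.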
Download the dataset from http://archive.ics.uci.edu/ml/datasets/Auto+MPG
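Once the file is downloaded, each record has to be parsed before it can be explored. The sketch below uses only the Python standard library and assumes the layout of the `auto-mpg.data` file: eight whitespace-separated numeric fields, `?` marking a missing value, and the car name in double quotes at the end of the line. The column names and the helper names `parse_auto_mpg_line` and `load_auto_mpg` are my own labels for illustration.

```python
# Column names are assumed labels for the eight numeric fields, in file order.
COLUMNS = [
    "mpg", "cylinders", "displacement", "horsepower",
    "weight", "acceleration", "model_year", "origin",
]


def parse_auto_mpg_line(line):
    """Parse one record into a dict; missing values ('?') become None."""
    # Everything before the first double quote is numeric; the rest is the name.
    numeric_part, _, name_part = line.partition('"')
    values = numeric_part.split()
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} numeric fields, got {len(values)}")
    row = {
        col: (None if value == "?" else float(value))
        for col, value in zip(COLUMNS, values)
    }
    row["car_name"] = name_part.strip().rstrip('"')
    return row


def load_auto_mpg(path):
    """Read every non-empty line of the data file into a list of dicts."""
    with open(path, encoding="utf-8") as handle:
        return [parse_auto_mpg_line(line) for line in handle if line.strip()]
```

Rows where a field is `None` can then be dropped or imputed during preprocessing.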
I used JupyterHub for my solution. You can download Fuel prediciton consumption _ machine learning from my repository and try it.