The importance consumers place on fuel economy has varied with economic conditions and environmental concerns. This project investigates how fuel economy has changed and how it relates to other car attributes. It presents an exploratory analysis of U.S. automobile fuel efficiency over time for single-fuel gasoline cars.
Currently, most motor vehicles worldwide are powered by gasoline or diesel. Other energy sources include ethanol, biodiesel, propane, compressed natural gas (CNG), electric batteries charged from an external source, and hydrogen. Because gasoline and diesel still dominate, it is important to understand how automobile fuel efficiency has improved over time and whether the data reveal any other interesting insights or trends.
Once the data is collected, cleaned, and processed, it is ready for analysis. In this phase, data analysis tools and software help in understanding and interpreting the data and in drawing conclusions based on the requirements. I used Python for all of the analysis in this project.
Gathering the data
The Environmental Protection Agency data was obtained from the U.S. Department of Energy's Fuel Economy Data. It is stored in the vehicles data set, which contains fuel efficiency performance metrics, measured in miles per gallon (MPG), over time. The data covers 39 years, from 1984 to 2022. Over those years, Regular Gasoline has been the most used primary fuel type, followed by Premium Gasoline; Natural Gas was the least used.
Metadata or data dictionary: https://www.fueleconomy.gov/feg/ws/index.shtml#vehicle
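As a minimal sketch of the gathering step, the snippet below reads vehicle records and keeps only single-fuel gasoline cars. The column names (`year`, `make`, `fuelType1`, `fuelType2`, `UCity`) are assumptions based on the data dictionary linked above, the helper `load_single_fuel_gasoline` is hypothetical, and the rows are made-up placeholders rather than real EPA measurements.

```python
import io

import pandas as pd

# Hypothetical rows for illustration only; these are NOT real EPA measurements.
# Column names are assumed to match the data dictionary.
SAMPLE_CSV = """year,make,fuelType1,fuelType2,UCity
1984,MakeA,Regular Gasoline,,20.0
1984,MakeB,Diesel,,30.0
2022,MakeA,Premium Gasoline,,28.0
2022,MakeC,Regular Gasoline,E85,24.0
2022,MakeD,Regular Gasoline,,32.0
"""


def load_single_fuel_gasoline(source):
    """Read vehicle records and keep cars that run on gasoline only."""
    df = pd.read_csv(source)
    # Primary fuel must be some grade of gasoline.
    is_gasoline = df["fuelType1"].str.contains("Gasoline", na=False)
    # An empty secondary fuel is read as NaN, which marks a single-fuel car.
    has_no_second_fuel = df["fuelType2"].isna()
    return df[is_gasoline & has_no_second_fuel].reset_index(drop=True)


gas = load_single_fuel_gasoline(io.StringIO(SAMPLE_CSV))

# Average city MPG per year for the filtered cars.
mean_city_mpg_by_year = gas.groupby("year")["UCity"].mean()
print(mean_city_mpg_by_year)
```

With the real data, the same function could be given the path to the downloaded vehicles file instead of the in-memory sample.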
The analysis uses the following Python libraries:
- Pandas (for data loading and analysis)
- NumPy (for numerical computing)
- Matplotlib (for visualizations)
- Seaborn (for visualizations)
- Jupyter (to run notebooks)
In this project, I answer a range of questions about the features of the vehicles dataset and their correlation with Ucity (miles per gallon):
- Which fuel types are commonly used in automobiles?
- How many cars have automatic and manual transmissions?
- Which car brands appear most frequently?
- Which vehicle class is most common?
- Which fuel type is most common?
- Which wheel drive type is most common?
- Has the number of unique models increased over the years?
- Which cars have the highest average Ucity?
- How does Ucity compare between fuel type 1 and fuel type 2?
- Which engine size was most popular over the years?
- How does the size of a car's engine affect its fuel consumption?
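Several of the questions above reduce to counting, grouping, and correlating columns. The following is a rough sketch of that pattern, not the project's actual notebook code. The column names (`make`, `trany`, `displ`, `UCity`) are assumptions based on the data dictionary, and the records are made-up placeholders rather than real measurements.

```python
import pandas as pd

# Hypothetical records for illustration only; NOT real EPA measurements.
cars = pd.DataFrame(
    {
        "make": ["MakeA", "MakeA", "MakeB", "MakeC", "MakeC", "MakeC"],
        "trany": [
            "Automatic 4-spd",
            "Manual 5-spd",
            "Automatic 6-spd",
            "Manual 5-spd",
            "Automatic 8-spd",
            "Automatic 8-spd",
        ],
        "displ": [1.6, 2.0, 3.0, 5.0, 2.0, 4.0],  # engine displacement, liters
        "UCity": [35.0, 30.0, 22.0, 14.0, 29.0, 18.0],  # city MPG
    }
)

# Most frequent brand: count rows per make and take the largest.
make_counts = cars["make"].value_counts()
top_make = make_counts.idxmax()

# Automatic vs. manual: the first word of the transmission label.
transmission_counts = cars["trany"].str.split().str[0].value_counts()

# Highest average city MPG: group by make, then sort descending.
avg_city_by_make = (
    cars.groupby("make")["UCity"].mean().sort_values(ascending=False)
)

# Engine size vs. fuel economy: Pearson correlation between the two columns.
displ_city_corr = cars["displ"].corr(cars["UCity"])

print(top_make)
print(transmission_counts)
print(avg_city_by_make)
print(f"Correlation between displ and UCity: {displ_city_corr:.2f}")
```

A negative correlation here would mean that larger engines tend to go with lower city MPG; on the real data the strength of that relationship has to be read from the actual output.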
The project also includes model evaluation. Cross-validation is a resampling procedure used to evaluate machine learning models on a limited data sample. I used 10-fold cross-validation to train and evaluate the following models:
- Random Forest Model
- Decision Tree Model
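To illustrate 10-fold cross-validation for the two models, here is a minimal sketch using scikit-learn. It assumes a regression setup (predicting city MPG from engine attributes), which the text above does not specify. The features, the target, and the scoring metric are all assumptions, and the data is synthetic stand-in data generated on the spot rather than the vehicles data set.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (an assumption for illustration, not the real data):
# two features resembling engine displacement and cylinder count, and a
# target resembling city MPG that falls as displacement grows.
rng = np.random.default_rng(0)
displ = rng.uniform(1.0, 6.0, size=200)
cylinders = rng.integers(3, 9, size=200)
X = np.column_stack([displ, cylinders])
y = 40.0 - 4.0 * displ + rng.normal(0.0, 1.0, size=200)

models = {
    "Decision tree": DecisionTreeRegressor(random_state=0),
    "Random forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

# Ten folds: each model is fit on 9/10 of the rows and scored on the
# held-out 1/10, rotating through all ten parts.
folds = KFold(n_splits=10, shuffle=True, random_state=0)

cv_scores = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=folds, scoring="r2")
    cv_scores[name] = scores
    print(f"{name}: mean R^2 = {scores.mean():.3f} (std {scores.std():.3f})")
```

Reusing the same `KFold` object for both models means they are compared on identical splits, which keeps the comparison fair.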
I hope you will find this interesting and helpful.