Skip to content

29akeshvimt/Web-Scrapping-Project

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 

Repository files navigation

Car Data Analysis(Web Scrapping Project)

This project involves analyzing car data obtained from web scraping. The dataset includes information on car models, their prices, transmission types, and kilometers driven. The analysis includes handling missing data, detecting and removing outliers, and visualizing various aspects of the data.

Setup and Dependencies

To run the analysis, you will need the following Python libraries:

pandas numpy matplotlib seaborn

Data Analysis Steps

1. Load Data

The dataset is loaded from a CSV file named "Final File.csv" into a pandas DataFrame.

2. Handling Missing Data

The code checks for missing values in the DataFrame to ensure data completeness.

3. Column-wise Maximum Values

The maximum values for each column are calculated to understand the range of data.

4. Extract Specific Columns

Relevant columns are extracted from the DataFrame for focused analysis.

5. Detecting and Removing Outliers

Outliers in the 'Kilometers' column are detected using Z-scores. A Z-score threshold of 3 is used to identify outliers, which are then removed from the DataFrame.

6. Visualization

Various visualizations are created to gain insights from the data:

Bar Plot of Car Prices: A bar plot is created to show the prices of different car models. The x-axis represents the car models, and the y-axis represents their prices in lakhs.

Stacked Bar Chart of Transmission Distribution: The distribution of transmission types (manual or automatic) across different car models is visualized using a stacked bar chart. This helps in understanding the prevalence of each transmission type for different car models.

Scatter Plot of Car Prices: A scatter plot is created to show the relationship between car models and their prices. This plot helps in identifying any patterns or trends in pricing across different models.

7. Calculating Averages

The average price and average kilometers driven for the cars in the dataset are calculated and printed. This provides a summary statistic that gives an overall view of the data.

Conclusion

This project demonstrates the process of cleaning and analyzing car data obtained from web scraping. By handling missing data, removing outliers, and visualizing the data, valuable insights can be gained. This analysis can be extended further to include more sophisticated models and predictions based on the car data.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Jupyter Notebook 100.0%