This project analyzes and visualizes synthetic weather data generated for ten locations: New York, Los Angeles, Chicago, Houston, Phoenix, Philadelphia, San Antonio, San Diego, Dallas, and San Jose. The data covers temperature, humidity, precipitation, and wind speed, with 1,000,000 (1 million) data points generated for each parameter.
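As a rough illustration of how such a dataset could be produced, here is a minimal sketch using NumPy and pandas. The function name `make_synthetic_weather`, the column names, the units, the hourly timestamps, and every distribution (a sine-shaped seasonal cycle for temperature, uniform humidity, exponential precipitation, Weibull wind speed) are assumptions made for this example and are not taken from the project's actual generator.

```python
import numpy as np
import pandas as pd

CITIES = [
    "New York", "Los Angeles", "Chicago", "Houston", "Phoenix",
    "Philadelphia", "San Antonio", "San Diego", "Dallas", "San Jose",
]


def make_synthetic_weather(n_rows=10_000, seed=0):
    """Return a DataFrame of synthetic weather rows (illustrative distributions only)."""
    rng = np.random.default_rng(seed)

    # Hourly timestamps from an arbitrary start date (assumption).
    timestamp = pd.date_range(
        "2020-01-01", periods=n_rows, freq=pd.Timedelta(hours=1)
    )

    # Sine-shaped seasonal cycle that peaks around late June (assumption).
    day_of_year = timestamp.dayofyear.to_numpy()
    seasonal = 10.0 * np.sin(2 * np.pi * (day_of_year - 80) / 365.0)

    return pd.DataFrame({
        "timestamp": timestamp,
        "location": rng.choice(CITIES, size=n_rows),
        "temperature_c": 15.0 + seasonal + rng.normal(0.0, 3.0, size=n_rows),
        "humidity_pct": rng.uniform(20.0, 100.0, size=n_rows),
        "precipitation_mm": rng.exponential(2.0, size=n_rows),
        "wind_speed_kmh": 15.0 * rng.weibull(2.0, size=n_rows),
    })


weather = make_synthetic_weather()
print(weather.head())
```

Passing `n_rows=1_000_000` would give one million values per parameter, matching the size described above; the smaller default only keeps the example quick to run.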
The project uses machine learning models to predict temperature based on other weather features. The modeling process involves:
- Splitting data into training and test sets.
- Training models (e.g., Linear Regression, Random Forest).
- Evaluating model performance using metrics such as MAE, MSE, and RMSE.
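The three steps above can be sketched with scikit-learn roughly as follows. This is an illustrative outline, not the project's actual code: the helper name `evaluate_models`, the 80/20 split, the forest size, and the toy feature matrix (three stand-in columns for humidity, precipitation, and wind speed) are all assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split


def evaluate_models(X, y, seed=0):
    """Split the data, fit both models, and return MAE/MSE/RMSE per model."""
    # Hold out 20% of the rows for testing (the split ratio is an assumption).
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    models = {
        "linear_regression": LinearRegression(),
        "random_forest": RandomForestRegressor(n_estimators=50, random_state=seed),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        predictions = model.predict(X_test)
        mse = mean_squared_error(y_test, predictions)
        results[name] = {
            "MAE": float(mean_absolute_error(y_test, predictions)),
            "MSE": float(mse),
            "RMSE": float(np.sqrt(mse)),  # RMSE is the square root of MSE
        }
    return results


# Toy data: three stand-in features and a noisy linear temperature target.
rng = np.random.default_rng(0)
X = rng.uniform(size=(500, 3))
y = 20.0 - 5.0 * X[:, 0] + rng.normal(scale=1.0, size=500)

results = evaluate_models(X, y)
for name, metrics in results.items():
    print(name, metrics)
```

All three metrics are computed on the held-out test rows only, so they estimate how the models behave on data they were not fitted to. MAE and RMSE are in the same units as temperature, which makes them easier to interpret than MSE.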
Visualizations are created to explore and understand the data better. Plots include:
- Temperature trends over time.
- Comparison of weather parameters across different locations.
- Correlation matrix of weather parameters.
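As one example of these plots, a correlation matrix can be computed with pandas and drawn as a heatmap with Matplotlib. The sketch below is only an assumption about how this might be done: the function name `plot_correlation_matrix`, the column names, the color map, and the toy data (in which temperature is made to fall as humidity rises) are all hypothetical.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_correlation_matrix(corr, ax=None):
    """Draw a correlation matrix (a square DataFrame) as a labeled heatmap."""
    if ax is None:
        _, ax = plt.subplots(figsize=(5, 4))
    # Fix the color scale to [-1, 1] so colors are comparable between plots.
    image = ax.imshow(corr.to_numpy(), vmin=-1, vmax=1, cmap="coolwarm")
    ax.set_xticks(range(len(corr.columns)))
    ax.set_xticklabels(corr.columns, rotation=45, ha="right")
    ax.set_yticks(range(len(corr.index)))
    ax.set_yticklabels(corr.index)
    ax.set_title("Correlation Matrix of Weather Parameters")
    ax.figure.colorbar(image, ax=ax)
    return ax


# Toy data standing in for the real dataset (hypothetical columns and values).
rng = np.random.default_rng(0)
humidity = rng.uniform(20.0, 100.0, size=1_000)
toy = pd.DataFrame({
    "temperature": 30.0 - 0.1 * humidity + rng.normal(0.0, 2.0, size=1_000),
    "humidity": humidity,
    "precipitation": rng.exponential(2.0, size=1_000),
    "wind_speed": 15.0 * rng.weibull(2.0, size=1_000),
})

# Pairwise Pearson correlation between the numeric columns.
corr = toy.corr()
print(corr.round(2))
```

Calling `plot_correlation_matrix(corr)` followed by `plt.show()` (or `plt.savefig(...)`) renders the heatmap. `DataFrame.corr()` uses Pearson correlation by default, which only captures linear relationships.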
The project provides insights into temperature trends and patterns across the ten locations. The predictive models forecast temperature with reasonable accuracy, as judged by MAE, MSE, and RMSE on the test set. The visualizations help in interpreting both the data and the models' performance.
Key findings include:
- Seasonal temperature variations.
- Relationships between temperature and other weather parameters.
- Distribution of wind speed.