- Title: Philippine Major Cities Weather Data ☀️
- Source: Kaggle
- Checking for Null Values:
  - Ensure there are no missing values in the dataset.
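A minimal sketch of the null check, assuming the data is loaded into a pandas DataFrame. The toy frame and its values are invented for illustration and only stand in for the Kaggle data.

```python
import pandas as pd

# Toy frame standing in for the real dataset; the values are made up.
df = pd.DataFrame({
    "main.temp": [30.1, None, 29.4],
    "main.humidity": [70, 82, 75],
})

# Count missing values per column.
null_counts = df.isnull().sum()
print(null_counts)

# True only when no cell in the frame is missing.
has_no_nulls = not df.isnull().values.any()
print("No missing values:", has_no_nulls)
```

If any count is non-zero, the affected rows would need to be dropped or imputed before modelling; which of the two is appropriate depends on the data.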
- Getting Data from CAMANAVA:
  - Extract data specifically for the cities of Caloocan, Malabon, Navotas, and Valenzuela.
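One way to do this filtering is with `isin`. The column name `city_name` and the exact city spellings are assumptions, since neither is stated here; check them against the actual dataset.

```python
import pandas as pd

# Toy rows; "city_name" is an assumed column name and the values are made up.
df = pd.DataFrame({
    "city_name": ["Caloocan", "Manila", "Navotas", "Valenzuela", "Cebu City", "Malabon"],
    "main.temp": [30.2, 31.0, 29.8, 30.5, 29.1, 30.0],
})

# The four CAMANAVA cities named above.
CAMANAVA = ["Caloocan", "Malabon", "Navotas", "Valenzuela"]

# Keep only CAMANAVA rows; .copy() avoids chained-assignment warnings later.
camanava_df = df[df["city_name"].isin(CAMANAVA)].copy()
print(camanava_df)
```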
- Setting Datetime as Index:
  - Convert the `datetime` column to the index of the DataFrame.
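A short sketch of the index step, assuming `datetime` is stored as strings. The timestamps are invented, and sorting the index is an extra step added here so that later time-based operations see rows in chronological order.

```python
import pandas as pd

# Toy data with out-of-order timestamps; values are made up.
df = pd.DataFrame({
    "datetime": ["2023-01-01 01:00:00", "2023-01-01 00:00:00", "2023-01-01 02:00:00"],
    "main.temp": [28.5, 28.9, 28.1],
})

# Parse the strings into real timestamps, then use them as the index.
df["datetime"] = pd.to_datetime(df["datetime"])
df = df.set_index("datetime").sort_index()
print(df)
```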
- Encoding Categorical Data:
  - Use one-hot encoding for the `weather.id` column to handle categorical data.
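The encoding could be done with `pandas.get_dummies`, as sketched below. This is one option, not necessarily the one used in the project. The weather codes and temperatures are illustrative values.

```python
import pandas as pd

# Toy data; weather.id holds numeric condition codes that are really categories.
df = pd.DataFrame({
    "weather.id": [800, 500, 803, 500],
    "main.temp": [31.2, 27.5, 29.9, 27.1],
})

# Replace weather.id with one 0/1 indicator column per distinct code.
encoded = pd.get_dummies(df, columns=["weather.id"], dtype=int)
print(encoded)
```

Each distinct code becomes its own column (for example `weather.id_500`), so the model does not treat the codes as ordered numbers.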
- Dependent Variable:
  - Temperature: `main.temp`
- Independent Variables:
  - Atmospheric Pressure: `main.pressure` (hPa, at sea level)
  - Humidity: `main.humidity` (%)
  - Cloudiness: `clouds.all` (%)
  - Weather Condition: `weather.id`
  - Wind Speed: `wind.speed`
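The variable selection above can be sketched as follows. The column names come from the list, while the data values and the helper names (`numeric_features`, `weather_cols`) are made up for illustration.

```python
import pandas as pd

# Toy frame with the columns listed above; all values are invented.
df = pd.DataFrame({
    "main.temp": [30.1, 27.4, 29.8, 27.0, 31.0],
    "main.pressure": [1010, 1008, 1011, 1007, 1012],
    "main.humidity": [70, 88, 74, 90, 65],
    "clouds.all": [20, 90, 40, 100, 10],
    "weather.id": [800, 500, 803, 500, 800],
    "wind.speed": [2.1, 4.3, 3.0, 5.2, 1.8],
})

target = "main.temp"
numeric_features = ["main.pressure", "main.humidity", "clouds.all", "wind.speed"]

# One-hot encode the categorical weather code, then collect its indicator columns.
encoded = pd.get_dummies(df, columns=["weather.id"], dtype=int)
weather_cols = [c for c in encoded.columns if c.startswith("weather.id_")]

X = encoded[numeric_features + weather_cols]  # independent variables
y = encoded[target]                           # dependent variable
print(X.columns.tolist())
```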
- Split the dataset into training and testing sets with an 80/20 ratio, preserving the temporal order of the observations.
- Initialize and train the Linear Regression model using the training dataset.
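A minimal sketch of the split and training steps, assuming the rows are already sorted by time. The data is synthetic (generated with a fixed seed), and slicing at the 80% mark is one simple way to keep the test set strictly after the training set.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic, time-indexed data standing in for the real features and target.
rng = np.random.default_rng(0)
index = pd.date_range("2023-01-01", periods=100, freq="D")
X = pd.DataFrame({
    "main.pressure": rng.normal(1010, 3, 100),
    "main.humidity": rng.uniform(60, 95, 100),
}, index=index)
y = 40 - 0.12 * X["main.humidity"] + rng.normal(0, 0.3, 100)
y.name = "main.temp"

# Chronological 80/20 split: no shuffling, earliest 80% of rows train the model.
split = int(len(X) * 0.8)
X_train, X_test = X.iloc[:split], X.iloc[split:]
y_train, y_test = y.iloc[:split], y.iloc[split:]

# Fit ordinary least squares on the training portion only.
model = LinearRegression()
model.fit(X_train, y_train)

# Predict on the held-out, later portion.
y_pred = model.predict(X_test)
print(len(X_train), len(X_test))
```

scikit-learn's `train_test_split` with `shuffle=False` should give the same chronological split; the slicing above just makes the ordering explicit.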
- Testing on the Test Set:
  - Evaluate the model performance using the test dataset.
- Evaluation Metrics:
  - Mean Squared Error (MSE):
    - Measures the average squared difference between the predicted values and the actual (observed) values.
  - Mean Absolute Error (MAE):
    - Measures the average absolute difference between predicted and actual values, providing a straightforward measure of prediction accuracy.
  - Root Mean Squared Error (RMSE):
    - Measures the square root of the average squared difference between predicted and actual values, which expresses the error in the same units as the target (here, temperature).
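The three metrics can be computed directly from their definitions, as in this small sketch with NumPy. The actual and predicted temperatures are invented numbers chosen so the arithmetic is easy to follow.

```python
import numpy as np

# Made-up actual and predicted temperatures.
y_true = np.array([30.0, 31.0, 29.0, 28.0])
y_pred = np.array([29.0, 31.0, 30.0, 30.0])

errors = y_pred - y_true          # [-1, 0, 1, 2]

mse = np.mean(errors ** 2)        # (1 + 0 + 1 + 4) / 4 = 1.5
mae = np.mean(np.abs(errors))     # (1 + 0 + 1 + 2) / 4 = 1.0
rmse = np.sqrt(mse)               # square root of the MSE

print(f"MSE={mse:.3f}  MAE={mae:.3f}  RMSE={rmse:.3f}")
```

Because MSE squares each error, the single 2-degree miss contributes more to it than the two 1-degree misses combined, which is why RMSE here is larger than MAE.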