tobi-soboyejo/eda1-project_indian_energy

Exploratory Data Analysis Project 1 (Indian Energy Production)

Introduction

"India is the world's third-largest producer and third-largest consumer of electricity. The national electric grid in India has an installed capacity of 370.106 GW as of 31 March 2020. Renewable power plants, which also include large hydroelectric plants, constitute 35.86% of India's total installed capacity. India has a surplus power generation capacity but lacks adequate distribution infrastructure. India's electricity sector is dominated by fossil fuels, in particular coal, which during the 2018-19 fiscal year produced about three-quarters of the country's electricity. The government is making efforts to increase investment in renewable energy. The government's National Electricity Plan of 2018 states that the country does not need more non-renewable power plants in the utility sector until 2027, with the commissioning of 50,025 MW coal-based power plants under construction and addition of 275,000 MW total renewable power capacity after the retirement of nearly 48,000 MW old coal-fired plants."

Data Sourcing/Credits

The dataset used for this project can be found on Kaggle under the title "Daily Power Generation in India (2017 - 2020)". It was created by the users "Navin" and "Twinkle Khanna".

Objectives:

  • Perform exploratory data analysis on the data set to find key insights.
  • Clean and restructure the data so that the data set is in good condition to work with.
  • Create appealing graphs and images using Python that highlight findings and point out trends and patterns.

Data Description

  • Date: date in YYYY-MM-DD format.
  • Thermal Generation Actual (in MU): amount of actual thermal energy generated, measured in MU (million units; 1 MU = 1 gigawatt-hour).
  • Thermal Generation Estimated (in MU): amount of expected/estimated thermal energy generated, measured in MU.
  • Nuclear Generation Actual (in MU): amount of actual nuclear energy generated, measured in MU.
  • Nuclear Generation Estimated (in MU): amount of expected/estimated nuclear energy generated, measured in MU.
  • Hydro Generation Actual (in MU): amount of actual hydro-electrical energy generated, measured in MU.
  • Hydro Generation Estimated (in MU): amount of expected/estimated hydro-electrical energy generated, measured in MU.
  • Region: region of India (Northern, Western, Southern, Eastern, or NorthEastern).
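
As a rough illustration of the cleaning step, the sketch below loads a few rows shaped like the columns described above and converts them to usable types. The two sample rows are made up, and the idea that some numeric columns arrive as text with thousands separators is an assumption, not something stated in this README.

```python
import io
import pandas as pd

# Two made-up rows standing in for the Kaggle CSV (illustrative values only).
sample_csv = io.StringIO(
    'Date,Region,Thermal Generation Actual (in MU),Hydro Generation Actual (in MU)\n'
    '2017-09-01,Northern,"1,024.50",273.27\n'
    '2017-09-01,Western,"1,106.89",72.00\n'
)

# Parse the Date column into real datetimes while reading.
df = pd.read_csv(sample_csv, parse_dates=["Date"])

# Assumption: some MU columns are stored as text such as "1,024.50".
# Strip the commas and convert; anything unparseable becomes NaN.
mu_columns = [c for c in df.columns if c.endswith("(in MU)")]
for col in mu_columns:
    df[col] = pd.to_numeric(
        df[col].astype(str).str.replace(",", "", regex=False), errors="coerce"
    )

print(df.dtypes)
```

With the real file, `sample_csv` would be replaced by the path to the downloaded CSV.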

Exploratory Data Analysis

  • This plot shows energy production over time for each energy type after the data has been resampled from daily to monthly frequency, which makes the plot easier to read than the original.
    • We can clearly see that 'Thermal' energy is the most widely produced, with 'Hydro' coming in second. This aligns with the historical information given with the data set. We also notice that 'Nuclear' production is roughly 20x lower than 'Thermal'.

(Plot 2) Time Series Resampled Data
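
The daily-to-monthly aggregation described above could look something like the following minimal sketch. It uses synthetic data and the hypothetical column names `Thermal` and `Hydro`, and it sums the daily values per month; whether the project summed or averaged is not stated here, so the choice of `sum` is an assumption.

```python
import pandas as pd

# Synthetic daily data: 1 MU of thermal and 0.5 MU of hydro per day (made-up values).
dates = pd.date_range("2018-01-01", "2018-02-28", freq="D")
daily = pd.DataFrame({"Date": dates, "Thermal": 1.0, "Hydro": 0.5})

# Group the daily rows by calendar month and add them up into monthly totals.
monthly = daily.groupby(daily["Date"].dt.to_period("M"))[["Thermal", "Hydro"]].sum()

print(monthly)
```

Calling `monthly.plot()` (which needs matplotlib) would then draw one line per energy type.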

  • This plot visualizes the relationship between energy production and region.
    • We notice that the 'Western' region is by far the largest producer of 'Thermal' energy, whereas the 'Northern' region is the highest producer of 'Hydro' energy. We also notice that the 'Southern' region leads in 'Nuclear' energy production and that the 'Eastern' and 'NorthEastern' regions do not produce any 'Nuclear' energy at all.

(Plot 3) Power Generation by Region
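
One way to arrive at a per-region comparison like this is to total each energy type by region and look up the largest producer. The sketch below uses made-up numbers and hypothetical short column names purely to show the calculation; it does not reproduce the project's figures.

```python
import pandas as pd

# Made-up values in MU, chosen only to illustrate the calculation.
df = pd.DataFrame({
    "Region": ["Northern", "Western", "Southern", "Northern", "Western", "Southern"],
    "Thermal": [600.0, 1100.0, 500.0, 620.0, 1150.0, 480.0],
    "Hydro": [270.0, 70.0, 90.0, 280.0, 60.0, 95.0],
})

# Total production per region for each energy type.
by_region = df.groupby("Region")[["Thermal", "Hydro"]].sum()

# For each energy type, the region with the largest total.
leaders = by_region.idxmax()

print(by_region)
print(leaders)
```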

  • The plot below visualizes actual energy production from September 2017 to August 2020.

(Plot 4) Total Energy Production Over Time

  • The following plot visualizes actual 'Thermal' energy production by region from September 2017 to August 2020.

(Plot 5) Thermal Production Over Time

  • The plot below visualizes actual 'Nuclear' energy production by region from September 2017 to August 2020.

(Plot 6) Nuclear Production Over Time

  • This plot visualizes actual 'Hydro' energy production by region from September 2017 to August 2020.
    • In this plot we notice a clear seasonal pattern in energy production, most likely due to weather conditions that affect water availability (rainy seasons, droughts, etc.).

(Plot 7) Hydro Production Over Time
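
A simple way to check for a seasonal pattern like the one noted for hydro is to average production by calendar month across all years. The sketch below does this on synthetic data with an artificial July to September peak; the numbers and the `Hydro` column name are assumptions for illustration only.

```python
import pandas as pd

dates = pd.date_range("2018-01-01", "2019-12-31", freq="D")

# Made-up pattern: higher output in July-September than in the rest of the year.
hydro = [300.0 if d.month in (7, 8, 9) else 100.0 for d in dates]
daily = pd.DataFrame({"Date": dates, "Hydro": hydro})

# Average daily production for each calendar month (1 = January ... 12 = December).
seasonal = daily.groupby(daily["Date"].dt.month)["Hydro"].mean()

# First calendar month in which the average reaches its maximum.
peak_month = seasonal.idxmax()

print(seasonal)
print("Peak month:", peak_month)
```

If the monthly averages differ strongly and the same months peak every year, that supports a seasonal explanation.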

  • From the second data set, which contains 'State_Region' data, we created a pie chart that visualizes each region's share of national energy production.
    • This chart confirms the findings from the plots above.

(Plot 8) Power Generation by Region
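
The shares behind a pie chart like this are each region's total divided by the national total. The sketch below shows that arithmetic on made-up regional totals; the real values come from the second data set and are not reproduced here.

```python
import pandas as pd

# Made-up regional totals (illustrative only, not the project's data).
totals = pd.Series({
    "Northern": 300.0,
    "Western": 400.0,
    "Southern": 200.0,
    "Eastern": 75.0,
    "NorthEastern": 25.0,
})

# Each region's share of the national total, as a percentage.
shares = (totals / totals.sum() * 100).round(1)

print(shares)
```

With matplotlib installed, `shares.plot.pie()` would draw the corresponding chart.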

  • Also from the second data set, we created a pie chart that visualizes energy production by state.

(Plot 9) Energy Production by State