This repository contains a Jupyter Notebook for performing exploratory data analysis (EDA) on the Emergy Metabolism dataset, focusing on 281 cities in China from 2000 to 2020. The analysis provides insights into energy, material, and population flows within these cities, with visualizations to help understand trends and correlations.
- Python 3.8 or higher
- Google Colab (Cloud environment setup)
- Internet connection for installing dependencies
To run the analysis locally, follow these steps:
- Install Python 3.8 or higher. You can download it from python.org.
- Install Jupyter Notebook via `pip`: `pip install notebook`
- Install the other required libraries: `pip install pandas numpy matplotlib seaborn openpyxl`
- Clone this repository to your local machine: `git clone https://github.com/your_username/your_repository.git`
- Navigate to the repository: `cd your_repository`
- Start the Jupyter Notebook server: `jupyter notebook`
- The Jupyter interface will open in your web browser. Navigate to the `code/EDA_on_Emergy_Metabolism_Corrected.ipynb` notebook and run the cells.
- If necessary, adjust the path to the dataset (`data/Emergy flows of 281 China's cities 2000-2020.xlsx`) in the notebook so that it matches the relative path from the `code` folder: `data_file_path = "../data/Emergy flows of 281 China's cities 2000-2020.xlsx"`
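The dataset path used for a local run can be built and checked before loading. The following is a minimal sketch, not the notebook's actual code: the helper names `build_dataset_path` and `load_dataset` are hypothetical, and it assumes the notebook runs with the `code` folder as its working directory.

```python
from pathlib import Path

DATASET_NAME = "Emergy flows of 281 China's cities 2000-2020.xlsx"


def build_dataset_path(code_dir="."):
    """Return the dataset path relative to the code folder (../data/<file>)."""
    return Path(code_dir) / ".." / "data" / DATASET_NAME


def load_dataset(path):
    """Load the workbook with pandas, failing early if the file is missing."""
    path = Path(path)
    if not path.is_file():
        # Report the absolute path so a wrong working directory is easy to spot.
        raise FileNotFoundError(f"Dataset not found at {path.resolve()}")
    import pandas as pd  # imported lazily; reading .xlsx also requires openpyxl
    return pd.read_excel(path)
```

In a notebook cell this could be used as `df = load_dataset(build_dataset_path())`. Checking for the file first gives a clearer error than the one raised from deep inside the Excel reader.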
- Navigate to the repository on GitHub.
- Click the Code button and select Open with Codespaces. If you haven't set up a codespace yet, you will need to create one.
- Once the Codespace is ready, it will open in a cloud-based IDE where you can directly run the notebook.
- Navigate to Google Colab.
- Select GitHub from the Open menu and enter the URL of this repository.
- Select the notebook (`EDA_on_Emergy_Metabolism_Corrected.ipynb`).
- In Colab, ensure the dataset is properly linked:
  - Either upload the dataset manually or link the `data` folder from your GitHub repository.
  - Make sure the code that loads the dataset uses the correct relative path: `data_file_path = "../data/Emergy flows of 281 China's cities 2000-2020.xlsx"`
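Because the file location differs between a local clone and Colab, the path can be chosen at runtime. This is a hedged sketch under two assumptions: that checking for `google.colab` in `sys.modules` is an adequate way to detect Colab (a common idiom, not something this repository prescribes), and that a manually uploaded file sits in the current working directory. The function names are hypothetical.

```python
import sys
from pathlib import Path

DATASET_NAME = "Emergy flows of 281 China's cities 2000-2020.xlsx"


def running_in_colab():
    """Return True when the google.colab package is loaded (assumed Colab marker)."""
    return "google.colab" in sys.modules


def dataset_path():
    """Pick the dataset location for the current environment."""
    if running_in_colab():
        # Assumption: the file was uploaded into the working directory.
        return Path.cwd() / DATASET_NAME
    # Local layout: the notebook lives in code/, the data in ../data/.
    return Path("..") / "data" / DATASET_NAME
```

The notebook could then set `data_file_path = dataset_path()` instead of hard-coding one location.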
The dataset used in this analysis is the Emergy Metabolism dataset, which provides data on the energy, material, and population flows for 281 cities in China from 2000 to 2020. The dataset is located in the `data` folder of the repository.
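As a quick sanity check after loading, the stated coverage implies a fixed number of city-year observations. The sketch below only does the arithmetic; whether the workbook actually stores one row per city and year (a balanced, long-format panel) is an assumption that should be verified against the file.

```python
FIRST_YEAR, LAST_YEAR = 2000, 2020
N_CITIES = 281

# Both endpoint years are included in the range.
n_years = LAST_YEAR - FIRST_YEAR + 1

# Expected city-year observations if every city appears in every year.
expected_observations = N_CITIES * n_years

print(n_years)                # 21
print(expected_observations)  # 5901
```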
your_repository/
├── code/
│ └── EDA_on_Emergy_Metabolism_Corrected.ipynb # Jupyter Notebook for EDA
├── data/
│ └── Emergy flows of 281 China's cities 2000-2020.xlsx # Dataset file
└── README.md # This file
- File Path Errors: If the notebook cannot find the dataset, ensure the relative file path is correct. If you're using a cloud environment, make sure the dataset is either uploaded to the environment or properly linked to the repository.
- Missing Dependencies: If you encounter any missing libraries or modules, you can install them using `pip`, as described in the Local Environment Setup section.
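Both problems above can be diagnosed from a notebook cell before running the analysis. The following is an illustrative sketch rather than part of the repository: `missing_packages` and `find_dataset` are hypothetical helpers, and the search assumes the repository layout with a `data` folder at its root.

```python
import importlib.util
from pathlib import Path

REQUIRED = ("pandas", "numpy", "matplotlib", "seaborn", "openpyxl")
DATASET_NAME = "Emergy flows of 281 China's cities 2000-2020.xlsx"


def missing_packages(names=REQUIRED):
    """Return the packages from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def find_dataset(start="."):
    """Look for data/<DATASET_NAME> in `start` and each of its parent folders."""
    start = Path(start).resolve()
    for folder in [start, *start.parents]:
        candidate = folder / "data" / DATASET_NAME
        if candidate.is_file():
            return candidate
    return None  # not found anywhere above the starting folder
```

If `missing_packages()` returns a non-empty list, install those names with `pip`. If `find_dataset()` returns `None`, the dataset is not reachable from the current working directory and needs to be uploaded or the path corrected.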
This repository provides a clear methodology for performing exploratory data analysis (EDA) on the Emergy Metabolism dataset. By following the setup instructions for your chosen environment, you should be able to run the analysis and explore the insights from the data.
For further questions or issues, please contact the repository maintainers or open an issue on GitHub.