Green Wind is an example of a ETL tool that can extract the information from semi-structed data (JSON) to store it in a Postgres Database. We are then able to connect to the PostgresDB instance with a Jupyter Notebook to explore the data in the notebooks/battery_data_analysis.ipynb
notebook.
Green Wind can run on a Cloud Provider such as AWS, Azure, or GCP and can scale to handle large amounts of data. The endpoints of the application can be extended to display the data in a web application or mobile application. It can also be extend to handle streaming data from a MQTT broker, or Kafka. The application can also be extended to handle unstructed data such as images, pdfs, video, or audio.
Note
The Digital Twin application is not fully implemented, it is a work in progress 👷♂️
This application also looks into creating a Digital Twin of a battery to identify the health of the battery and predict the remaining useful life of the battery. This compares Gaussian Process Regression, Bayesian Neural Networks and Monte Carlo Methods to identify the best model to predict the remaining useful life of the battery. This
is done in the notebooks/battery_data_analysis.ipynb
notebook. The data is extracted from the csv files in data/raw/*
. The source of the Data and Data Science code is from here:
https://github.com/acthelen/battery_digital_twin and the raw files can be found in the digital_twin/
folder.
In a real world application we could stream this data from a MQTT broker into a MySQL instance similar to Zendesk's Solution. We could use a tool like Apache Airflow or Prefect" to schedule a job to extract with the data from the MySQL instance and use Apache Spark to process the data and store it store it in a Data Lake such as AWS S3. Then we could use a tool like AWS Glue to transform the data and store it in a Data Warehouse such as AWS Redshift or Apache Iceberg for large scale analytics.
Here is an example of some unstructered data that we will be working with, in this example you can post this data to the API endpoint localhost:8000/battery_data
{
"battery_id": "B123456789",
"manufacturer": "GreenPowerTech",
"model": "GPT-5000",
"capacity": {
"current_capacity": "4000 kWh",
"design_capacity": "5000 kWh",
"remaining_capacity": "3500 kWh"
},
"voltage": {
"terminal_voltage": "480 V",
"operating_voltage": "450-500 V",
"charging_voltage": "460 V"
},
"temperature": {
"internal_temperature": "35°C",
"external_temperature": "30°C",
"operating_temperature_range": "20-40°C"
},
"current": {
"charging_current": "100 A",
"discharging_current": "95 A",
"operating_current_range": "80-110 A"
},
"status": {
"state_of_charge": "70%",
"health_status": "Good",
"operational_status": "Active",
"faults": [
{
"fault_id": "F123",
"fault_description": "Overtemperature warning",
"fault_timestamp": "2023-12-21T09:00:00Z"
},
{
"fault_id": "F124",
"fault_description": "Low voltage alert",
"fault_timestamp": "2023-12-21T09:00:00Z"
}
]
},
"energy_throughput": {
"total_energy_delivered": "2000 kWh",
"total_energy_charged": "1800 kWh"
},
"maintenance": {
"last_maintenance_date": "2023-12-20",
"next_maintenance_due": "2024-06-20",
"maintenance_actions": [
{
"action_id": "M123",
"action_description": "Replace cooling fan",
"action_status": "Completed"
},
{
"action_id": "M124",
"action_description": "Inspect insulation",
"action_status": "Pending"
}
]
},
"environmental_conditions": {
"ambient_temperature": "28°C",
"humidity": "60%",
"altitude": "500 meters",
"pressure": "1 atm"
},
"operational_parameters": {
"charge_cycles": "150",
"discharge_cycles": "140",
"max_continuous_discharge_rate": "120 A",
"max_continuous_charge_rate": "110 A"
},
"timestamp": "2023-12-21T10:00:00Z"
}
This application will take in this data and store it in a PostgresDB instance in the following schema:
To get started, create a codespace for this repository by clicking this 👇
A selection menu will open allowing you to create a Codespace. After create a Codespace it will open in a web-based version of Visual Studio Code. The dev container is fully configured with software needed for this project along with added development vscode extensions such as Jupyter Notebook, Python, and Docker.
Once the Codespace has been created, open a terminal and run the following command to uv installing the packages and running the FastAPI application:
uv run fastapi dev src/green_wind/main.py
This repository can be used locally on a system running Visual Studio Code and Docker, or in a remote cloud based Codespaces environment as shown in Option 1.
-
Ensure you have Docker installed on your local machine, if not follow the instructions here: https://docs.docker.com/get-docker/
-
Clone the repository to your local machine
git clone https://github.com/Butch78/green-wind.git
-
After you open the root project in Vscode, the following pop-up should appear. Click "Reopen in Container"
If not click the button in the bottom left corner and then select "Reopen in Container" or type into the command prompt at the top an enter the following command >Reopen in container
This will build the Docker container, run the PostgresDB and should install packages using the poetry package manager. if not run the following command in the terminal:
Once the Devcontainer has been created, open a terminal and run the following command to uv installing the packages and running the FastAPI application:
uv run fastapi dev src/green_wind/main.py
Once the application has started The csv files from /data/raw
will automatically be loaded into the PostgresDB instance.
It will also create some fake Battery data. We can then connect to the Jupyter Notebook: notebooks/battery_data_analysis.ipynb
to explore the data with a read only connection to the PostgresDB instance.