Merged
4 changes: 4 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -3,6 +3,10 @@
All notable changes to the [mlfmu] project will be documented in this file.<br>
The changelog format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).

## [unreleased]

* Added Python code examples, and instructions in the README, showing how to create the .onnx models for wind_generator and wind_to_power.

## [1.0.3]

### Changed
86 changes: 86 additions & 0 deletions examples/README.md
@@ -14,3 +14,89 @@ For further documentation of the `mlfmu` tool, see [README.md](../README.md) or
<!-- Markdown link & img dfn's -->
[FMU_check]: https://fmu-check.herokuapp.com/
[docs]: https://dnv-opensource.github.io/mlfmu/

## Creating the wind_generator and wind_to_power ML models, before converting to FMU

We have included the Python code that was used to create the ONNX models, which are then converted to FMUs. You do not need to run it yourself, since the ONNX models are included in the `config` folder of each example. However, it can be insightful, in particular if you want to better understand how we implemented the ML model wrapper so that it works well for conversion to FMU.

If you want to train the models yourself and test everything from the bottom up, these are the steps we took.
The steps below assume that you work on Windows and use the software versions specified in the requirements.txt file. We have not tested them on other systems, as this is only a simple example.

These models use the publicly available ["Wind Turbine Scada Dataset"](https://www.kaggle.com/datasets/berkerisen/wind-turbine-scada-dataset/data) from Kaggle, which you should download into the `examples\data` folder.

### Install: Conda with pip (Windows native example)

Note that native Windows only supports GPU acceleration up to TensorFlow 2.10. Newer versions require WSL2 for GPU support.
These instructions use TF 2.10 on native Windows with pip.

1. Install conda, e.g. miniconda: <https://docs.anaconda.com/miniconda/>

2. Create a conda (or other virtual) environment, with some required packages:

* From the `examples` directory:

```sh
# We need an older version of Python (for TensorFlow 2.10):
conda create -n mlfmu-examples python=3.10
```

* Install TensorFlow with GPU support:

```sh
conda activate mlfmu-examples
conda install -c conda-forge cudatoolkit=11.2 cudnn=8.1.0
# note: you need TensorFlow 2.10 and NumPy < 2 (TensorFlow 2.10 cannot handle NumPy 2.0)
python -m pip install "tensorflow==2.10.0" "numpy==1.23.5"
# test your install
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
```

* This should print a list of GPU devices, e.g.

```sh
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
```

3. Now we want to install more packages needed for the windpower demo:

> Note: if you did not specify the python version when you created your conda environment (which we recommend doing!), you will need to run `conda install pip` before running the next command.

```sh
# run `conda activate mlfmu-examples`, if you did not already do so
pip install -r requirements.txt
```

You can expect to see an error about conflicting protobuf version requirements, which you can safely ignore for this folder.

4. Because of some conflicts with package versions and their requirements, you may need to upgrade tf2onnx now:

```sh
pip install -U tf2onnx
```
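The version constraints from the install steps (TensorFlow 2.10.x, NumPy below 2) can be checked programmatically. The following is a minimal sketch using only the standard library; `parse_version` and `check_versions` are hypothetical helper names that are not part of this repository.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn '2.10.0' into (2, 10, 0), stopping at non-numeric suffixes such as 'rc1'."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_versions(tf_version: str, np_version: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means the versions fit."""
    problems = []
    if parse_version(tf_version)[:2] != (2, 10):
        problems.append(f"expected TensorFlow 2.10.x, found {tf_version}")
    if parse_version(np_version)[:1] >= (2,):
        problems.append(f"expected NumPy < 2, found {np_version}")
    return problems


# Versions pinned in the install step above
print(check_versions("2.10.0", "1.23.5"))
```

Inside the activated environment, the installed versions could be passed in via `importlib.metadata.version("tensorflow")` and `importlib.metadata.version("numpy")`.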

### ML Model creation instructions

For these examples, we use the publicly available ["Wind Turbine Scada Dataset"](https://www.kaggle.com/datasets/berkerisen/wind-turbine-scada-dataset/data) from Kaggle. Download the dataset and store it as `T1.csv` in the `data` folder.

1. Make sure your virtual environment is active, e.g. by running ```conda activate mlfmu-examples```.

2. To generate the FMUs, first train the ML models and save them. Go into the directory of the specific model you are interested in (e.g. `wind_to_power\ml_model`) and run the training script:

```sh
python train_model.py
```

This should create a (new) folder called `trained_model` with:

* wind_generator: wind_generator_interpolated (keras model) and wind_generator_interpolated.h5 (model weights)
* wind_to_power: power (keras model) and power.h5 (model weights)

3. Now you can create onnx files for either the power predictor (power.onnx) or the wind generator (wind.onnx), from the respective folders:

```sh
python power_to_onnx.py
python generator_to_onnx.py
```

The resulting `.onnx` files will be stored in the `trained_model` folder.
These onnx files can then be used with the mlfmu tool to create FMUs to run in STC.
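Before handing the models to the mlfmu tool, it may help to confirm that the conversion scripts actually wrote their output. This is a minimal sketch; `missing_files` is a hypothetical helper, and the folder path in the comment is an assumption about where you ran the scripts.

```python
from pathlib import Path


def missing_files(folder: Path, expected_names: list[str]) -> list[str]:
    """Return the expected file names that are not present in `folder`."""
    return [name for name in expected_names if not (folder / name).is_file()]


# Example call (path is an assumption; adjust it to your working directory):
# missing_files(Path("wind_to_power/ml_model/trained_model"), ["power.onnx"])
```

An empty list means every expected file was found.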
5 changes: 5 additions & 0 deletions examples/data/README.md
@@ -0,0 +1,5 @@
# Dataset for wind_generator and wind_to_power examples

For this example, we use the publicly available ["Wind Turbine Scada Dataset"](https://www.kaggle.com/datasets/berkerisen/wind-turbine-scada-dataset/data) from Kaggle.

Go to [https://www.kaggle.com/datasets/berkerisen/wind-turbine-scada-dataset/data](https://www.kaggle.com/datasets/berkerisen/wind-turbine-scada-dataset/data) and download the dataset as `T1.csv` into this folder.
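To confirm that the download landed in the right place, you could read the header row of the CSV file before training. This is a minimal sketch using only the standard library; `read_csv_header` is a hypothetical helper, and the UTF-8 encoding is an assumption about the downloaded file.

```python
import csv
from pathlib import Path


def read_csv_header(csv_path: Path) -> list[str]:
    """Return the column names from the first row of a CSV file."""
    if not csv_path.is_file():
        raise FileNotFoundError(f"Dataset not found at {csv_path}; download it from Kaggle first.")
    # errors="replace" keeps the check usable even if the encoding assumption is wrong
    with csv_path.open(newline="", encoding="utf-8", errors="replace") as handle:
        return next(csv.reader(handle), [])


# Example call (relative path is an assumption about your working directory):
# print(read_csv_header(Path("data/T1.csv")))
```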
Binary file added examples/requirements.txt
Binary file not shown.
87 changes: 87 additions & 0 deletions examples/utils.py
@@ -0,0 +1,87 @@
import numpy as np
import scipy.interpolate
from numpy.lib.stride_tricks import as_strided


def normalize_data(data: np.ndarray) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
    """
    Normalize the data by subtracting the mean and dividing by the standard deviation.

    Parameters:
        data (numpy.ndarray): The input data to be normalized.

    Returns:
        numpy.ndarray: The normalized data.
        numpy.ndarray: The mean of the input data.
        numpy.ndarray: The standard deviation of the input data.
    """
    mean = np.mean(data, axis=0, keepdims=True)
    std = np.std(data, axis=0, keepdims=True)

    norm = (data - mean) / std
    return norm, mean, std


def noisy_interpolation(
    data: np.ndarray,
    upsampling: int = 10,
    spline_order: int = 3,
    noise_to_signal_ratio: float = 0.1,
    noise_window_length: int = 20,
) -> np.ndarray:
    """
    Upsample data using spline interpolation with added noise scaled to the standard deviation of the data.

    Parameters:
        data (ndarray): The input data to be interpolated.
        upsampling (int, optional): The factor by which to upsample the data. Default is 10.
        spline_order (int, optional): The order of the spline interpolation. Default is 3.
        noise_to_signal_ratio (float, optional): The ratio of noise to signal. Default is 0.1.
        noise_window_length (int, optional): The length of the window used to calculate the standard
            deviation of the data. Default is 20.

    Returns:
        ndarray: The interpolated data with added noise.
    """

    n, m = data.shape

    # Interpolate the data with a spline
    t = np.arange(n)
    spline = scipy.interpolate.make_interp_spline(t, data, axis=0, k=spline_order)  # type: ignore # noqa: PGH003

    t_up = np.arange(n * upsampling) / upsampling
    interpolated_data = spline(t_up)

    # Add noise to the interpolation
    # Making windows of the data to calculate the windowed standard deviation
    strided_data = []
    for i in range(m):
        data_i = data[:, i]
        strided_data_i = as_strided(
            data_i,
            shape=(n - noise_window_length + 1, noise_window_length),
            strides=(data_i.strides[0], data_i.strides[0]),
        )
        strided_data.append(strided_data_i[:, :, np.newaxis])  # type: ignore # noqa: PGH003

    strided_data = np.concatenate(strided_data, axis=-1)

    # Calculate the windowed standard deviation of the windows of the data
    windowed_std = np.std(strided_data, axis=1)
    windowed_std = np.pad(
        windowed_std, ((noise_window_length // 2, n - windowed_std.shape[0] - noise_window_length // 2), (0, 0))
    )

    # Interpolate the standard deviation of the windows with the same upscaled time as the data
    std_spline = scipy.interpolate.make_interp_spline(t, windowed_std, axis=0, k=1)  # type: ignore # noqa: PGH003
    interpolated_std = std_spline(t_up)

    # Sample noise with the same shape as the interpolated data and scale it with the interpolated standard deviation
    white_noise = np.random.normal(0, 1, interpolated_std.shape)  # type: ignore # noqa: NPY002, PGH003
    noise_std = np.sqrt(noise_to_signal_ratio) / upsampling * interpolated_std

    scaled_noise = white_noise * noise_std

    # Add the noise to the interpolated data
    return interpolated_data + scaled_noise
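The windowed standard deviation in `noisy_interpolation` is built with `as_strided`, which NumPy's documentation advises using with care because incorrect shapes or strides can read invalid memory. As a possible alternative, here is a minimal sketch of the same rolling computation with `sliding_window_view` (available since NumPy 1.20). `windowed_std` is a hypothetical helper name, not a function in `utils.py`, and the padding step is left out.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def windowed_std(data: np.ndarray, window: int) -> np.ndarray:
    """Rolling standard deviation along axis 0 for data of shape (n, m)."""
    # The view has shape (n - window + 1, m, window): the window axis is appended last
    windows = sliding_window_view(data, window_shape=window, axis=0)
    return windows.std(axis=-1)


rng = np.random.default_rng(0)
data = rng.normal(size=(50, 2))
stds = windowed_std(data, window=20)
print(stds.shape)  # (31, 2)
```

Each row `k` of the result is the standard deviation of `data[k : k + window]`, computed per column.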
120 changes: 120 additions & 0 deletions examples/wind_generator/ml_model/generator_to_onnx.py
@@ -0,0 +1,120 @@
# %%
# Imports
from pathlib import Path

import numpy as np
import onnx
import onnxruntime as rt
import tensorflow as tf
import tf2onnx

# %% Defining Paths
# This assumes that you run this file from examples/wind_generator/ml_model/
PATH_TO_CURRENT_FOLDER = Path().absolute()
MODEL_FOLDER = Path(PATH_TO_CURRENT_FOLDER, "trained_model")

saved_model_path = Path(MODEL_FOLDER, "wind_generator_wrapped")
onnx_model_path = Path(MODEL_FOLDER, "wind.onnx")

# %%
# Loading the saved keras Model
print(f"loading model from: {saved_model_path}")
keras_model = tf.keras.models.load_model(saved_model_path, compile=False)

# %%
# Converting to onnx model
# state length: 130 = 2 (wind) + 2*64 (lstm states)
input_signature = [
    tf.TensorSpec((None, 2), name="inputs"),
    tf.TensorSpec((None, 130), name="state"),
    tf.TensorSpec((None, 2), name="time"),
]
onnx_model, _ = tf2onnx.convert.from_keras(keras_model, input_signature)

# %%
# Saving onnx model
onnx.save(onnx_model, onnx_model_path)

# %%
# Loading and testing saved model to see if it works
# pass the path as a string, since not all onnxruntime versions accept Path objects
loaded_onnx_model = rt.InferenceSession(str(onnx_model_path))

all_inputs = loaded_onnx_model.get_inputs()
all_outputs = loaded_onnx_model.get_outputs()

input_names = [inp.name for inp in all_inputs]
input_shapes = [inp.shape for inp in all_inputs]

output_names = [out.name for out in all_outputs]
output_shapes = [out.shape for out in all_outputs]

## Testing

# create test input data
nr_test_inputs = 100

# split up inputs into its 3 elements
inputs, state, time = all_inputs

# create wind model to be able to define initial state / inputs
wind_generator_state = tf.zeros((1, *input_signature[1].shape[1:]))

# same dt as in training_models.py
dt = 600
upsampling = 10
dt_up = dt / upsampling
time = tf.constant([[0.0, dt_up]])

# repeatedly make predictions with fresh random noise inputs
# note: as written, every step starts from the initial (zero) state `wind_generator_state`;
# the predicted state is recorded but not fed back as input to the next prediction
wind_generator_state_keras = wind_generator_state
wind_generator_state_onnx = wind_generator_state
time_series_keras = [wind_generator_state_keras[:, :2]]
time_series_onnx = [wind_generator_state_onnx[:, :2]]
for i in range(1, nr_test_inputs):
    # random noise input
    noise_input = tf.random.normal((1, 2))
    # model inputs: noise, initial state and time
    initial_inputs = [noise_input, wind_generator_state, time]
    # predict with keras
    wind_generator_state_keras = keras_model(initial_inputs)
    time_series_keras.append(wind_generator_state_keras[:, :2])
    # predict with onnx
    # onnx file needs the input in (x,) format, so reshape tensors
    initial_inputs = [
        tf.reshape(tf.transpose(noise_input), (2,)),
        tf.reshape(tf.transpose(wind_generator_state), (130,)),
        tf.reshape(tf.transpose(time), (2,)),
    ]
    onnx_inputs = {name: [tensor] for name, tensor in zip(input_names, initial_inputs, strict=False)}
    wind_generator_state_onnx = loaded_onnx_model.run(output_names=output_names, input_feed=onnx_inputs)
    # reshape onnx output to tensor (otherwise np array)
    wind_generator_state_onnx = tf.reshape(wind_generator_state_onnx, (1, 130))
    time_series_onnx.append(wind_generator_state_onnx[:, :2])

# model output data
keras_time_series = tf.concat(time_series_keras, axis=0)
onnx_time_series = tf.concat(time_series_onnx, axis=0)
# this results in tensors of shape (nr_exp,2)
# so we can separate out wind speed and wind direction:
# 0 = wind speed
keras_wind_speed_estimate = keras_time_series[:, 0]
onnx_wind_speed_estimate = onnx_time_series[:, 0]
# 1 = wind direction
keras_wind_direction_estimate = keras_time_series[:, 1]
onnx_wind_direction_estimate = onnx_time_series[:, 1]

error = keras_wind_speed_estimate - onnx_wind_speed_estimate
mae = tf.math.reduce_mean(tf.abs(error), axis=0)
print(f"MAE between keras and onnx model wind_speed_estimate outputs: {mae}")

error = keras_wind_direction_estimate - onnx_wind_direction_estimate
mae = tf.math.reduce_mean(tf.abs(error), axis=0)
print(f"MAE between keras and onnx model wind_direction_estimate outputs: {mae}")

# assert_allclose: given two array_like objects,
# check that their shapes match and all elements are equal within the given tolerances.
# An exception is raised if the shapes mismatch or any values conflict.
np.testing.assert_allclose(keras_wind_speed_estimate, onnx_wind_speed_estimate, rtol=1e-3, atol=1e-3)
print("Sanity check: Results of keras and onnx model predictions for wind speed are the same (rtol 1e-3)!")
np.testing.assert_allclose(keras_wind_direction_estimate, onnx_wind_direction_estimate, rtol=1e-2, atol=1e-2)
print("Sanity check: Results of keras and onnx model predictions for wind direction are the same (rtol 1e-2)!")
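For an autoregressive rollout, where each predicted state is fed back as the `state` input of the next call, a minimal sketch could look like the following. `toy_step` is a hypothetical stand-in for the wrapped model so that the example runs without TensorFlow or the trained weights; its dynamics are made up, and treating the second `time` entry as the step size is an assumption. Only the state length (130 = 2 wind values + 2*64 LSTM states) and the position of the wind values in the first two entries are taken from the script.

```python
import numpy as np

STATE_SIZE = 130  # 2 (wind) + 2 * 64 (LSTM states), as in the script
WIND_SIZE = 2


def toy_step(noise: np.ndarray, state: np.ndarray, time: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for the model: maps (noise, state, time) to a new state."""
    dt = time[0, 1]  # assumption: second entry of `time` is the step size
    new_state = 0.9 * state  # creates a new array, the input state is not modified
    new_state[:, :WIND_SIZE] += 0.01 * dt * noise
    return new_state


def rollout(step, n_steps: int, dt: float, seed: int = 0) -> np.ndarray:
    """Run `step` repeatedly, feeding each output state back in as the next input."""
    rng = np.random.default_rng(seed)
    state = np.zeros((1, STATE_SIZE))
    time = np.array([[0.0, dt]])
    wind = [state[:, :WIND_SIZE]]
    for _ in range(1, n_steps):
        noise = rng.normal(size=(1, WIND_SIZE))
        state = step(noise, state, time)  # previous output becomes the next input
        wind.append(state[:, :WIND_SIZE])
    return np.concatenate(wind, axis=0)


series = rollout(toy_step, n_steps=100, dt=60.0)
print(series.shape)  # (100, 2)
```

With feedback in place, small numerical differences between two model implementations can accumulate over the rollout, so comparison tolerances may need to be chosen with the rollout length in mind.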