Skip to content

Commit ca2a5f5

Browse files
authored
Merge pull request #25 from openscm/error-type
Add Fortran ErrorV and associated wrapping code
2 parents 83b9f74 + e63704f commit ca2a5f5

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

44 files changed

+1833
-492
lines changed

.github/workflows/ci.yaml

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -113,6 +113,12 @@ jobs:
113113
# TODO: figure out whether we need/want to use other compilers too
114114
compiler: "gcc"
115115
version: "13"
116+
- name: Set DLL directory environment variable
117+
# Used when running the tests, see `tests/conftest.py` for details
118+
if: ${{ startsWith(matrix.os, 'windows') }}
119+
run: |
120+
echo "PYTHON_ADD_DLL_DIRECTORY=C:\\mingw64\\bin" >> $GITHUB_ENV
121+
116122
- uses: ./.github/actions/setup
117123
with:
118124
python-version: ${{ matrix.python-version }}

.github/workflows/release.yaml

Lines changed: 8 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -28,7 +28,6 @@ jobs:
2828
pip install -r {project}/requirements-only-tests-min-locked.txt
2929
CIBW_TEST_COMMAND: >-
3030
pytest {project}/tests
31-
MACOSX_DEPLOYMENT_TARGET: "13.0"
3231
strategy:
3332
fail-fast: false
3433
matrix:
@@ -57,6 +56,14 @@ jobs:
5756
# TODO: figure out whether we need/want to use other compilers too
5857
compiler: "gcc"
5958
version: "13"
59+
- name: Set deployment target - MacOS 13.0
60+
if: ${{ matrix.os == 'macos-13' }}
61+
run: |
62+
echo "MACOSX_DEPLOYMENT_TARGET=13.0" >> $GITHUB_ENV
63+
- name: Set deployment target - MacOS 14.0
64+
if: ${{ matrix.os == 'macos-14' }}
65+
run: |
66+
echo "MACOSX_DEPLOYMENT_TARGET=14.0" >> $GITHUB_ENV
6067
- name: Specify LLVM - windows-arm
6168
if: ${{ matrix.os == 'windows-11-arm' }}
6269
shell: pwsh

Makefile

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -49,8 +49,7 @@ test: ## run the tests (re-installs the package every time so you might want to
4949
# because it is looking for lines in `src` to be run,
5050
# but they're not because lines in `.venv` are run instead.
5151
# We don't have a solution to this yet.
52-
#
53-
# Coverage directory - needed to trick code cov to looking at the right place
52+
uv run --no-sync python scripts/inject-srcs-into-meson-build.py
5453
uv run --no-sync python -c 'from pathlib import Path; import example_fgen_basic' || ( echo "Run make virtual-environment first" && false )
5554
COV_DIR=$$(uv run --no-sync python -c 'from pathlib import Path; import example_fgen_basic; print(Path(example_fgen_basic.__file__).parent)'); \
5655
uv run --no-editable --reinstall-package example-fgen-basic pytest -r a -v tests src --doctest-modules --doctest-report ndiff --cov=$$COV_DIR

changelog/25.feature.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
Added demonstration of wrapping Fortran derived types

docs/NAVIGATION.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -6,11 +6,11 @@ See https://oprypin.github.io/mkdocs-literate-nav/
66
- [Home](index.md)
77
- [Installation](installation.md)
88
- [How-to guides](how-to-guides/index.md)
9-
- [Do a basic calculation](how-to-guides/basic-calculation.md)
10-
- [Run code in a notebook](how-to-guides/run-code-in-a-notebook.py)
9+
- [Basic demo](how-to-guides/basic-demo.py)
1110
- [Tutorials](tutorials/index.md)
1211
- [Further background](further-background/index.md)
1312
- [Dependency pinning and testing](further-background/dependency-pinning-and-testing.md)
13+
- [Wrapping Fortran derived types](further-background/wrapping-fortran-derived-types.md)
1414
- [Development](development.md)
1515
- [API reference](api/example_fgen_basic/)
1616
- [Fortran API](fortran-api/home.html)
Lines changed: 281 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,281 @@
1+
# Wrapping derived types
2+
3+
Here we describe our approach to wrapping Fortran derived types.
4+
5+
## What is the goal?
6+
7+
The goal is to be able to run MAGICC, a model written in Fortran, from Python.
8+
This means we need to be able to instantiate MAGICC's inputs in memory in Python,
9+
pass them to Fortran to solve the model and get them back as results in Python.
10+
11+
Our data is not easily represented as primitive types (floats, ints, strings, arrays)
12+
because we want to have more robust data handling, e.g. attaching units to arrays.
13+
As a result, we need to pass objects to Fortran and return Fortran derived types to Python.
14+
It turns out that wrapping derived types is tricky.
15+
Notably, [f2py](https://numpy.org/doc/stable/f2py/)
16+
does not provide direct support for it.
17+
As a result, we need to come up with our own solution.
18+
19+
## Our solution
20+
21+
Our solution is based on a key simplifying assumption.
22+
Once we have passed data across the Python-Fortran interface,
23+
there is no way to modify it again from the other side of the interface.
24+
In other words, our wrappers are not views,
25+
instead they are independent instantiations of the same (or as similar as possible) data models.
26+
27+
For example, if I have an object in Python
28+
and I pass this to a wrapped Fortran function which alters some attribute of this object,
29+
that modification will only happen on the Fortran side,
30+
the original Python object will remain unchanged
31+
(as a note, to see the result, we must return a new Python object from the Fortran wrapper).
32+
33+
This assumption makes ownership and memory management clear.
34+
We do not need to keep instances around as views
35+
and therefore do not need to worry about consistency across the Python-Fortran interface.
36+
Instead, we simply pass data back and forth,
37+
and the normal rules of data consistency within each programming language apply.
38+
39+
To actually pass derived types back and forth across the Python-Fortran interface,
40+
we introduce a 'manager' module for all derived types.
41+
42+
The manager module has two key components:
43+
44+
1. an allocatable array of instances of the derived type it manages,
45+
call this `instance_array`.
46+
The array of instances are instances which the manager owns.
47+
In practice, they are essentially temporary variables.
48+
1. an allocatable array of logical (boolean) values,
49+
call this `available_array`.
50+
The convention is that, if `available_array(i)` is `.true.`,
51+
where `i` is an integer,
52+
then the instance at `instance_array(i)` is available for the manager to use.
53+
Otherwise, the manager assumes that the instance is already being used for some purpose
54+
and therefore cannot be used for whatever operation is currently being performed.
55+
56+
This setup allows us to effectively pass derived types back and forth between Python and Fortran.
57+
58+
Whenever we need to return a derived type (or derived types) to Python, we:
59+
60+
1. get the derived type(s) from whatever Fortran function or subroutine created it,
61+
call this `derived_type_original`
62+
1. find an index, `idx`, in `available_array` such that `available_array(idx)` is `.true.`
63+
1. set `instance_array(idx)` equal to `derived_type_original`
64+
1. we return `idx` to Python
65+
- `idx` is an integer (or integers), so we can return this easily to Python using `f2py`
66+
1. we then create a Python object (or objects) with an API that mirrors `derived_type_original`
67+
using the class method `from_instance_index`.
68+
This class method is auto-generated via `pyfgen`
69+
(TODO: implement auto-generation)
70+
and handles retrieval of all the attribute values of `derived_type_original`
71+
from Fortran and sets them on the Python object that is being instantiated
72+
- we can do this as, if you dig down deep enough, all attributes eventually
73+
become primitive types which can be passed back and forth using `f2py`,
74+
it can just be that multiple levels of recursion are needed
75+
if you have derived types that themselves have derived type attributes
76+
1. we then call the wrapper module's `finalise_instance` function to free the (temporary) instance(s)
77+
that was used by the manager
78+
(we cannot access the manager module directly as it cannot be wrapped by `f2py`)
79+
- this instance is no longer needed because all the data has been transferred to Python
80+
1. we end up with a Python instance(s) that has the result
81+
and no extra/leftover memory footprint in Fortran
82+
(and leave Fortran to decide whether to clean up `derived_type_original` or not)
83+
84+
Whenever we need to pass a derived type (or derived types) to Fortran, we:
85+
86+
1. get an instance index we can use to communicate with Fortran
87+
1. call the Python object's `build_fortran_instance` method,
88+
which returns the instance index where the object was created on the Fortran side.
89+
Under the hood, this calls the wrapper module's
90+
`get_free_instance_index` and `build_instance` functions
91+
1. on the Fortran side, there is now an instantiated derived type, ready for use
92+
1. call the wrapped Fortran function of interest,
93+
except we pass the instance index instead of the actual Python object itself
94+
1. on the Fortran side, retrieve the instantiated index from the manager module
95+
and use this to call the Fortran function/subroutine of interest
96+
1. return the result from Fortran back to Python
97+
1. call the wrapper module's `finalise_instance` function to free the (temporary) instance
98+
that was used to pass the instance in the first place
99+
- this instance is no longer needed because all the data has been transferred and used by Fortran
100+
1. we end up with the result of the Fortran callable back in Python
101+
and no extra/leftover memory footprint in Fortran from the instance created by the manager module
102+
103+
## Further background
104+
105+
We initially started this project and took quite a different route.
106+
The reason was that we were actually solving a different problem.
107+
What we were trying to do was to provide views into underlying Fortran instances.
108+
For example, we wanted to enable the following:
109+
110+
```python
111+
>>> from some_fortran_wrapper import SomeWrappedFortranDerivedType
112+
113+
114+
>>> inst = SomeWrappedFortranDerivedType(value1=2, value2="hi")
115+
>>> inst2 = inst
116+
>>> inst.value1 = 5
117+
>>> # Updating the view via `inst` also affects `inst2`
118+
>>> inst2.value1
119+
5
120+
```
121+
122+
Supporting views like this introduces a whole bunch of headaches,
123+
mainly due to consistency and memory management.
124+
125+
A first headache is consistency.
126+
Consider the following, which is a common gotcha with numpy
127+
128+
```python
129+
>>> import numpy as np
130+
>>>
131+
>>> a = np.array([1.2, 2.2, 2.5])
132+
>>> b = a
133+
>>> a[2] = 0.0
134+
>>> # b has been updated too - many users don't expect this
135+
>>> b
136+
array([1.2, 2.2, 0. ])
137+
```
138+
139+
The second is memory management.
140+
For example, in the example above, if I delete variable `a`,
141+
what should variable `b` become?
142+
143+
With numpy, it turns out that the answer is that `b` is unaffected
144+
145+
```python
146+
>>> del a
147+
>>> a
148+
Traceback (most recent call last):
149+
File "<stdin>", line 1, in <module>
150+
NameError: name 'a' is not defined
151+
>>> b
152+
array([1.2, 2.2, 0. ])
153+
```
154+
155+
However, we would argue that this is not the only possibility.
156+
It could also be that `b` should become undefined,
157+
as the underlying array it views has been deleted.
158+
Doing it like this must also be very complicated for numpy,
159+
as they need to keep track of how many references
160+
there are to the array underlying the Python variables
161+
to know whether to actually free the memory or not.
162+
163+
We don't want to solve these headaches,
164+
which is why our solution does not support views,
165+
instead only supporting the passing of data across the Python-Fortran interface
166+
(which ensures that ownership is clear at all times
167+
and normal Python rules apply in Python
168+
(which doesn't mean there aren't gotchas, just that we won't introduce any new gotchas)).
169+
170+
## Other solutions we rejected
171+
172+
### Provide views rather than passing data
173+
174+
Note: this section was never properly finished.
175+
Once we started trying to write it,
176+
we realised how hard it would be to avoid weird edge cases
177+
so we stopped and changed to [our current solution][our-solution].
178+
179+
To pass derived types back and forth across the Python-Fortran interface,
180+
we introduce a 'manager' module for all derived types.
181+
This manager module is responsible for managing derived type instances
182+
that are passed across the Python-Fortran interface
183+
and is needed because we can't pass them directly using f2py.
184+
185+
The manager module has two key components:
186+
187+
1. an allocatable array of instances of the derived type it manages
188+
1. an allocatable array of logical (boolean) values
189+
190+
The array of instances are instances which the manager owns.
191+
It holds onto these: can instantiate them, can make them have the same values
192+
as results from Fortran functions etc.
193+
(I think we need to decide whether this is an array of instances
194+
or an array of pointers to instances (although I don't think that's a thing https://fortran-lang.discourse.group/t/arrays-of-pointers/4851/6,
195+
so doing something like this might require yet another layer of abstraction).
196+
Array of instances means we have to do quite some data copying
197+
and be careful about changes made on the Fortran side propagating to the Python side,
198+
I think (although we have to test as I don't know enough about whether Fortran is pass by reference or pass by value by default),
199+
array of pointers would mean change propagation should be more automatic.
200+
We're going to have to define our requirements and tests quite carefully then see what works and what doesn't.
201+
I think this is also why I introduced the 'no setters' views,
202+
as some changes just can't be propagated back to Fortran in a 'permanent' way.
203+
We should probably read this and do some thinking: https://stackoverflow.com/questions/4730065/why-does-a-fortran-pointer-require-a-target).
204+
205+
Whenever we need to return a derived type to Python,
206+
we follow a recipe like the below:
207+
208+
1. we firstly ask the manager to give us an index (i.e. an integer) such that `logical_array(index)` is `.false.`.
209+
The convention is that `logical_array(index)` is `.false.` means that `instance_array(index)` is available for use.
210+
1. We set `logical_array(index)` equal to `.true.`, making clear that we are now using `instance_array(index)`
211+
1. We set the value of `instance_array(index)` to match the the derived type that we want to return
212+
1. We return the index value (i.e. an integer) to Python
213+
1. The Python side just holds onto this integer
214+
1. When we want to get attributes (i.e. values) of the derived type,
215+
we pass the index value (i.e. an integer) of interest from Python back to Fortran
216+
1. The manager gets the derived type at `instance_array(index)` and then can return the atribute of interest back to Python
217+
1. When we want to set attributes (i.e. values) of the derived type,
218+
we pass the index value (i.e. an integer) of interest and the value to set from Python back to Fortran
219+
1. The manager gets the derived type at `instance_array(index)` and then sets the desired atribute of interest on the Fortran side
220+
1. When we finalise an instance from Python,
221+
we pass the index value (i.e. an integer) of interest from Python back to Fortran
222+
and then call any finalisation routines on `instance_array(index)` on the Fortran side,
223+
while also setting `logical_array(index)` back to `.false.`, marking `instance_array(index)`
224+
as being available for use for another purpose
225+
226+
Doing it this means that ownership is easier to manage.
227+
Let's assume we have two Python instances backed by the same Fortran instance,
228+
call them `PythonObjA` and `PythonObjB`.
229+
If we finalise the Fortran instance via `PythonObjA`, then `logical_array(index)` will now be marked as `.false.`.
230+
Then, if we try and use this instance via `PythonObjB`,
231+
we will see that `logical_array(index)` is `.false.`,
232+
hence we know that the object has been finalised already hence the view that `PythonObjB` has is no longer valid.
233+
(I can see an edge case where, we finalise via `PythonObjA`,
234+
then initialise a new object that gets the (now free) instance index
235+
used by `PythonObjB`, so when we look again via `PythonObjB`,
236+
we see the new object, which could be very confusing.
237+
We should a) test this to see if we can re-create such an edge case
238+
then b) consider a fix (maybe we need an extra array which counts how many times
239+
this index has been initialised and finalised so we can tell if we're still
240+
looking at the same initialisation or a new one that has happened since we last looked).)
241+
242+
This solution allows us to a) only pass integers across the Python-Fortran interface
243+
(so we can use f2py) and b) keep track of ownership.
244+
The tradeoff is that we use more memory (because we have arrays of instances and logicals),
245+
are slightly slower (as we have extra layers of lookup to do)
246+
and have slow reallocation calls sometimes (when we need to increase the number of available instances dynamically).
247+
There is no perfect solution, and we think this way strikes the right balance of
248+
'just works' for most users while also offering access to fine-grained memory control for 'power users'.
249+
250+
### Pass pointers back and forth
251+
252+
Example repository: https://github.com/Nicholaswogan/f2py-with-derived-types
253+
254+
Another option is to pass pointers to objects back and forth.
255+
We tried this initially.
256+
Where this falls over is in ownership.
257+
Basically, the situation that doesn't work is this.
258+
259+
From Python, I create an object which is backed by a Fortran derived type.
260+
Call this `PythonObjA`.
261+
From Python, I create another object which is backed by the same Fortran derived type instance i.e. I get a pointer to the same Fortran derived type instance.
262+
Call this `PythonObjB`.
263+
If I now finalise `PythonObjA` from Python, this causes the following to happen.
264+
The pointer that was used by `PythonObjA` is now pointing to `null`.
265+
This is fine.
266+
However, the pointer that is being used by `PythonObjB` is now in an undefined state
267+
(see e.g. community.intel.com/t5/Intel-Fortran-Compiler/DEALLOCATING-DATA-TYPE-POINTERS/m-p/982338#M100027
268+
or https://www.ibm.com/docs/en/xl-fortran-aix/16.1.0?topic=attributes-deallocate).
269+
As a result, whenever I try to do anything with `PythonObjB`,
270+
the result cannot be predicted and there is no way to check
271+
(see e.g. https://stackoverflow.com/questions/72140217/can-you-test-for-nullpointers-in-fortran),
272+
either from Python or Fortran, what the state of the pointer used by `PythonObjB` is
273+
(it is undefined).
274+
275+
This unresolvable problem is why we don't use the purely pointer-based solution
276+
and instead go for a slightly more involved solution with a much clearer ownership model/logic.
277+
We could do something like add a reference counter or some other solution to make this work.
278+
This feels very complicated though.
279+
General advice also seems to be to avoid pointers where possible
280+
(community.intel.com/t5/Intel-Fortran-Compiler/how-to-test-if-pointer-array-is-allocated/m-p/1138643#M136486),
281+
prefering allocatable instead, which has also helped shape our current solution.

docs/how-to-guides/basic-calculation.md

Lines changed: 0 additions & 10 deletions
This file was deleted.

0 commit comments

Comments
 (0)