-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add time clock to runtime #170
Conversation
…s returns a xr.DataArray. However, I cannot set xs.variables in the initialize step: 'AttributeError: cannot set attribute'
Thanks @joeperdefloep! However, I don't think The "placeholder" that I suggest in https://github.com/benbovy/xarray-simlab/issues/155#issuecomment-775507738 is for addressing the case where we want to use the main clock as an explicit dimension when declaring xarray-simlab variables in process classes. It is a different (but related) problem than this PR. We would need something a bit more advanced than a string, i.e., a custom type that xarray-simlab understands so that it can "replace" it with the actual dimension label corresponding to the main clock (later set by the user) when creating xarray Dataset objects. Although for an entirely different purpose, an example of such type (sentinel class, singleton) is attr.NOTHING. Updating runtime context entries with the main clock values (this PR) should be addressed afterwards IMO. Runtime context entries are already hard-coded, so we can just use the |
Co-authored-by: Benoit Bovy <bbovy@gfz-potsdam.de>
Co-authored-by: Benoit Bovy <bbovy@gfz-potsdam.de>
Co-authored-by: Benoit Bovy <bbovy@gfz-potsdam.de>
Hmm I'm not sure to understand what you have implemented in your last commit. Just to be sure, what we'd like to do is something like in the small example below or do you have something else in mind? @xs.process
class Foo:
var = xs.variable(dims=xs.MAIN_CLOCK, intent='out')
@xs.runtime(args='main_clock_array')
def initialize(self, main_clock):
self.var = main_clock * 2
m = xs.Model({'foo': Foo})
ds_in = xs.create_setup(
model=m,
clocks={'time': range(5)},
output_vars={'foo__var': None}
)
ds_out = ds_in.xsimlab.run(model=m)
# should print a DataArray with dimension 'time'
# and values [0, 2, 4, 6, 8]
print(ds_out.foo__var) For this example, we need to implement new functionality in xarray-simlab in order to support the two lines below:
This PR implements 2, and I suggest implementing 1 in a separate PR (Ideally, we should address 1 first) What I have in mind for 1 is just a very basic placeholder |
Yes, that was exactly what I wanted to implement! Well, actually a bit more: @xs.process
class dBiomass:
"""calculates biomass change based on light intensity
"""
maxrad = xs.variable(dims=xs.MAIN_CLOCK,intent='out')
biomass = xs.variable(intent='out')
@xs.runtime(args=['main_clock_values','main_clock_array'])
def initialize(self,clock_values,clock_array):
print(clock_values)
print(clock_array[xs.MAIN_CLOCK])
self.biomass=0
self.maxrad = main_clock
@xs.runtime(args='step_delta')
def run_step(self,dt):
self._d_biomass = dt*self.maxrad
def finalize_step(self):
self.biomass += self._d_biomass
biomass_model_raw = xs.Model({'dBiomass':dBiomass})
ds_in = xs.create_setup(
model=biomass_model_raw,
clocks={
'day':range(100)
},
master_clock='day',
input_vars={},
output_vars={'dBiomass__biomass':'day'}
) maybe end-to-end testing is not the most efficient way of adding a new feature, how do you normally tackle such things, or how would I go about debugging 1. without first implementing 2? Writing tests maybe? 🙊 Serious question though... this implements the following lines:
There is another possible problem when a process dimension happens to have the same name as the main clock dimension e.g. |
Isn't this PR and #155 initially about implementing 2? You could test 2 without having 1 implemented if that's what you mean, e.g., @xs.process
class Foo:
var = xs.variable(dims=(), intent='out')
@xs.runtime(args='main_clock_array')
def initialize(self, main_clock):
self.var = np.sum(main_clock)
m = xs.Model({'foo': Foo})
ds_in = xs.create_setup(
model=m,
clocks={'time': range(5)},
output_vars={'foo__var': None}
)
ds_out = ds_in.xsimlab.run(model=m)
assert ds_out.foo__var == 10
One major drawback is that this approach is not thread safe.
Not sure I understand, but I guess you mean that we should take care of the case where users provide time-varying values for a variable like
This is not possible in xarray-simlab, each option must have a different number of dimensions.
It would certainly be useful to access the main clock as a DataArray, but why
We should raise an error in this case. It either to the model user to choose another label for the main clock, or the model developer to use another, potentially less conflicting label in process classes. |
Co-authored-by: Benoit Bovy <bbovy@gfz-potsdam.de>
Well, it is about making the simple example work. The question was how to test 1 without 2. But I think the following example does that (testing 1 without access to time_array = np.array([0,1,2,3])
@xs.process
class Foo:
#intent cannot be 'in' if we want to set it in initialize
bar = xs.variable(dims=xs.MAIN_CLOCK,intent='in',static=True)
baz = xs.variable(intent='out')
def initialize(self):
self.bar = time_array*2
self.baz=0
@xs.runtime(args='step_delta')
def run_step(self,dt):
#now self.bar is actually [0,2,4,6]
#should be 0-6 in different steps
self.baz += dt*self.bar
m = xs.Model({'foo':Foo})
ds_in = xs.create_setup(
model=m,
clocks={
'time':time_array
},
input_vars={
'foo__bar': 5,
},
output_vars={
#add output var to get error: dims don't match
# 'Biomass__biomass':'time'
}
)
ds_in.xsimlab.run(m)
ok yes, if a user was to create multiple models with multiple with different main_clock dimensions, and processes that refer to
actually, setting What I meant here is that in the above example, in
The comment was from a user-perspective. For example if the user has clock in a datetime and wants to be able to
footnote: @xs.process
class Foo:
bar = xs.variable(intent='in')
baz = xs.variable(intent='out')
def initialize(self):
#this gives a keyerror on time-varying data
#could be solved with #155
self.baz=self.bar
@xs.runtime(args='step_delta')
def run_step(self,dt):
self.baz += dt*self.bar
m = xs.Model({'foo':Foo})
ds_in = xs.create_setup(
model=m,
clocks={
'time':time_array
},
input_vars={
'foo__bar': 5,
},
output_vars={
'foo__baz':'time'
}
)
# ds_in = ds_in.xsimlab.update_vars(model=m,input_vars={'foo':{'bar':ds_in.time*5}})
ds_in.xsimlab.run(m).foo__baz.plot() |
Looks like we don't really agree on what should be Having a singleton class instantiated as Moreover, updating import xarray as xr
import xsimlab as xs
ds_in = xr.load_dataset('some_file.nc')
# xs.MAIN_CLOCK.main_clock_dim may not be correct
ds_in.xsimlab.run() A much better approach would be to have dim_labels = [self.mclock_dim if d is MAIN_CLOCK else d for d in dim_labels] put here. If we want to allow using |
If
Ok I see. Allow declaring input variables with a For the case of output variables set in |
Ok, so I properly implemented the singleton class (thanks for pointing out the right places! I was really at a loss there, as to how to implement it, it was indeed still very ugly).
hmm, maybe you're right. I added a link to the dim in
I added a check/ValueError in
Yes that works! (and it makes sense, since the user also defined it as a dimension.) It felt a bit weird at first, because I thought I wanted the same behaviour as time-dependent inputs Now I am a bit lost as to making tests... I think I could make a full example in test_model.py |
So most things work, but there is now the problem that an output variable will show up with two dimensions in the output: @xs.process
class Foo:
a = xs.variable(intent="out", dims=(xs.MAIN_CLOCK))
@xs.runtime(args="main_clock_values")
def initialize(self, clock_values):
self.a = clock_values
@xs.runtime(args="step")
def run_step(self, step):
self.a[step] += 1
model = xs.Model({"foo": Foo})
ds_in = xs.create_setup(
model=model,
clocks={"clock": range(5)},
input_vars={},
output_vars={"foo__a": "clock"},
)
print(ds_in.xsimlab.run(model=model).foo__a) returns: <xarray.DataArray 'foo__a' (clock: 5)>
array([[1., 1., 2., 3., 4.],
[1., 2., 2., 3., 4.],
[1., 2., 3., 3., 4.],
[1., 2., 3., 4., 4.],
[1., 2., 3., 4., 4.]])
Coordinates:
* clock (clock) int64 0 1 2 3 4 where we would want a: <xarray.DataArray 'foo__a' (clock: 5)>
array([1.,2.,3.,4.,4.,])
Coordinates:
* clock (clock) int64 0 1 2 3 4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!!
In your last example, you definitely want to save only one snapshot for foo__a
instead of saving snapshots for this unchanged value at each time step, i.e.,
ds_in = xs.create_setup(
...
output_vars={"foo__a": None},
)
I don't know if there is any use case where it would make sense to update the values a variable with xs.MAIN_CLOCK
dimension in run_step
and save multiple snapshots. If there's no case, we could raise when users want to save multiple snapshots, but we can implement this later (let's see how it goes).
xsimlab/drivers.py
Outdated
@@ -210,6 +209,9 @@ def _maybe_transpose(dataset, model, check_dims, batch_dim): | |||
if xr_var is None: | |||
continue | |||
|
|||
if any([MAIN_CLOCK in d for d in model.cache[var_key]["metadata"]["dims"]]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's preferable for now that we raise an error when we create an xs.variable
if it has xs.MAIN_CLOCK
and its dimensions and intent != 'out'
, so that we catch this unsupported (yet?) case earlier.
@@ -241,6 +241,14 @@ def _create_zarr_dataset( | |||
f"its accepted dimension(s): {var_info['metadata']['dims']}" | |||
) | |||
|
|||
# set MAIN_CLOCK placeholder to main_clock dimension | |||
if self.mclock_dim in dim_labels and MAIN_CLOCK in dim_labels: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't allow declaring input variables with xs.MAIN_CLOCK
, this case shouldn't happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can still happen: when a user declares a variabke with dims
`(xs.MAIN_CLOCK,'clock')'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes! Hmm then we would probably want to check in Dataset.xsimlab.update_clocks
that the given clock labels (main clock and other clocks if provided) don't conflict with the dimension labels set in the model. I don't think we do this already.
For convenience we could add a property in xs.Model
that returns a list of all (unique) dimension labels declared in the model.
Let's fix that in another PR..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if that is the right place though: Am I right that a user may add a process after updating clocks, thus not catching it?
xsimlab/utils.py
Outdated
"""singleton class to be used as main clock dimension: update on runtime | ||
it has all behaviour that dimensions in a `xr.DataArray` normally have. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""singleton class to be used as main clock dimension: update on runtime | |
it has all behaviour that dimensions in a `xr.DataArray` normally have. | |
""" | |
"""Singleton class to be used as a placeholder of the main clock | |
dimension. | |
It will be replaced by the actual dimension label set during simulation setup | |
(i.e., ``main_clock`` argument). | |
""" |
Co-authored-by: Benoit Bovy <bbovy@gfz-potsdam.de>
Co-authored-by: Benoit Bovy <bbovy@gfz-potsdam.de>
… into access-clock
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just left some minor comments.
@@ -29,7 +33,7 @@ def __new__(cls): | |||
return _MainClockDim._singleton | |||
|
|||
def __repr__(self): | |||
return "MAIN_CLOCK (uninitialized)" | |||
return "MAIN_CLOCK (undefined)" | |||
|
|||
|
|||
MAIN_CLOCK = _MainClockDim() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to add some docstrings for this attribute, e.g.,
MAIN_CLOCK = _MainClockDim()
"""
Sentinel to indicate simulation's main clock dimension, to be
replaced by the actual dimension label set in input/output datasets.
"""
and add it in the docs (probably in API references in the variables section, since we would use it when declaring variables).
self.b = clock_array * 2 | ||
assert clock_array.dims[0] == "clock" | ||
assert all(clock_array[clock_array.dims[0]].data == [0, 1, 2, 3]) | ||
|
||
@xs.runtime(args=["step_delta", "step"]) | ||
def run_step(self, dt, n): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could remove dt
if you don't use it.
Co-authored-by: Benoit Bovy <bbovy@gfz-potsdam.de>
Co-authored-by: Benoit Bovy <bbovy@gfz-potsdam.de>
Now, when we print the model, the |
That's intended behavior. The main clock label is not defined yet and may be different from one simulation to another. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the long wait @joeperdefloep. This looks all good to me! Just one remaining issue with the doc build. Maybe we can remove MAIN_CLOCK
from api.rst
here and address this later.
doc/api.rst
Outdated
@@ -154,7 +154,7 @@ Variable | |||
|
|||
.. autosummary:: | |||
:toctree: _api_generated/ | |||
|
|||
MAIN_CLOCK |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, adding MAIN_CLOCK
here yields an error when building the docs with Sphinx (as a result the whole Variable
section is empty).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm I've checked that, and I don't get an error. That is, I did get a "could not import module MAIN_CLOCK" or something, when I didn't first python -m pip install .
, but after that it built properly.
e.g. make html
worked properly without errors.
link
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two last things and it should be all good:
- comment below
- could you update
whats_new.rst
, please?
.vscode/settings.json
Outdated
@@ -0,0 +1,3 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you remove this file, please? It should be ignored in the next PRs (#169).
Excellent @joeperdefloep! Many thanks for your great work here, and thanks for your patience! |
added placeholder, should work now
black . && flake8
whats-new.rst
for all changes andapi.rst
for new API