pytorch-forecasting and dsipts v2 API design #39
Conversation
(bringing the discussion from Discord to here)

Regarding: "Given that, should this be positioned at 'model' level instead?"

Yes, that is also what @thawn said - let's try to have as few tensors as possible. My answer: perhaps, but it is hard to actually write it down. Some comments:
I think not, since it is a "data" concern:
What I'm thinking about is that static variables are a specific kind of past- and future-known exogenous variables that are constant throughout time. In this sense, should we really separate them? I feel they are essentially the same at a "meta" level; the difference is how the model wants to use them. One example: we could use Fourier features as static variables in TFT. I'm not sure there is anything that prevents us from using them that way, since these features are known for every timepoint in past and future.
More thoughts:
To me, this is not correct on the conceptual level. The formal model I typically adopt is that all data points are indexed. Time-varying (or "dynamic") variables change over time, and observations (entries, values, etc.) are indexed by (instance index, variable name, time index), whereas static variables are indexed only by (instance index, variable name). If you adopt the usual model of hierarchical data, where a combination of indices indexes an observation, it becomes clear that static variables live at different "levels" of the indexing hierarchy. Mapping static variables onto a column at the lower level (plus time index) makes them constant as a side effect of the representation, rather than as a conceptual feature.
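To make the two indexing levels concrete, here is a minimal sketch (not from the thread; column and value names are hypothetical):

import pandas as pd

# Dynamic observations: one value per (instance, variable, time) combination,
# represented here as a long table indexed by (instance, time).
dynamic = pd.DataFrame(
    {
        "series_id": ["a", "a", "b", "b"],
        "time": [0, 1, 0, 1],
        "temperature": [20.1, 21.3, 18.7, 19.0],
    }
).set_index(["series_id", "time"])

# Static variables: one value per (instance, variable) combination,
# indexed by instance only - a different level of the hierarchy.
static = pd.DataFrame(
    {"series_id": ["a", "b"], "region": ["north", "south"]}
).set_index("series_id")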
There are two ideas here, I believe: (a) nesting metadata, and (b) returning metadata not via __getitem__. I am very open to (a) and think this is perhaps a good idea. Regarding (b), could you explain how the retrieval of the metadata would work?
Yes, that would be a great idea - do you want to draw up a design? And, surely this exists elsewhere, too?
I understand your point that, since they can be indexed by a different number of indices while keeping the information, they are, and can be classified as, different things. Nevertheless, I believe the challenge here is projecting this higher-dimensional space of concepts (e.g. classifying variables by instance, variable name, time index, discrete or continuous, etc.) into a lower-dimensional space that still makes the task (implementing different NN-based forecasters) feasible. In this regard, the separation of data into 1. target variable, 2. features with past knowns, 3. features with past/future knowns is possibly the lowest-dimensional representation, and it can be further subdivided into other categories/concepts, such as features with past/future knowns that are static and do not change over time (we can imagine it as a tree).
I am thinking in the opposite direction: static variables are features, they belong to this "feature" concept: relevant data other than the target variable used as input to predict the target variable. They are a special subset of features that do not change over time, but this does not place them outside the set of features. In particular, being part of this subset allows them to be represented with fewer index levels, as you mentioned. Some models, such as DeepAR, may still want to use "static variables" as exogenous (I believe this is the case, from inspecting the implementation and the corresponding property).
Good point... If this is a direction we would like to go, I believe there would possibly be two solutions:
But this is not an intrinsic property of the static variables - it is how the model chooses to use them. So it is not a data property but a modelling choice, and specifically a "reduction" choice where we reformat the data (like in the reduction forecaster, just a different way of making the data "rectangular-formatted"). Therefore, following the domain-driven design philosophy, the choice of reformatting the static variables falls to the model rather than the data layer.
I wanted to do that too, but was voted down and convinced by good arguments in the Jan 20 meeting, reference: sktime/pytorch-forecasting#1736. The key argument for me was that current vignettes in all packages reviewed rely on the default
I think that is a hypothetical possibility, but a bad idea. We would be rewriting layers outside the scope of the package. I still have nightmares from our attempts to rewrite ... @prockenschaub did great work, and I think our designs would have been successful in the end - it just turned out we were getting pulled into the scope of rewriting ... The lesson from that is, I suppose: if you are building a house and suddenly find yourself wanting to tear down your neighbour's, and then the entire street, there is maybe something architectural to rethink.
Hi everyone! I have an idea for the D1 layer - not sure if this will work, but I thought we could reduce the metadata by using dicts in D1.

In this way we have a little less metadata.
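The concrete example appears to have been lost in the extraction; a hedged sketch of what such metadata dicts could look like, reusing the attribute names that appear further below (col_map, st_col_type, y_types, x_types) with purely hypothetical column names:

# Hypothetical D1 metadata, grouped into a few dicts instead of many attributes.
col_map = {
    "y": ["sales"],                  # target column(s)
    "x": ["price", "promo"],         # exogenous feature columns
    "st": ["store_size", "region"],  # static columns
}
st_col_type = {"store_size": "continuous", "region": "categorical"}
y_types = {"sales": "continuous"}
x_types = {"price": "continuous", "promo": "categorical"}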
As we are using just ... (although I'm not sure how much of this is feasible 😅) - example:

from lightning.pytorch import LightningModule

class MyModel(LightningModule):
    def setup(self, stage=None):
        # pull the metadata from the D1 dataset exposed by the datamodule
        dataset = self.trainer.datamodule.train_dataset
        self.col_map = dataset.col_map
        self.st_col_type = dataset.st_col_type
        self.y_types = dataset.y_types
        self.x_types = dataset.x_types

    def forward(self, x):
        # now we have the metadata available for the forward pass
        pass

    def training_step(self, batch, batch_idx):
        # training logic
        pass
More examples: https://lightning.ai/docs/pytorch/stable/data/datamodule.html#what-is-a-datamodule

Or we can do this:

from torch.utils.data import Dataset

class DecoderEncoderData(Dataset):
    def __init__(self, tsd: PandasTSDataSet, **params):
        self.tsd = tsd  # store dataset reference
        # access metadata from the dataset (D1)
        self.col_map = tsd.col_map
        self.st_col_type = tsd.st_col_type
        self.y_types = tsd.y_types
        self.x_types = tsd.x_types

    def __getitem__(self, idx):
        sample = self.tsd[idx]
        return sample
A doubt: are we using ...?

What I meant is not using this tensor in every part of the code - just having code that encapsulates the input data tensor together with this metadata. This is something documented in PyTorch's documentation.
I would argue that "static" features are not a real phenomenon; they are a separation made in the original paper. Nothing impedes the usage of other past- and future-known features in the same way that "static" features are used. One example of a use case that might encounter difficulties when separating static and non-static variables in the data layer is as follows: what if I want to use non-constant features (such as Fourier terms, which are known for both the past and future) in the same way that static variables are used in TFT? This approach could be interesting if we want the gating mechanisms to depend on seasonality. For example, holidays have different effects depending on the day of the week, and using sine and cosine functions as static features could be helpful in this scenario.
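For illustration only, a minimal sketch (not from the thread; all names are hypothetical) of weekly Fourier terms, which can be computed for any past or future timestamp and could in principle be routed wherever a model expects "static" context:

import numpy as np
import pandas as pd

# Weekly Fourier terms, computable for arbitrary past and future dates.
idx = pd.date_range("2024-01-01", periods=14, freq="D")
t = np.arange(len(idx))
fourier = pd.DataFrame(
    {
        "sin_weekly": np.sin(2 * np.pi * t / 7),
        "cos_weekly": np.cos(2 * np.pi * t / 7),
    },
    index=idx,
)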
Hi @phoeenniixx! I believe it is ...

Thanks for clarifying this, @felipeangelimvieira!
@phoeenniixx, thanks a lot for your useful thoughts!
I think the latter is a rebrand of the former; the package and the project have been renamed.
Interesting - how would the entire workflow look if we do this? Though, using ...
That's a very interesting idea! Following up on this, I would actually put even more into single dicts, and use strings (like in pandas) for types. D1:
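The concrete dict seems to be missing from the extracted thread; a hedged sketch of what "everything in one dict, with pandas-style type strings" could look like (key and column names are hypothetical):

# Hypothetical single metadata dict for D1, with pandas-style dtype strings.
metadata = {
    "cols": {
        "y": ["sales"],
        "x": ["price", "promo", "day_of_week"],
        "st": ["store_size", "region"],
    },
    "col_type": {
        "sales": "float64",
        "price": "float64",
        "promo": "category",
        "day_of_week": "category",
        "store_size": "float64",
        "region": "category",
    },
    "col_known": {  # which exogenous columns are known in the future
        "price": "known",
        "promo": "known",
        "day_of_week": "known",
    },
}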
Thanks for the comments, @fkiraly!
Well, I have not thought this through, but we can do something like this:

# Layer D1
class PandasTSDataSet:
    def __init__(self, df, y_cols, x_cols, st_cols, y_types, x_types, st_types):
        self.df = df
        # collect the column and type information into a single metadata container
        self.metadata = {
            'y_cols': y_cols,
            'x_cols': x_cols,
            'st_cols': st_cols,
            'y_types': y_types,
            'x_types': x_types,
            'st_types': st_types,
        }

    def __getitem__(self, idx):
        # returns only tensors, no metadata
        return {
            't': tensor(...),
            'y': tensor(...),
            'x': tensor(...),
            'group': tensor(...),
            'st': tensor(...),
            't_f': tensor(...),
            'x_f': tensor(...),
        }

    def get_metadata(self):
        return self.metadata
# Layer D2
class DecoderEncoderDataModule(LightningDataModule):
    def __init__(self, d1_dataset, batch_size=32, num_workers=4):
        super().__init__()
        # initialize params
        self.d1_dataset = d1_dataset
        self.batch_size = batch_size
        self.num_workers = num_workers
        self.metadata = None  # will be set in setup()

    def setup(self, stage=None):
        # get metadata from the D1 layer during setup
        self.metadata = self.d1_dataset.get_metadata()
        if stage == 'fit' or stage is None:
            # any train-specific setup
            pass
        if stage == 'test' or stage is None:
            # any test-specific setup
            pass

    def train_dataloader(self):
        return DataLoader(
            self.d1_dataset,
            batch_size=self.batch_size,
            num_workers=self.num_workers,
        )
# Layer M
class DecoderEncoderModel(LightningModule):
    def __init__(self):
        super().__init__()
        self.metadata = None

    def setup(self, stage=None):
        # get metadata from the datamodule during setup
        self.metadata = self.trainer.datamodule.metadata
        # initialize the layer T model using the metadata

    def forward(self, x):
        # forward logic
        pass
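As a hedged usage sketch (assuming the three classes above as written; the input frame and column lists are placeholders, not from the thread), this is how the metadata would flow from D1 to D2 to M through Lightning's setup hooks:

from lightning.pytorch import Trainer
import pandas as pd

# Hypothetical wiring: Trainer.fit() calls setup() on both the datamodule and the
# model, so the model can read metadata from trainer.datamodule before training.
df = pd.DataFrame()  # placeholder for the actual long-format time series frame
d1 = PandasTSDataSet(
    df, y_cols=['y'], x_cols=['x1'], st_cols=['st1'],
    y_types={'y': 'float'}, x_types={'x1': 'float'}, st_types={'st1': 'category'},
)
dm = DecoderEncoderDataModule(d1, batch_size=64)
model = DecoderEncoderModel()

trainer = Trainer(max_epochs=10)
trainer.fit(model, datamodule=dm)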
Well yes, but I found this was the closest way in which data and related fields are handled in the model. Or, as I mentioned earlier, we can just pass the metadata directly to other layers by doing something similar to this. Here we can even define a function like get_metadata:

class DecoderEncoderData(Dataset):
    def __init__(self, tsd: PandasTSDataSet, **params):
        self.tsd = tsd  # store dataset reference
        # access metadata from the dataset (D1)
        self.metadata = tsd.get_metadata()

    def __getitem__(self, idx):
        sample = self.tsd[idx]
        # other required additions to ``sample``
        return sample

In this way, we don't deviate from ...
Yeah! We can do that as well - it didn't strike me at the time. Thanks!
@phoeenniixx, brilliant! Great thinking! In retrospect - looking at your post, it looks like with layer D2 we were trying to reinvent the ... Nice! I will need some time to digest, but I think we are getting close now.
Thanks, @fkiraly!

Well, I didn't find any other way in ...
An enhancement proposal and design document towards v2 of pytorch-forecasting and dsipts. More context here: sktime/pytorch-forecasting#1736