-
Hi Enrico, Only some partial answers and questions here... First, regarding the shape of the data. You wrote:
If so, I don't see any problem in principle with the simulator producing sequences of different lengths. In one way of thinking about it, all that data has the same "shape" in that it lives in the same abstract space. This is similar to a Poisson point process that emits a variable number of events, where each event might be vector-valued. But if this is the case, the likelihood you are trying to model is going to be more challenging, and you will need a NN model that can handle your shape of data. I don't think you want to imagine this data as being in R^N and pad; you want to think of it as a sequence and probably use an autoregressive model.
If the termination time of the simulator is physical / meaningful, then that time is an important part of the data. In that case you might imagine the output data being {(t_i, x(t_i))} pairs where the last entry is your t_2. It's like the end of a sentence, and modeling p(stop) and 1-p(stop) are important parts of the implicit likelihood.
If you are randomly sampling some time between Finally, the It's not totally clear to me if
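To make the "sequence with a stop marker" picture concrete, here is a minimal sketch in plain NumPy. It is only an illustration of the data layout, not something taken from swyft or sbi: the helper name `to_event_sequence` and the three-column layout (t_i, x_i, stop_i) are assumptions made for this example, and x is taken to be scalar for simplicity.

```python
import numpy as np

def to_event_sequence(times, values):
    """Pack a trajectory into rows (t_i, x_i, stop_i).

    stop_i is 1.0 only on the last row, so the termination time is
    part of the data, like an end-of-sentence token.
    (Hypothetical helper, not part of any library.)
    """
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    if times.ndim != 1 or times.size == 0 or times.shape != values.shape:
        raise ValueError("times and values must be non-empty 1-D arrays of equal length")
    stop = np.zeros_like(times)
    stop[-1] = 1.0  # mark the final event of this run
    return np.stack([times, values, stop], axis=1)

# Two runs of different length live in the same "space" of (t, x, stop) rows.
short_run = to_event_sequence([0.0, 1.0, 2.0], [1.0, 0.5, 0.25])
long_run = to_event_sequence([0.0, 1.0, 2.0, 3.0, 4.0], [1.0, 0.9, 0.8, 0.7, 0.6])
```

An autoregressive model would then consume these rows one at a time and predict, at each step, both the next value and whether the sequence stops.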
-
Hi Enrico, I will add a few more swyft-specific aspects to the discussion:
Simulator output with different shapes is something that can be handled with simulation-based inference, or Bayesian inference in general. In that case, the shape of the output (for instance the number of bins in a time sequence, or the number of observed gamma-ray photons during some observational period) is itself a random variable. Observing an event with a specific shape would carry information about the underlying model parameters.
In your case, however, I'm not sure if you are in that situation. What matters is the shape of the observational data, not the shape of the physics simulator output. If I understand correctly, in your example your observational data would always cover [t_0, t_2], no matter whether t_1 < t_2 or t_1 > t_2. For values t_1 < t < t_2 you then simply have to simulate what the detector would measure if the event was shorter than your observational period. If you simulated SN light curves, for instance, I guess that would mean doing zero padding.
In the case where other complementary observations would provide additional information about t_1, you could feed these observations into swyft in addition to the time sequence data, but details will depend on the specifics of your situation. In that case you would have multiple observational states, which in swyft means the dictionary-valued output of the simulator would provide separately the time-sequence data as well as the estimator for t_1.
In cases where the observational data itself varies in shape, e.g. where some of the time bins are missing, and you know deterministically for a given observation that they are missing, you could provide the zero-padded time sequence and, additionally, a boolean vector that indicates which bins are measured and which ones are not.
You then need some appropriate network structure, in swyft in the Implementing custom "head networks" for handling such situations is possible in swyft right now, but can be a bit tricky. Please ask if you have questions! You can look for examples with Best,
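As a rough illustration of the "zero-padded sequence plus boolean vector" idea, here is a sketch in plain NumPy. It does not use the swyft API: the function name `pack_observation` and the dictionary keys `"series"` and `"measured"` are made up for this example, and it only shows the kind of dictionary-valued output a simulator could return.

```python
import numpy as np

def pack_observation(series, measured):
    """Return a dict with a zero-padded series and a boolean 'measured' vector.

    Bins flagged as not measured are set to 0.0 so the array is always
    finite and of fixed shape; the mask tells the network which zeros are
    padding rather than data. (Hypothetical helper and key names.)
    """
    series = np.asarray(series, dtype=float)
    measured = np.asarray(measured, dtype=bool)
    if series.shape != measured.shape:
        raise ValueError("series and measured must have the same shape")
    padded = np.where(measured, series, 0.0)  # zero out unmeasured bins
    return {"series": padded, "measured": measured}

# Toy example: 6 time bins, the last two are known to be missing.
raw = np.array([1.0, 0.8, 0.6, 0.4, np.nan, np.nan])
flags = np.array([True, True, True, True, False, False])
obs = pack_observation(raw, flags)
```

Both entries have a fixed shape for every simulation, so they can be batched, and a head network can take the two arrays together as input.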
-
Thanks Kyle and Christoph, despite my very vague and general question you both touched upon very important issues that I was considering. Let me make some remarks that should make the problem above more specific, and I will also try to refer to each of your answers.
My original idea was to simply assign zero likelihood to all the model parameters resulting in a t_1 shorter than t_2, as mentioned by @cranmer. In my case t_2 is given by the observation data, while t_1 is deterministically determined by other parameters of the simulator. This is a rather common choice if you think of the simulator as the entire generative model for the data: to compare generated data and observed data they need to have the same shape (I think this is a requirement). With observation data for a fixed t_i sequence (no sampling of t_obs in this case @cranmer ) and a simulator that may fail to cover the entire time sequence for physical reasons, I have done simulation-based inference with However, this will "throw away" the entire set of parameters, while sometimes I would still like to check what data the simulator outputs and how it compares to the observations, even if only partially.
Let me briefly comment on the head network. Those can really be useful to compress information and automatically extract summary statistics, even in complicated cases like missing data. A priori, there is no "optimal" network structure, so one can craft something and proceed by trial and error, I guess.
One more thing for @cweniger : observations come with error bars due to instrumental precision and other modeling effects on the experimental side (common in astrophysics and particle physics). Am I correct in my understanding that Thank you very much. I really appreciate all the work you guys are putting into developing these tools for simulation-based inference!
-
Hello,
I have been using swyft (and sbi) for simulation-based inference in different physical systems (mostly for my own curiosity and investigating the connection with parameter inference of ODEs and PDEs).
I would like to ask a question that might be more general than how I formulated it in the title of this discussion.
Let me try to introduce the problem with an example:
t₀ and a final time t₁
x₀
when computing the posterior is always in a fixed range [t₀, t₂], where t₂ can be different and larger than t₁
t₁ < t₂ depending on the parameter values in the dynamical model (so it depends on the sampled parameters from the prior)
In swyft it is possible to output nan from the simulator in some cases where there is failure or when we do not want those parameters to be considered in inference (and sbi does the same).
It is possible to have the simulator output data with different shapes (e.g. t₁-t₀ values or t₂-t₀ values) at each step. However, I don't think this would be ideal for training, and "padding" the remaining values (e.g. t₂-t₁ values) with nan might result in completely discarding those samples during training. "Padding" with different values might work in some cases, but I wonder if swyft has some built-in mechanism for this.
I would also be interested in understanding if having different output shapes for different samples than the actual observational data we need for the posterior actually makes sense at all in the context of simulation-based inference (or Bayesian inference in general, for example).
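To illustrate the concern about nan padding, here is a small NumPy sketch. It does not call swyft or sbi; it only mimics a training pipeline that drops every sample containing a nan, which is an assumption made for this example. The helper name `pad_to_length` and the toy runs are hypothetical.

```python
import numpy as np

def pad_to_length(values, length, fill):
    """Pad a 1-D array on the right with `fill` up to `length` entries."""
    values = np.asarray(values, dtype=float)
    if values.size > length:
        raise ValueError("input is longer than the target length")
    out = np.full(length, fill, dtype=float)
    out[: values.size] = values
    return out

# Four toy runs: two cover the full window (5 bins), two stop early.
runs = [np.ones(5), np.ones(3), np.ones(5), np.ones(2)]

# Padding with nan: a filter that drops rows containing nan removes
# every run that stopped early.
nan_padded = np.stack([pad_to_length(r, 5, np.nan) for r in runs])
keep = ~np.isnan(nan_padded).any(axis=1)

# Padding with a finite value keeps all runs in the training set.
zero_padded = np.stack([pad_to_length(r, 5, 0.0) for r in runs])
```

With nan padding only the two full-length runs survive the filter, so the short runs contribute nothing; with a finite fill value all four rows remain available, at the cost of the network having to learn what the fill value means.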
Thank you,
Enrico