I'd like to revisit the decision to use PyTables vs. h5py, and I think we got a few things wrong in the initial layout that I would like to think through properly.
- We should probably store all the metadata as one or more tables, rather than fiddling with attributes. (I think the attribute approach comes from our experience with FITS headers, but it is not necessarily what you want in other formats.)
- Looking around, I came across the idea of storing variable-sized arrays as a single long, flat array holding all the values, plus a second per-event array holding the offset of each event into the flat array.
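
To make the second point concrete, here is a minimal NumPy sketch of the flat-values-plus-offsets idea. The event contents are made up, and the names `pha_values`, `pha_offsets`, and `get_event` are only illustrative, not a decided API.

```python
import numpy as np

# Hypothetical per-event PHA arrays of different lengths (made-up values).
events = [
    np.array([12, 40, 7], dtype=np.uint16),
    np.array([], dtype=np.uint16),  # an empty event is allowed
    np.array([5, 9, 33, 2, 18], dtype=np.uint16),
]

# One flat array holding all values back to back.
pha_values = np.concatenate(events)

# Prefix-sum offsets with N + 1 entries:
# event i occupies pha_values[pha_offsets[i]:pha_offsets[i + 1]].
lengths = np.array([len(e) for e in events], dtype=np.int64)
pha_offsets = np.zeros(len(events) + 1, dtype=np.int64)
np.cumsum(lengths, out=pha_offsets[1:])


def get_event(i):
    """Return the values of event i as a view into the flat array."""
    return pha_values[pha_offsets[i]:pha_offsets[i + 1]]
```

Storing N + 1 offsets (rather than N offsets plus N lengths) means the length of every event, including the last one, is just the difference of two neighbouring entries.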
Here is a suggestion from ChatGPT:
    /meta/
        detector_config   (dataset or group of datasets)
        daq_settings      (dataset or group of datasets)
        run_info          (attrs: run_id, start_time, software_version, ...)
    /events/
        trigger_id    (N,)    int64
        timestamp     (N,)    int64 or float64
        roi           (N,4)   int32           # e.g. [x0, y0, w, h] or [x_min, x_max, y_min, y_max]
        pha_offsets   (N+1,)  int64           # prefix-sum offsets into flat pha_values
        pha_values    (M,)    uint16/uint32   # concatenation of all per-event pixel PHAs
        # optional:
        pha_shape     (N,2)   int16           # if ROI implies pixel count, may be redundant
        event_flags   (N,)    uint32
    /sim/
        truth/
            trigger_id    (N_sim,)  int64     # or event_index
            ... scalar truth columns ...
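
As a rough sketch of how part of this layout could be written and read back with h5py (one of the two candidates, not a decision): the data are random, the ROI convention is assumed to be `[x0, y0, w, h]`, and `run_info` is stored as a one-row table with a compound dtype (following the first bullet above) rather than as attributes. The file name, the field values, and the row-major pixel order are all assumptions made for illustration.

```python
import h5py
import numpy as np

rng = np.random.default_rng(0)
n_events = 4

# Hypothetical ROIs as [x0, y0, w, h]; all numbers are made up.
roi = np.column_stack([
    rng.integers(0, 100, n_events),
    rng.integers(0, 100, n_events),
    rng.integers(1, 5, n_events),
    rng.integers(1, 5, n_events),
]).astype(np.int32)

# Number of pixels per event, then prefix-sum offsets with N + 1 entries.
n_pix = roi[:, 2].astype(np.int64) * roi[:, 3]
pha_offsets = np.concatenate([[0], np.cumsum(n_pix)]).astype(np.int64)
pha_values = rng.integers(0, 4096, int(pha_offsets[-1])).astype(np.uint16)

# Metadata as a one-row table (compound dtype) instead of attributes.
run_info = np.array(
    [(42, 1.7e9, b"0.1.0")],
    dtype=[("run_id", "i8"), ("start_time", "f8"), ("software_version", "S16")],
)

# In-memory file (core driver, no backing store) so nothing is left on disk.
with h5py.File("layout_sketch.h5", "w", driver="core", backing_store=False) as f:
    f.create_dataset("meta/run_info", data=run_info)
    ev = f.create_group("events")
    ev.create_dataset("trigger_id", data=np.arange(n_events, dtype=np.int64))
    ev.create_dataset("roi", data=roi)
    ev.create_dataset("pha_offsets", data=pha_offsets)
    ev.create_dataset("pha_values", data=pha_values)

    # Read back one event: only its slice of the flat array is loaded.
    i = 2
    start, stop = (int(v) for v in ev["pha_offsets"][i:i + 2])
    w, h = (int(v) for v in ev["roi"][i, 2:4])
    pha = ev["pha_values"][start:stop].reshape(h, w)  # assumes row-major pixels
```

With this ROI convention the optional `pha_shape` column is indeed redundant, since `(h, w)` can be recovered from `roi`.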