Consider revamping the structure of the HDF5 files #78

@lucabaldini

Description

I'd like to revisit the decision to use PyTables vs. h5py, and I think we got a few things wrong in the initial layout that I would like to think through properly.

  • we should probably store all the metadata as one or more tables, rather than fiddling with attributes (I think the attribute-heavy approach comes from our experience with FITS headers, but it is not necessarily what you want in other formats)
  • while looking around, I came across the idea of storing variable-sized arrays as a single long array holding all the values, plus a per-event scalar array of offsets into that big array, one per event.
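On the first point, a minimal sketch of metadata as a table rather than attributes: a NumPy structured array, which both h5py and PyTables can write directly as a table-like dataset. The field names and types here are hypothetical, just to illustrate the idea; the actual columns would come from the detector/DAQ configuration.

```python
import numpy as np

# Hypothetical run-info columns; the real schema would mirror the DAQ config.
run_info_dtype = np.dtype([
    ("run_id", np.int64),
    ("start_time", np.float64),      # e.g., POSIX timestamp
    ("software_version", "S16"),     # fixed-width byte string
])

# One row per run (or per configuration snapshot).
run_info = np.array([(42, 1.7e9, b"1.4.0")], dtype=run_info_dtype)

# Writing it is then one call in either library (not executed here):
#   h5py:     f.create_dataset("meta/run_info", data=run_info)
#   PyTables: h5file.create_table("/meta", "run_info", obj=run_info)
```

The nice property is that the same structured dtype serves as the schema on both the write and the read side, instead of a loose bag of attributes.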

Here is a suggested layout from ChatGPT:

/meta/
    detector_config      (dataset or group of datasets)
    daq_settings         (dataset or group of datasets)
    run_info             (attrs: run_id, start_time, software_version, ...)

/events/
    trigger_id           (N,) int64
    timestamp            (N,) int64 or float64
    roi                  (N,4) int32  # e.g. [x0, y0, w, h] or [x_min, x_max, y_min, y_max]
    pha_offsets          (N+1,) int64 # prefix-sum offsets into flat pha_values
    pha_values           (M,) uint16/uint32  # concatenation of all per-event pixel PHAs
    # optional:
    pha_shape            (N,2) int16  # if ROI implies pixel count, may be redundant
    event_flags          (N,) uint32

/sim/
    truth/
        trigger_id       (N_sim,) int64   # or event_index
        ... scalar truth columns ...
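The `pha_offsets`/`pha_values` pair above is the prefix-sum trick for ragged per-event data: event `i`'s pixel values live at `pha_values[pha_offsets[i]:pha_offsets[i+1]]`. A small sketch of building and reading back that layout (pure NumPy; the arrays would be written as ordinary fixed-type HDF5 datasets):

```python
import numpy as np

# Variable-length per-event PHA arrays (ragged data).
event_phas = [
    np.array([12, 7, 45], dtype=np.uint16),
    np.array([3, 99], dtype=np.uint16),
    np.array([8], dtype=np.uint16),
]

# Flatten into one long (M,) values array plus (N+1,) prefix-sum offsets.
pha_values = np.concatenate(event_phas)
lengths = np.array([len(a) for a in event_phas], dtype=np.int64)
pha_offsets = np.concatenate(([0], np.cumsum(lengths)))

def event_pha(i):
    """Recover the PHA array for event i from the flat layout."""
    return pha_values[pha_offsets[i]:pha_offsets[i + 1]]
```

Both arrays are contiguous and chunk/compress well, which is the main advantage over HDF5 variable-length types; the invariant to maintain is `pha_offsets[-1] == len(pha_values)`.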
