Support YAML config files
gipert committed Jan 16, 2025
1 parent 97a9e83 commit 368c739
Showing 5 changed files with 169 additions and 140 deletions.
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -53,7 +53,7 @@ repos:
         stages: [manual]
 
   - repo: https://github.com/hadialqattan/pycln
-    rev: "v2.4.0"
+    rev: "v2.5.0"
     hooks:
       - id: pycln
         args: ["--all"]
81 changes: 37 additions & 44 deletions docs/source/manuals/build_raw.rst
@@ -35,45 +35,42 @@ friend), decode all the data it can and save it to an LH5 file named
 .. tip::
    Check the |build_raw| documentation for a full list of useful options.
 
-When the *out_spec* argument is a dictionary or a string ending with ``.json``,
-it is interpreted as a configuration dictionary or a JSON file containing it,
-respectively. Technically, this dictionary configures a
+When the *out_spec* argument is a dictionary or a string ending with ``.json``
+or ``.yaml``, it is interpreted as a configuration dictionary or a file
+containing it, respectively. Technically, this dictionary configures a
 :class:`~.raw_buffer.RawBufferLibrary`.
 
 .. tip::
    The full configuration format specification is documented in depth in
-   :meth:`.raw_buffer.RawBufferLibrary.set_from_json_dict`.
+   :meth:`.raw_buffer.RawBufferLibrary.set_from_dict`.
 
 Let's use the following configuration file as an example:
 
-.. code-block::
-   :caption: ``raw-out-spec.json``
+.. code-block:: yaml
+   :caption: ``raw-out-spec.yaml``
    :linenos:
 
-   {
-     "ORFlashCamWaveformDecoder" : {
-       "group1-{key:07d}/raw" : {
-         "key_list" : [[1, 3], 9],
-         "out_stream" : "{filename}"
-       },
-       "group2-{key:07d}/raw" : {
-         "key_list" : [[11, 13]],
-         "out_stream" : "{filename}"
-       }
-     },
-     "OrcaHeaderDecoder" : {
-       "header-data" : {
-         "key_list" : ["*"],
-         "out_stream" : "{filename}"
-       }
-     },
-     "*" : {
-       "extra/{name}" : {
-         "key_list" : ["*"],
-         "out_stream" : "extra.lh5"
-       }
-     }
-   }
+   ORFlashCamWaveformDecoder:
+     "group1-{key:07d}/raw":
+       key_list:
+         - [1, 3]
+         - 9
+       out_stream: "{filename}"
+     "group2-{key:07d}/raw":
+       key_list:
+         - [11, 13]
+       out_stream: "{filename}"
+   OrcaHeaderDecoder:
+     header-data:
+       key_list: ["*"]
+       out_stream: "{filename}"
+   "*":
+     "extra/{name}":
+       key_list: ["*"]
+       out_stream: "extra.lh5"
 
 The first-level keys specify the names of the
 :class:`~.data_decoder.DataDecoder`-derived classes to be used in the
@@ -132,7 +129,7 @@ predefined variables are ``key`` and ``name``, but any other variable can be
 expanded by passing its value to |build_raw| as keyword argument. For example,
 for the the configuration shown above, ``filename`` must be defined like this: ::
 
-    build_raw("daq-data.orca", out_spec="raw-out-spec.json", filename="raw-data.lh5")
+    build_raw("daq-data.orca", out_spec="raw-out-spec.yaml", filename="raw-data.lh5")
 
 .. note::
    ``key`` and ``name`` can be overloaded by keyword arguments in |build_raw|.
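
The templates in the configuration above use Python format-specification syntax. As a rough illustration, and assuming the expansion behaves like plain `str.format` (this diff does not show how the library actually performs it), the group names and output stream would resolve as follows. The `expand` helper is hypothetical and exists only for this sketch.

```python
# Sketch only: plain str.format stands in for whatever expansion the
# library performs internally (assumption, not shown in this diff).
group_template = "group1-{key:07d}/raw"
channel_template = "ch{key:0>3d}/raw"
stream_template = "{filename}"


def expand(template: str, **variables) -> str:
    """Hypothetical helper: fill a template with the given variables."""
    return template.format(**variables)


group_name = expand(group_template, key=3)
channel_name = expand(channel_template, key=58)
out_stream = expand(stream_template, filename="raw-data.lh5")

print(group_name)    # group1-0000003/raw
print(channel_name)  # ch058/raw
print(out_stream)    # raw-data.lh5
```

Both `07d` and `0>3d` zero-pad the integer key, to seven and three digits respectively.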
@@ -192,29 +189,25 @@ Convert files and save them in the original directory with the same filenames
    $ # set maximum number of rows to be considered from each file
    $ legend-daq2lh5 --max-rows 100 data/*.orca
 
-Customize the group layout of the LH5 files in a JSON configuration file (see
+Customize the group layout of the LH5 files in a YAML configuration file (see
 above section):
 
-.. code-block:: json
+.. code-block:: yaml
 
-   {
-     "FCEventDecoder": {
-       "ch{key:0>3d}/raw": {
-         "key_list": [[0, 58]],
-         "out_stream": "{orig_basename}.lh5"
-       }
-     }
-   }
-   }
+   FCEventDecoder:
+     "ch{key:0>3d}/raw":
+       key_list:
+         - [0, 58]
+       out_stream: "{orig_basename}.lh5"
 
 and pass it to the command line:
 
 .. code-block:: console
 
-   $ legend-daq2lh5 --out-spec fcio-config.json data/*.fcio
+   $ legend-daq2lh5 --out-spec fcio-config.yaml data/*.fcio
 
 .. note::
-   A special keyword ``orig_basename`` is automatically replaced in the JSON
+   A special keyword ``orig_basename`` is automatically replaced in the YAML
    configuration by the original DAQ file name without extension. Such a
    feature is useful to users that want to customize the HDF5 group layout
    without having to worry about file naming. This keyword is only available
31 changes: 18 additions & 13 deletions src/daq2lh5/build_raw.py
@@ -1,7 +1,6 @@
 from __future__ import annotations
 
 import glob
-import json
 import logging
 import os
 import time
@@ -10,6 +9,7 @@
 from lgdo import lh5
 from tqdm.auto import tqdm
 
+from . import utils
 from .compass.compass_streamer import CompassStreamer
 from .fc.fc_streamer import FCStreamer
 from .llama.llama_streamer import LLAMAStreamer
@@ -49,12 +49,14 @@ def build_raw(
         Specification for the output stream.
 
         - if None, uses ``{in_stream}.lh5`` as the output filename.
-        - if a str not ending in ``.json``, interpreted as the output filename.
-        - if a str ending in ``.json``, interpreted as a filename containing
-          json-shorthand for the output specification (see :mod:`.raw_buffer`).
-        - if a JSON dict, should be a dict loaded from the json shorthand
-          notation for RawBufferLibraries (see :mod:`.raw_buffer`), which is
-          then used to build a :class:`.RawBufferLibrary`.
+        - if a str not ending with a config file extension, interpreted as the
+          output filename.
+        - if a str ending with a config file extension, interpreted as a
+          filename containing shorthand for the output specification (see
+          :mod:`.raw_buffer`).
+        - if a dict, should be a dict loaded from the shorthand notation for
+          RawBufferLibraries (see :mod:`.raw_buffer`), which is then used to
+          build a :class:`.RawBufferLibrary`.
         - if a :class:`.RawBufferLibrary`, the mapping of data to output file /
           group is taken from that.
 
@@ -72,8 +74,8 @@ def build_raw(
 
         - if None, CompassDecoder will sacrifice the first packet to determine
           waveform length
-        - if a str ending in ``.json``, interpreted as a filename containing
-          json-shorthand for the output specification (see
+        - if a str ending with a config file extension, interpreted as a
+          filename containing shorthand for the output specification (see
           :mod:`.compass.compass_event_decoder`).
 
     hdf5_settings
@@ -120,11 +122,14 @@ def build_raw(
 
     # process out_spec and setup rb_lib if specified
     rb_lib = None
-    if isinstance(out_spec, str) and out_spec.endswith(".json"):
-        with open(out_spec) as json_file:
-            out_spec = json.load(json_file)
+    allowed_exts = [ext for exts in utils.__file_extensions__.values() for ext in exts]
+    if isinstance(out_spec, str) and any(
+        [out_spec.endswith(ext) for ext in allowed_exts]
+    ):
+        with open(out_spec) as f:
+            out_spec = utils.load_dict(f)
     if isinstance(out_spec, dict):
-        out_spec = RawBufferLibrary(json_dict=out_spec, kw_dict=kwargs)
+        out_spec = RawBufferLibrary(config=out_spec, kw_dict=kwargs)
     if isinstance(out_spec, RawBufferLibrary):
         rb_lib = out_spec
     # if no rb_lib, write all data to file
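
The `utils` module this hunk relies on is not shown on the page, so the extension-based dispatch can only be sketched under assumptions. In the example below, `FILE_EXTENSIONS`, `load_dict` and `resolve_out_spec` are hypothetical stand-ins (they are not the project's `utils.__file_extensions__` or `utils.load_dict`), and the extension lists are guesses. YAML parsing assumes PyYAML is installed; it is imported lazily so the JSON path works without it.

```python
import json
import os
import tempfile

# Hypothetical stand-in for utils.__file_extensions__; the real mapping is
# not shown in this diff.
FILE_EXTENSIONS = {"json": [".json"], "yaml": [".yaml", ".yml"]}


def load_dict(path: str) -> dict:
    """Hypothetical loader: pick a parser from the file extension."""
    ext = os.path.splitext(path)[1]
    with open(path) as f:
        if ext in FILE_EXTENSIONS["json"]:
            return json.load(f)
        if ext in FILE_EXTENSIONS["yaml"]:
            import yaml  # PyYAML, imported lazily; assumed to be installed

            return yaml.safe_load(f)
    raise ValueError(f"unsupported config file extension: {ext!r}")


def resolve_out_spec(out_spec):
    """Return a dict for config files; return anything else unchanged."""
    allowed_exts = tuple(e for exts in FILE_EXTENSIONS.values() for e in exts)
    # str.endswith accepts a tuple, so no explicit any() loop is needed
    if isinstance(out_spec, str) and out_spec.endswith(allowed_exts):
        return load_dict(out_spec)
    return out_spec


# Usage: a JSON config is loaded, a plain output filename passes through.
with tempfile.TemporaryDirectory() as tmp:
    cfg_path = os.path.join(tmp, "raw-out-spec.json")
    config = {"*": {"extra/{name}": {"key_list": ["*"], "out_stream": "extra.lh5"}}}
    with open(cfg_path, "w") as f:
        json.dump(config, f)
    loaded = resolve_out_spec(cfg_path)

passthrough = resolve_out_spec("raw-data.lh5")
```

Passing a tuple to `str.endswith` is one way to express the same check as the `any([...])` construct in the hunk; whether the project would prefer it is a style choice.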