Commit

improve flatfile metadata docstring
rizac committed Jun 17, 2024
1 parent 4a21d9a commit 4bc673e
Showing 1 changed file with 16 additions and 22 deletions.
38 changes: 16 additions & 22 deletions egsim/smtk/flatfile_metadata.yaml
@@ -1,26 +1,20 @@
 # Flatfile metadata registry. Syntax (in YAML format):
 #
 # <column_name>:
-# dtype: The data type (default when missing: null). Supported values are: null,
-# int, float, str, bool, datetime (ISO formatted) or, if the column can
-# take only a fixed number of possible values (aka "categories"), the list
-# of values, which must be all the same supported data type. If non-null,
-# the data type will be used for data validation
-# default: The default value used to fill missing data (e.g. empty cell, null, NaN).
-# Note that in Python pandas, missing data in integers and booleans (dtype
-# int or bool) is invalid, so the presence or not of a default (usually 0
-# for int and false for bool) will also affect data validation
-# type: The column type (optional). Supported values are: rupture, site, distance,
-# intensity. If a column's type and name match an OpenQuake ground motion
-# property (rupture parameter, site parameter, distance measure) required to
-# compute the model predictions or an observed intensity measure required to
-# compute residuals, its values will be used in computation. Otherwise, the
-# column is intended to be used for other purposes (e.g. IDs assignment)
-# alias: The column alias(es), as string or list of strings, indicating valid
-# alternative names for the column. If a type is provided, the OpenQuake
-# name, if any, can also be provided here, but intensity measure columns
-# cannot have aliases and must be spelled capitalized as in OpenQuake (e.g.
-# PGA SA PGV)
+# dtype: The data type: Supported values are: null (the default when missing), int,
+# float, str, bool, datetime. Provide a list of values (all the same dtype)
+# for categorical data, i.e. when the column can only take on one of the
+# given values. If non-null, the data type will be used for data validation
+# default: The default value used to fill missing data, e.g. empty cell, null, NaN.
+# In Python pandas, all dtypes support missing data except int or bool: in
+# these cases, supply a default if you want data validation to pass
+# type: The column type. Supported values are: rupture, site, distance, intensity
+# (rupture parameter, site parameter, distance measure, intensity measure).
+# Required only if the column denotes an OpenQuake parameter or measure,
+# optional otherwise
+# alias: The column alias(es), as string or list of strings. If you want to rename
+# an OpenQuake parameter or distance, set the OpenQuake name as alias.
+# Note: intensity measure columns (e.g. PGA, SA, PGV) cannot have aliases
 # help: The field help (optional), used to provide documentation
 # ">" (with quotation marks because > and < are special characters in YAML)
 # The minimum value (endpoint excluded) of the column data. Currently used
@@ -200,8 +194,8 @@ region:
 default: 0
 help: >-
 The ESHM2020 attenuation cluster region to which the site belongs
-(https://doi.org/10.1007/s10518-020-00899-9). 0 (default): unknown, 1: average / slower,
-2: average / faster, 3: fast, 4: average, 5: very slow
+(https://doi.org/10.1007/s10518-020-00899-9). 0 (default): unknown,
+1: average / slower, 2: average / faster, 3: fast, 4: average, 5: very slow
 type: site
 geology:
 dtype: ["CENOZOIC", "HOLOCENE", "JURASSIC-TRIASSIC", "CRETACEOUS", "PALEOZOIC",
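
For illustration, the registry syntax described in the header comment of this file could be applied roughly as in the sketch below. The column names (example_depth, example_flag, example_class) and all their values are hypothetical and are not taken from flatfile_metadata.yaml; only the keys (dtype, default, help, ">") come from the docstring shown in this diff.

```yaml
# Illustrative entries only: names and values are assumptions,
# not part of this commit or of the actual registry.
example_depth:
  dtype: float
  help: A hypothetical depth column, in km
  ">": 0                   # quoted key: values must be strictly greater than 0
example_flag:
  dtype: bool
  default: false           # bool cannot hold missing data in pandas, so a default is supplied
example_class:
  dtype: ["A", "B", "C"]   # categorical: the column can only take one of these values
  help: >-
    A hypothetical categorical column; all listed values share
    the same data type (str)
```

None of these entries sets a type, since the revised docstring requires it only when the column denotes an OpenQuake parameter or measure.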
