feat: expand default metadata fields, make values.yaml concise through template functions and yaml anchors (#1935)

- Add more metadata fields used by Pathoplexus to values.yaml.
- Configure a single abstract default organism in values.yaml that the concrete organisms reuse, which makes values.yaml shorter (only what differs between organisms needs to be spelled out). Inheritance is currently done via YAML anchors, aliases, and merge keys.
- Make values.yaml more compact by generating the previously separate a) ingest, b) preprocessing, and c) inputFields configs from schema.metadata.
- Use default values in values.yaml to make it more compact, e.g. string is the default type, so most metadata fields need one line fewer.
- Introduce "header: Other" as the default header so that no field is displayed without a header.

Without the compaction, we'd be looking at several thousand lines of YAML, which is not nice to work with (VS Code and vim already complain about the size).
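
The inheritance via anchors, aliases, and merge keys behaves like a shallow dictionary merge: keys stated by the concrete organism win, everything else comes from the default. A minimal Python sketch of that behavior, assuming hypothetical names and values (`default_organism`, `inherit`, and the example fields are illustrative and not taken from the actual values.yaml):

```python
# Hypothetical defaults shared by all organisms (stands in for a YAML anchor).
default_organism = {
    "schema": {"loadSequencesAutomatically": False},
    "ingest": {"configFile": {"log_level": "DEBUG"}},
}


def inherit(defaults, overrides):
    """Mimic a YAML merge key (`<<: *alias`): keys in `overrides` win.

    The merge is shallow: an overridden top-level key replaces the whole
    default value instead of being merged recursively.
    """
    return {**defaults, **overrides}


# Hypothetical concrete organism that only states what differs.
example_organism = inherit(
    default_organism, {"schema": {"instanceName": "Example virus"}}
)
```

Because the merge is shallow, `example_organism["schema"]` no longer contains `loadSequencesAutomatically`; nested defaults would need their own anchor to be inherited.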
corneliusroemer authored May 21, 2024
1 parent 698b836 commit 1b7497a
Showing 19 changed files with 1,205 additions and 1,472 deletions.
2 changes: 1 addition & 1 deletion deploy.py
@@ -207,7 +207,7 @@ def generate_configs(from_live=False):
generate_config(helm_chart, 'templates/loculus-website-config.yaml', runtime_config_path, codespace_name, from_live)

ingest_configmap_path = TEMP_DIR / 'config.yaml'
ingest_template_path = 'templates/loculus-ingest-config.yaml'
ingest_template_path = 'templates/ingest-config.yaml'
ingest_configout_path = TEMP_DIR / 'ingest-config.yaml'
generate_config(helm_chart, ingest_template_path, ingest_configmap_path, codespace_name, from_live, ingest_configout_path)

16 changes: 12 additions & 4 deletions docs/src/content/docs/guides/getting-started.md
@@ -25,14 +25,13 @@ Helm is a package manager for Kubernetes that simplifies the deployment and mana

To deploy Loculus, you'll need to have Helm installed. Helm will be used to manage the dependencies and deploy the Loculus application using the provided Helm chart.


## External Database

By default, the provided Helm chart will create temporary databases for testing and development purposes. These temporary databases are suitable for initial setup and experimentation.

However, for a production deployment, you must use a permanent database. We recommend using a managed database service like Amazon RDS, Google Cloud SQL, or DigitalOcean Managed Databases, or you can run your own database server, but you must not use the built in database for production.

To use an external database, you'll need to provide the necessary connection details, such as the database URL, username, and password.
To use an external database, you'll need to provide the necessary connection details, such as the database URL, username, and password.

These details are configured in the `secrets` section of the `values.yaml` file.

@@ -53,8 +52,8 @@ secrets:
password: "unsecure"
port: "5432"
```
You can also use sealed secrets, see the [Sealed Secrets](#sealed-secrets) section for more information.
You can also use sealed secrets, see the [Sealed Secrets](#sealed-secrets) section for more information.
## Clone the repository
@@ -107,7 +106,7 @@ organisms:
displayName: INSDC accession
customDisplay:
type: link
url: "https://www.ncbi.nlm.nih.gov/nuccore/{{value}}"
url: "https://www.ncbi.nlm.nih.gov/nuccore/__value__"
website:
tableColumns:
- country
@@ -142,9 +141,13 @@ Additionally, the `tableColumns` section defines which metadata fields are shown
You can add multiple organisms under the organisms section, each with its own unique configuration.

## Secrets

Our secrets configuration supports three types of secrets.

### `raw`

This is the simplest type of secret, it is just a key value pair.

```yaml
secrets:
database:
@@ -154,8 +157,11 @@ secrets:
username: "postgres"
password: "password"
```

### `sealedsecret`

This is a sealed secret, it is encrypted and can only be decrypted by the cluster.

```yaml
secrets:
database:
@@ -168,7 +174,9 @@ secrets:
```

### `autogen`

This is a secret that is automatically generated by the helm chart.

```yaml
secrets:
secretKey:
3 changes: 2 additions & 1 deletion ingest/Snakefile
@@ -149,9 +149,10 @@ rule get_previous_submissions:
hashes="results/previous_submissions.json",
params:
log_level=LOG_LEVEL,
sleep=config["post_start_sleep"],
shell:
"""
sleep 120 # Run only once keycloak is up and database wiped
sleep {params.sleep}
python scripts/call_loculus.py \
--mode get-submitted \
--config-file {input.config} \
2 changes: 2 additions & 0 deletions ingest/config/defaults.yaml
@@ -1,5 +1,6 @@
# Values here are defaults for the `config` variable in the Snakefile
# Purpose is to keep the `values.yaml` config file clean
post_start_sleep: 0
log_level: DEBUG
compound_country_field: ncbi_geo_location
fasta_id_field: genbank_accession
@@ -32,6 +33,7 @@ keep:
- ncbi_virus_name
- ncbi_virus_tax_id
- sequence_md5
- genbank_accession
all_fields:
- accession
- bioprojects
7 changes: 5 additions & 2 deletions ingest/scripts/prepare_metadata.py
@@ -100,7 +100,10 @@ def main(config_file: str, input: str, sequence_hashes: str, output: str, log_le

# Calculate overall hash of metadata + sequence
for record in metadata:
sequence_hash = sequence_hashes.get(record[config.rename[config.fasta_id_field]], "")
fasta_id_field = config.fasta_id_field
if config.fasta_id_field in config.rename:
fasta_id_field = config.rename[config.fasta_id_field]
sequence_hash = sequence_hashes.get(record[fasta_id_field], "")
if sequence_hash == "":
raise ValueError(f"No hash found for {record[config.fasta_id_field]}")

@@ -109,7 +112,7 @@ def main(config_file: str, input: str, sequence_hashes: str, output: str, log_le

record["hash"] = hashlib.md5(prehash.encode()).hexdigest()

meta_dict = {rec[config.rename[config.fasta_id_field]]: rec for rec in metadata}
meta_dict = {rec[fasta_id_field]: rec for rec in metadata}

Path(output).write_text(json.dumps(meta_dict, indent=4))
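
The change above makes the lookup of the FASTA id column tolerate a field that is not renamed. A minimal sketch of the same fallback, assuming `config.rename` is a plain dict as the `in` check in the hunk suggests (`resolve_fasta_id_field` and the sample field names are hypothetical):

```python
from types import SimpleNamespace


def resolve_fasta_id_field(config):
    """Return the post-rename name of the FASTA id column.

    Falls back to the original field name when no rename is configured.
    """
    return config.rename.get(config.fasta_id_field, config.fasta_id_field)


# Hypothetical configs, for illustration only.
renamed = SimpleNamespace(
    fasta_id_field="genbank_accession",
    rename={"genbank_accession": "insdc_accession_full"},
)
unrenamed = SimpleNamespace(fasta_id_field="genbank_accession", rename={})
```

A helper like this can be called once before iterating over the records, so the resolved name is available both inside the hashing loop and when building `meta_dict`.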

30 changes: 20 additions & 10 deletions kubernetes/loculus/templates/_common-metadata.tpl
@@ -65,9 +65,20 @@ fields:
type: string
notSearchable: true
header: Data Use Terms
customDisplay:
type: link
url: "__value__"
{{- end}}
{{- end}}

{{/* Patches schema by adding to it */}}
{{- define "loculus.patchMetadataSchema" -}}
{{- $patchedSchema := deepCopy . }}
{{- $toAdd := . | dig "metadataAdd" list -}}
{{- $patchedMetadata := concat .metadata $toAdd -}}
{{- set $patchedSchema "metadata" $patchedMetadata | toYaml -}}
{{- end -}}

{{/* Generate website config from passed config object */}}
{{- define "loculus.generateWebsiteConfig" }}
name: {{ quote $.Values.name }}
@@ -84,9 +95,8 @@ accessionPrefix: {{ quote $.Values.accessionPrefix }}
organisms:
{{- range $key, $instance := (.Values.organisms | default .Values.defaultOrganisms) }}
{{ $key }}:

schema:
{{- with $instance.schema }}
{{- with ($instance.schema | include "loculus.patchMetadataSchema" | fromYaml) }}
instanceName: {{ quote .instanceName }}
loadSequencesAutomatically: {{ .loadSequencesAutomatically | default false }}
{{ if .image }}
@@ -96,8 +106,7 @@ organisms:
description: {{ quote .description }}
{{ end }}
primaryKey: accessionVersion
inputFields:
{{ $instance.schema.inputFields | toYaml | nindent 8}}
inputFields: {{- include "loculus.inputFields" . | nindent 8 }}
metadata:
{{ $metadata := concat $commonMetadata .metadata
| include "loculus.generateWebsiteMetadata"
@@ -116,7 +125,7 @@ organisms:
fields:
{{- range . }}
- name: {{ quote .name }}
type: {{ quote .type }}
type: {{ .type | default "string" | quote }}
{{- if .autocomplete }}
autocomplete: {{ .autocomplete }}
{{- end }}
@@ -140,9 +149,7 @@ fields:
type: {{ quote .customDisplay.type }}
url: {{ .customDisplay.url }}
{{- end }}
{{- if .header }}
header: {{ .header }}
{{- end }}
header: {{ default "Other" .header }}
{{- end}}
{{- end}}

@@ -159,7 +166,10 @@ organisms:
{{- with $instance.schema }}
instanceName: {{ quote .instanceName }}
metadata:
{{ $metadata := include "loculus.generateBackendMetadata" .metadata | fromYaml }}
{{ $metadata := (include "loculus.patchMetadataSchema" .
| fromYaml).metadata
| include "loculus.generateBackendMetadata"
| fromYaml }}
{{ $metadata.fields | toYaml | nindent 8 }}
{{- end }}
referenceGenomes:
@@ -172,7 +182,7 @@ organisms:
fields:
{{- range . }}
- name: {{ quote .name }}
type: {{ quote .type }}
type: {{ .type | default "string" | quote }}
{{- if .required }}
required: {{ .required }}
{{- end }}
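
Two ideas in this file are easier to follow outside template syntax: `loculus.patchMetadataSchema` appends an optional `metadataAdd` list to `metadata` on a deep copy of the schema, and the website metadata template now falls back to `string` for a missing type and `Other` for a missing header. A rough Python equivalent, as a sketch only (the function names are hypothetical, and `or` stands in for Sprig's `default`, which also replaces empty values):

```python
import copy


def patch_metadata_schema(schema):
    """Return a deep copy of `schema` with `metadataAdd` appended to `metadata`."""
    patched = copy.deepcopy(schema)
    patched["metadata"] = schema["metadata"] + schema.get("metadataAdd", [])
    return patched


def with_website_defaults(field):
    """Apply the defaults used when rendering website metadata."""
    return {
        **field,
        "type": field.get("type") or "string",
        "header": field.get("header") or "Other",
    }
```

The input schema is left untouched, which matters because the same `$instance.schema` is patched separately in several templates.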
11 changes: 11 additions & 0 deletions kubernetes/loculus/templates/_ingestRenameFromValues.tpl
@@ -0,0 +1,11 @@
{{- define "loculus.ingestRename" -}}
{{- $metadata := . }}
{{- $ingestRename := dict }}
{{- range $field := $metadata }}
{{- if hasKey $field "ingest" }}
{{- $_ := set $ingestRename (index $field "ingest") (index $field "name") }}
{{- end }}
{{- end }}
{{- $output := dict "rename" $ingestRename }}
{{- toYaml $output }}
{{- end -}}
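
The `loculus.ingestRename` template builds the ingest `rename` mapping from every metadata field that carries an `ingest` key, mapping the upstream column name to the Loculus field name. A Python sketch of the same logic, with a hypothetical function name:

```python
def ingest_rename(metadata):
    """Map each field's `ingest` source column to its `name`.

    Fields without an `ingest` key are skipped, as in the template.
    """
    rename = {
        field["ingest"]: field["name"] for field in metadata if "ingest" in field
    }
    return {"rename": rename}
```

As in the template, two fields pointing at the same `ingest` column would overwrite each other, with the later field winning.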
48 changes: 48 additions & 0 deletions kubernetes/loculus/templates/_inputFieldsFromValues.tpl
@@ -0,0 +1,48 @@
{{- define "loculus.inputFields" -}}
{{- $data := . }}
{{- $metadata := $data.metadata }}
{{- $extraFields := $data.extraInputFields }}
{{- $TO_KEEP := list "name" "displayName" "definition" "guidance" "example" "required" }}


{{- $fieldsDict := dict }}
{{- $index := 0 }}

{{- /* Add fields with position "first" to the dict */}}
{{- range $field := $extraFields }}
{{- if eq $field.position "first" }}
{{- $_ := set $fieldsDict (printf "%03d" $index) $field }}
{{- $index = add $index 1 }}
{{- end }}
{{- end }}

{{- /* Add filtered metadata fields to the dict */}}
{{- range $field := $metadata }}
{{- if not (hasKey $field "noInput") }}
{{- $_ := set $fieldsDict (printf "%03d" $index) $field }}
{{- $index = add $index 1 }}
{{- end }}
{{- end }}

{{- /* Add fields with position "last" to the dict */}}
{{- range $field := $extraFields }}
{{- if eq $field.position "last" }}
{{- $_ := set $fieldsDict (printf "%03d" $index) $field }}
{{- $index = add $index 1 }}
{{- end }}
{{- end }}

{{- /* Iterate over sorted index to get list of values (sorted by key) */}}
{{- $inputFields := list }}
{{- range $k:= keys $fieldsDict | sortAlpha }}
{{- $toAdd := dict }}
{{- range $k, $v := (index $fieldsDict $k) }}
{{- if has $k $TO_KEEP }}
{{- $_ := set $toAdd $k $v }}
{{- end }}
{{- end }}
{{- $inputFields = append $inputFields $toAdd }}
{{- end }}

{{- toYaml $inputFields }}
{{- end -}}
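
The `loculus.inputFields` template orders fields as: extra fields with position `first`, then all metadata fields without a `noInput` key, then extra fields with position `last`, and finally keeps only a fixed set of keys per field. A Python sketch of that ordering and filtering (the function name is hypothetical; the key list is copied from `$TO_KEEP`):

```python
TO_KEEP = ("name", "displayName", "definition", "guidance", "example", "required")


def input_fields(schema):
    """Build the ordered input field list from a schema dict."""
    metadata = schema.get("metadata") or []
    extra = schema.get("extraInputFields") or []
    first = [f for f in extra if f.get("position") == "first"]
    # The template excludes a field whenever the key exists, whatever its value.
    middle = [f for f in metadata if "noInput" not in f]
    last = [f for f in extra if f.get("position") == "last"]
    return [
        {key: value for key, value in field.items() if key in TO_KEEP}
        for field in first + middle + last
    ]
```

The template keeps the order by storing fields under zero-padded `%03d` keys and sorting them alphabetically, which preserves insertion order for up to 1000 fields; a Python list needs no such workaround.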
39 changes: 39 additions & 0 deletions kubernetes/loculus/templates/_preprocessingFromValues.tpl
@@ -0,0 +1,39 @@
{{- define "loculus.preprocessingSpecs" -}}
{{- $metadata := . }}
{{- $specs := dict }}

{{- range $field := $metadata }}
{{- $name := index $field "name" }}
{{- $spec := dict "function" "identity" "inputs" (dict "input" $name) }}

{{- if hasKey $field "type" }}
{{- $type := index $field "type" }}
{{- if eq $type "int" }}
{{- $_ := set $spec "args" (dict "type" "int") }}
{{- else if eq $type "float" }}
{{- $_ := set $spec "args" (dict "type" "float") }}
{{- end }}
{{- end }}

{{- if hasKey $field "preprocessing" }}
{{- $preprocessing := index $field "preprocessing" }}
{{- if eq (typeOf $preprocessing) "string" }}
{{- $_ := set $spec "inputs" (dict "input" $preprocessing) }}
{{- else }}
{{- if hasKey $preprocessing "function" }}
{{- $_ := set $spec "function" (index $preprocessing "function") }}
{{- end }}
{{- if hasKey $preprocessing "args" }}
{{- $_ := set $spec "args" (index $preprocessing "args") }}
{{- end }}
{{- if hasKey $preprocessing "inputs" }}
{{- $_ := set $spec "inputs" (index $preprocessing "inputs") }}
{{- end }}
{{- end }}
{{- end }}

{{- $_ := set $specs $name $spec }}
{{- end }}

{{- toYaml $specs }}
{{- end -}}
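
The `loculus.preprocessingSpecs` template derives one preprocessing spec per metadata field: the default is an `identity` function on the field's own name, `int` and `float` types add a type argument, a string-valued `preprocessing` overrides only the input column, and a mapping-valued `preprocessing` may override `function`, `args`, and `inputs`. A Python sketch of those rules, with a hypothetical function name:

```python
def preprocessing_specs(metadata):
    """Derive preprocessing specs keyed by field name."""
    specs = {}
    for field in metadata:
        name = field["name"]
        spec = {"function": "identity", "inputs": {"input": name}}
        if field.get("type") in ("int", "float"):
            spec["args"] = {"type": field["type"]}
        preprocessing = field.get("preprocessing")
        if isinstance(preprocessing, str):
            # Shorthand: only the source column differs from the field name.
            spec["inputs"] = {"input": preprocessing}
        elif preprocessing is not None:
            for key in ("function", "args", "inputs"):
                if key in preprocessing:
                    spec[key] = preprocessing[key]
        specs[name] = spec
    return specs
```

Explicit `args` in a mapping-valued `preprocessing` replace the type-derived `args` entirely rather than being merged with them.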
20 changes: 20 additions & 0 deletions kubernetes/loculus/templates/ingest-config.yaml
@@ -0,0 +1,20 @@
{{- $testconfig := .Values.testconfig | default false }}
{{- $backendHost := .Values.environment | eq "server" | ternary (printf "https://backend-%s" $.Values.host) ($testconfig | ternary "http://localhost:8079" "http://loculus-backend-service:8079") }}
{{- $keycloakHost := .Values.environment | eq "server" | ternary (printf "https://authentication-%s" $.Values.host) ($testconfig | ternary "http://localhost:8083" "http://loculus-keycloak-service:8083") }}
{{- range $key, $values := (.Values.organisms | default .Values.defaultOrganisms) }}
{{- if $values.ingest }}
{{- $metadata := (include "loculus.patchMetadataSchema" $values.schema | fromYaml).metadata }}
---
apiVersion: v1
kind: ConfigMap
metadata:
name: loculus-ingest-config-{{ $key }}
data:
config.yaml: |
{{- $values.ingest.configFile | toYaml | nindent 4 }}
organism: {{ $key }}
backend_url: {{ $backendHost }}
keycloak_token_url: {{ $keycloakHost -}}/realms/loculus/protocol/openid-connect/token
{{- include "loculus.ingestRename" $metadata | nindent 4 }}
{{- end }}
{{- end }}
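
The nested `ternary` calls at the top of this template choose the backend and Keycloak hosts from `environment` and `testconfig`. Spelled out in Python as a sketch (the function names are hypothetical; the URLs and ports are those in the template):

```python
def backend_url(environment, host, testconfig=False):
    """Public URL on a server deployment, otherwise a local or in-cluster URL."""
    if environment == "server":
        return f"https://backend-{host}"
    if testconfig:
        return "http://localhost:8079"
    return "http://loculus-backend-service:8079"


def keycloak_token_url(environment, host, testconfig=False):
    """Token endpoint of the `loculus` realm on the selected Keycloak host."""
    if environment == "server":
        base = f"https://authentication-{host}"
    elif testconfig:
        base = "http://localhost:8083"
    else:
        base = "http://loculus-keycloak-service:8083"
    return f"{base}/realms/loculus/protocol/openid-connect/token"
```

Note that `testconfig` only has an effect when `environment` is not `server`.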
8 changes: 5 additions & 3 deletions kubernetes/loculus/templates/lapis-silo-database-config.yaml
@@ -9,15 +9,16 @@ kind: ConfigMap
metadata:
name: lapis-silo-database-config-{{ $key }}
data:
{{- with $instance.schema }}
{{- with ($instance.schema | include "loculus.patchMetadataSchema" | fromYaml) }}
database_config.yaml: |
schema:
instanceName: {{ .instanceName }}
opennessLevel: OPEN
metadata:
{{- range (concat $commonMetadata .metadata) }}
- name: {{ .name }}
type: {{ (.type | eq "timestamp") | ternary "int" ((.type | eq "authors") | ternary "string" .type) }}
{{- $type := default "string" .type }}
type: {{ ($type | eq "timestamp") | ternary "int" (($type | eq "authors") | ternary "string" $type) }}
{{- if .generateIndex }}
generateIndex: {{ .generateIndex }}
{{- end }}
@@ -46,5 +47,6 @@ data:
{{ range $importScriptWrapperLines }}
{{ . }}{{ end }}
pangolineage_alias.json: "{}"
pangolineage_alias.json: |
{{ $instance.pangolineage_alias | default dict | toJson }}
{{- end }}
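
The type expression in this template first applies the new `string` default and then maps Loculus types onto the types written to the SILO database config: `timestamp` becomes `int`, `authors` becomes `string`, and everything else passes through. A Python sketch of that mapping, with a hypothetical function name:

```python
def silo_type(field):
    """Return the type written to database_config.yaml for a metadata field."""
    declared = field.get("type") or "string"
    return {"timestamp": "int", "authors": "string"}.get(declared, declared)
```

Introducing `$type` before the comparison means the default is applied once and both `ternary` branches see the same value.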
23 changes: 0 additions & 23 deletions kubernetes/loculus/templates/loculus-ingest-config.yaml

This file was deleted.
