From 9363697aa3aa7e3c002805507f8fd67c9e1a7cde Mon Sep 17 00:00:00 2001 From: Stef Piatek Date: Wed, 13 Dec 2023 09:36:29 +0000 Subject: [PATCH] Set orthanc raw maximum storage in Dockerfile (#179) * Set orthanc raw maximum storage in Dockerfile * Orthanc raw defaults to no limit on storage Saves us from faffing around with build args for other tests * Format README * Document `ORTHANC_RAW_MAXIMUM_STORAGE_SIZE` * Add simple test for `ORTHANC_RAW_MAXIMUM_STORAGE_SIZE` --------- Co-authored-by: Milan Malfait --- .env.sample | 2 +- README.md | 55 +++++++++++++++---- docker-compose.yml | 2 +- docker/orthanc-raw/Dockerfile | 7 ++- orthanc/orthanc-raw/config/orthanc.json | 2 +- test/run-system-test.sh | 1 + .../check_max_storage_in_orthanc_raw.sh | 21 +++++++ 7 files changed, 75 insertions(+), 15 deletions(-) create mode 100755 test/scripts/check_max_storage_in_orthanc_raw.sh diff --git a/.env.sample b/.env.sample index d190fba6f..f1f60dcae 100644 --- a/.env.sample +++ b/.env.sample @@ -39,7 +39,7 @@ ORTHANC_RAW_USERNAME= ORTHANC_RAW_PASSWORD= ORTHANC_RAW_AE_TITLE= ORTHANC_AUTOROUTE_RAW_TO_ANON=true -ORTHANC_RAW_MAXIMUM_STORAGE_SIZE= +ORTHANC_RAW_MAXIMUM_STORAGE_SIZE= // MB # PIXL Orthanc anon instance ORTHANC_ANON_USERNAME= diff --git a/README.md b/README.md index 0dd1a786d..537477ba3 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,5 @@ # PIXL + PIXL Image eXtraction Laboratory `PIXL` is a system for extracting, linking and de-identifying DICOM imaging data, structured EHR data and free-text data from radiology reports at UCLH. @@ -8,62 +9,92 @@ PIXL is intended run on one of the [GAE](https://github.com/UCLH-Foundry/Book-of several services orchestrated by [Docker Compose](https://docs.docker.com/compose/). ## Services + ### [PIXL CLI](./cli/README.md) + Primary interface to the PIXL system. + ### [Hasher API](./hasher/README.md) + HTTP API to securely hash an identifier using a key stored in Azure Key Vault. 
+ ### [Orthanc Raw](./orthanc/orthanc-raw/README.md) + A DICOM node which receives images from the upstream hospital systems and acts as cache for PIXL. + ### [Orthanc Anon](./orthanc/orthanc-anon/README.md) + A DICOM node which wraps our de-identifcation and cloud transfer components. + ### PostgreSQL + RDBMS which stores DICOM metadata, application data and anonymised patient record data. + ### [Electronic Health Record Extractor](./pixl_ehr/README.md) -HTTP API to process messages from the `ehr` queue and populate raw and anon tables in the PIXL postgres instance. + +HTTP API to process messages from the `ehr` queue and populate raw and anon tables in the PIXL postgres instance. + ### [PACS Image Extractor](./pixl_pacs/README.md) -HTTP API to process messages from the `pacs` queue and populate the raw orthanc instance with images from PACS/VNA. + +HTTP API to process messages from the `pacs` queue and populate the raw orthanc instance with images from PACS/VNA. ## Setup ### 0. Choose deployment environment + This is one of dev|test|staging|prod and referred to as `` in the docs. ### 1. Initialise environment configuration + Create a local `.env` and `pixl_config.yml` file in the _PIXL_ directory: + ```bash cp .env.sample .env && cp pixl_config.yml.sample pixl_config.yml ``` + Add the missing configuration values to the new files: #### Environment + Set `ENV` to ``. #### Credentials + - `EMAP_DB_`* UDS credentials are only required for `prod` or `staging` deployments of when working on the EHR & report retriever component. -You can leave them blank for other dev work. +You can leave them blank for other dev work. - `PIXL_DB_`* -These are credentials for the containerised PostgreSQL service and are set in the official PostgreSQL image. +These are credentials for the containerised PostgreSQL service and are set in the official PostgreSQL image. 
Use a strong password for `prod` deployment but the only requirement for other environments is consistency as several services interact with the database. - `PIXL_EHR_API_AZ_`* These credentials are used for uploading a PIXL database to Azure blob storage. They should be for a service principal that has `Storage Blob Data Contributor` on the target storage account. The storage account must also allow network access from the PIXL host machine. #### Ports + Most services need to expose ports that must be mapped to ports on the host. The host port is specified in `.env` Ports need to be configured such that they don't clash with any other application running on that GAE. +#### Storage size + +The maximum storage size of the `orthanc-raw` instance can be configured through the `ORTHANC_RAW_MAXIMUM_STORAGE_SIZE` +environment variable in `.env`. This limits the storage size to the specified value (in MB). When the storage is full +[Orthanc will automatically recycle older studies in favour of new ones](https://orthanc.uclouvain.be/book/faq/features.html#id8). ## Run ### Start + From the _PIXL_ directory: + ```bash bin/pixldc pixl_dev up ``` ### Stop + From the _PIXL_ directory: + ```bash bin/pixldc pixl_dev down ``` @@ -71,29 +102,31 @@ bin/pixldc pixl_dev down ## Analysis The number of DICOM instances in the raw Orthanc instance can be accessed from -`http://:/ui/app/#/settings` and similarly with +`http://:/ui/app/#/settings` and similarly with the Orthanc Anon instance, where `pixl_host` is the host of the PIXL services and `ORTHANC_RAW_WEB_PORT` is defined in `.env`. -The number of reports and EHR can be interrogated by connecting to the PIXL -database with a database client (e.g. [DBeaver](https://dbeaver.io/)), using -the connection parameters defined in `.env`. For example, to find the number of +The number of reports and EHR can be interrogated by connecting to the PIXL +database with a database client (e.g. 
[DBeaver](https://dbeaver.io/)), using +the connection parameters defined in `.env`. For example, to find the number of non-null reports ```sql select count(*) from emap_data.ehr_anon where xray_report is not null; ``` - ## Develop -See each service's README for instructions for individual developing and testing instructions. + +See each service's README for instructions for individual developing and testing instructions. For Python development we use [isort](https://github.com/PyCQA/isort) and [black](https://black.readthedocs.io/en/stable/index.html) alongside [pytest](https://www.pytest.org/). There is support (sometimes through plugins) for these tools in most IDEs & editors. Before raising a PR, **run the full test suite** from the _PIXL_ directory with + ```bash bin/run-all-tests.sh ``` -and not just the component you have been working on as this will help us catch unintentional regressions without spending GH actions minutes :-) + +and not just the component you have been working on as this will help us catch unintentional regressions without spending GH actions minutes :-) We run [pre-commit](https://pre-commit.com/) as part of the GitHub Actions CI. 
To install and run it locally, do: diff --git a/docker-compose.yml b/docker-compose.yml index 57ea53e4f..e1d27adae 100644 --- a/docker-compose.yml +++ b/docker-compose.yml @@ -157,6 +157,7 @@ services: dockerfile: ./docker/orthanc-raw/Dockerfile args: <<: *build-args-common + ORTHANC_RAW_MAXIMUM_STORAGE_SIZE: ${ORTHANC_RAW_MAXIMUM_STORAGE_SIZE} command: /run/secrets environment: <<: [*pixl-db, *proxy-common, *pixl-common-env] @@ -182,7 +183,6 @@ services: - type: volume source: orthanc-raw-data target: /var/lib/orthanc/db - - ${PWD}/orthanc/orthanc-raw/config:/run/secrets:ro networks: - pixl-net depends_on: diff --git a/docker/orthanc-raw/Dockerfile b/docker/orthanc-raw/Dockerfile index 25b9d7c25..e8ad61be2 100644 --- a/docker/orthanc-raw/Dockerfile +++ b/docker/orthanc-raw/Dockerfile @@ -14,4 +14,9 @@ FROM osimis/orthanc:22.9.0-full-stable SHELL ["/bin/bash", "-o", "pipefail", "-e", "-u", "-x", "-c"] -COPY ./orthanc/orthanc-raw/plugin/pixl.py /etc/orthanc/pixl.py \ No newline at end of file +ARG ORTHANC_RAW_MAXIMUM_STORAGE_SIZE + +COPY ./orthanc/orthanc-raw/plugin/pixl.py /etc/orthanc/pixl.py +# Orthanc can't substitute environment variables as integers so copy and replace before running +COPY ./orthanc/orthanc-raw/config /run/secrets +RUN sed -i "s/\${ORTHANC_RAW_MAXIMUM_STORAGE_SIZE}/${ORTHANC_RAW_MAXIMUM_STORAGE_SIZE:-0}/g" /run/secrets/orthanc.json \ No newline at end of file diff --git a/orthanc/orthanc-raw/config/orthanc.json b/orthanc/orthanc-raw/config/orthanc.json index 1fa19f7e5..ab6d86585 100644 --- a/orthanc/orthanc-raw/config/orthanc.json +++ b/orthanc/orthanc-raw/config/orthanc.json @@ -12,7 +12,7 @@ // Limit the maximum storage size "MaximumPatientCount" : 0, // no limit - "MaximumStorageSize" : ${ORTHANC_RAW_MAXIMUM_STORAGE_SIZE}, // MB + "MaximumStorageSize" : ${ORTHANC_RAW_MAXIMUM_STORAGE_SIZE}, // MB, replaced in Dockerfile because it's an integer "MaximumStorageMode" : "Recycle", diff --git a/test/run-system-test.sh b/test/run-system-test.sh 
index 375846c0b..75e7b44ae 100755 --- a/test/run-system-test.sh +++ b/test/run-system-test.sh @@ -31,6 +31,7 @@ pixl start sleep 65 # need to wait until the DICOM image is "stable" = 60s ./scripts/check_entry_in_pixl_anon.sh ./scripts/check_entry_in_orthanc_anon.sh +./scripts/check_max_storage_in_orthanc_raw.sh cd "${PACKAGE_DIR}" docker compose -f docker-compose.yml -f ../docker-compose.yml -p test down diff --git a/test/scripts/check_max_storage_in_orthanc_raw.sh b/test/scripts/check_max_storage_in_orthanc_raw.sh new file mode 100755 index 000000000..abd01a5f9 --- /dev/null +++ b/test/scripts/check_max_storage_in_orthanc_raw.sh @@ -0,0 +1,21 @@ +#!/bin/bash +# Copyright (c) University College London Hospitals NHS Foundation Trust +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +set -euxo pipefail + +# This could be much improved by having more realistic test data some of +# which actually was persisted +source ./.env.test +docker logs test-orthanc-raw-1 2>&1 | grep "At most ${ORTHANC_RAW_MAXIMUM_STORAGE_SIZE}MB will be used for the storage area" +
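
For reviewers who want to check the placeholder substitution without building the image, the Dockerfile's `sed` step can be sketched as a standalone script. This is only an illustration: `render_max_storage` is a hypothetical helper that is not part of the patch, and it writes to stdout rather than editing `/run/secrets/orthanc.json` in place. The `sed` expression and the `:-0` default are the same as in the `RUN` line.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper (not in the patch): apply the same substitution as the
# Dockerfile's RUN sed line to a template string and print the result.
#   $1 = template text containing the literal ${ORTHANC_RAW_MAXIMUM_STORAGE_SIZE}
#   $2 = storage size in MB (optional; unset or empty falls back to 0)
render_max_storage() {
    local template="$1"
    local size="${2:-}"
    # \$ keeps the dollar sign literal, so sed matches the placeholder text;
    # ${size:-0} is expanded by bash before sed runs.
    printf '%s\n' "${template}" \
        | sed "s/\${ORTHANC_RAW_MAXIMUM_STORAGE_SIZE}/${size:-0}/g"
}

# Single quotes stop bash from expanding the placeholder in the template.
template='"MaximumStorageSize" : ${ORTHANC_RAW_MAXIMUM_STORAGE_SIZE}, // MB'

render_max_storage "${template}" 500   # → "MaximumStorageSize" : 500, // MB
render_max_storage "${template}" ""    # → "MaximumStorageSize" : 0, // MB
```

With an empty or unset value the rendered config contains `0`, which matches the commit message's "defaults to no limit on storage".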