This document explains how to reproduce the HeiPorSPECTRAL dataset, including data generation, preprocessing (intermediates), and figure computation.
Please make sure that you have the htc package installed (it contains all required dependencies) and that it works on your machine (see the README in the repository root). Because the commands may take a while to complete, it is advisable to run them in a screen environment (e.g. screen).
The HeiPorSPECTRAL dataset is generated based on our (internal) full version of the tissue atlas. Please make sure that you set the path to the masks and studies dataset correctly and then run:
htc dataset_open_atlas --output-path /mnt/nvme_4tb/HeiPorSPECTRAL
This also generates all the intermediate files and uploads the zip archive.
The following steps need access to the new (or downloaded) dataset. Therefore, please adjust your environment variables (according to the README) so that no environment variables for the network drive are set and no other dataset is registered. This ensures that the scripts really only use the HeiPorSPECTRAL dataset. For example, you can use the following .env:
export PATH_Tivita_HeiPorSPECTRAL=/mnt/nvme_4tb/HeiPorSPECTRAL
export PATH_HTC_RESULTS=~/htc/results
# DKFZ internal only
export PATH_E130_Projekte=""
With these settings, the generated files will be stored in ~/htc/results/open_data.
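Before starting the longer steps, it may help to confirm that the environment matches the settings above. The following is a minimal sketch, not part of the htc package: `check_dataset_env` is a hypothetical helper, and it assumes that any other registered dataset would also use the `PATH_Tivita_` prefix.

```shell
# Illustrative sanity check for the environment described above.
# check_dataset_env is a hypothetical helper, not part of the htc package.
check_dataset_env() {
    status=0

    # The HeiPorSPECTRAL dataset path must be set and point to a directory.
    if [ -z "${PATH_Tivita_HeiPorSPECTRAL:-}" ]; then
        echo "PATH_Tivita_HeiPorSPECTRAL is not set" >&2
        status=1
    elif [ ! -d "${PATH_Tivita_HeiPorSPECTRAL}" ]; then
        echo "Not a directory: ${PATH_Tivita_HeiPorSPECTRAL}" >&2
        status=1
    fi

    # No other dataset should be registered. Assumption: other datasets
    # would also be exported with the PATH_Tivita_ prefix and a non-empty value.
    for name in $(env | grep '^PATH_Tivita_[A-Za-z0-9_]*=.' | cut -d= -f1); do
        if [ "${name}" != "PATH_Tivita_HeiPorSPECTRAL" ]; then
            echo "Unexpected dataset variable: ${name}" >&2
            status=1
        fi
    done

    # The network drive variable should be empty or unset.
    if [ -n "${PATH_E130_Projekte:-}" ]; then
        echo "PATH_E130_Projekte should be empty" >&2
        status=1
    fi

    return "${status}"
}
```

After sourcing the .env file, calling `check_dataset_env` returns a non-zero status and prints the offending variable if something is off, so it can be chained in front of a command, e.g. `check_dataset_env && htc label_profiles`.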
To generate the label profile images (similar to the profile images in the intermediates directory but with aggregated data) per image, simply run:
htc label_profiles
This will generate a PDF per label.
The PCA and UMAP figures of the paper can be generated by running the DataVisualizations.ipynb notebook:
jupyter nbconvert --to html --execute --stdout ~/htc/src/paper/NatureData2023/DataVisualizations.ipynb > /dev/null
This will create PDF and HTML files for all PCA and UMAP visualizations.
The colorchecker comparison between the Tivita camera and the spectrometer can be generated by running the TechnicalValidation.ipynb notebook:
jupyter nbconvert --to html --execute --stdout ~/htc/src/paper/NatureData2023/TechnicalValidation.ipynb > /dev/null
This will generate a PDF and an HTML file with the colorchecker figure.
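Since both notebooks are executed with the same nbconvert invocation, they can also be run in one go. The sketch below simply wraps the two commands shown above in a loop. `run_notebooks` and `NOTEBOOK_DIR` are hypothetical names introduced for illustration, and the notebook directory is taken from the commands above.

```shell
# Directory of the paper notebooks, as used in the commands above.
NOTEBOOK_DIR="${HOME}/htc/src/paper/NatureData2023"

# run_notebooks is a hypothetical helper: it executes both notebooks in turn
# and stops at the first failure.
run_notebooks() {
    for name in DataVisualizations TechnicalValidation; do
        notebook="${NOTEBOOK_DIR}/${name}.ipynb"
        echo "Executing ${notebook}"
        # Same invocation as above; the HTML written to stdout is discarded.
        jupyter nbconvert --to html --execute --stdout "${notebook}" > /dev/null || return 1
    done
}
```

Calling `run_notebooks` inside the screen session executes the notebooks sequentially, which avoids having to come back between the two runs.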
To generate the example GIF file which is shown in the README of the dataset, you can run:
htc readme_gif
This will generate a GIF and a PNG file.
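To check what the steps above produced, the results directory can be searched for the generated file types. This is a rough sketch: `list_outputs` and `RESULTS_DIR` are hypothetical names, and it assumes that all figures end up somewhere below the open_data directory mentioned earlier.

```shell
# Results directory as configured via PATH_HTC_RESULTS (falls back to the
# example value from the .env settings above).
RESULTS_DIR="${PATH_HTC_RESULTS:-${HOME}/htc/results}/open_data"

# list_outputs is a hypothetical helper: it prints all PDF, HTML, GIF and PNG
# files below the given directory in sorted order.
list_outputs() {
    find "$1" -type f \( -name '*.pdf' -o -name '*.html' -o -name '*.gif' -o -name '*.png' \) | sort
}
```

For example, `list_outputs "${RESULTS_DIR}"` prints one path per generated file.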