This work is published in the *American Journal of Pathology*.
```bibtex
@article{Gindra2024,
  author  = {Rushin H. Gindra and Yi Zheng and Emily J. Green and Mary E. Reid and Sarah A. Mazzilli and Daniel T. Merrick and Eric J. Burks and Vijaya B. Kolachalama and Jennifer E. Beane},
  title   = {Graph perceiver network for lung tumor and bronchial premalignant lesion stratification from histopathology},
  year    = {2024},
  doi     = {10.1016/j.ajpath.2024.03.009},
  url     = {https://doi.org/10.1016/j.ajpath.2024.03.009},
  journal = {American Journal of Pathology}
}
```
Key Ideas & Main Findings
We hypothesize that computational methods can help capture tissue heterogeneity in histology whole slide images (WSIs) and stratify premalignant lesions (PMLs) by histologic severity or by their ability to progress to invasive carcinoma, providing an informative pipeline for PML assessment. The Graph Perceiver Network is a generalized architecture that integrates a graph module with the perceiver architecture, combining sparse graph computations on visual tokens with the computational efficiency of the perceiver's latent bottleneck. The architecture significantly reduces the computational footprint compared with state-of-the-art WSI analysis architectures, allowing extremely large WSIs to be processed efficiently.
As a bonus, the architecture is explainable and can be trained with large batch sizes without extreme computational overhead, making it a suitable candidate for projects in academic research labs. Built on PyTorch and PyTorch Geometric.
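To make the idea concrete, here is a minimal sketch of how a graph module can be combined with a perceiver-style latent bottleneck. This is an illustrative toy model, not the released implementation: the class name, dimensions, and the choice of mean-aggregation message passing are assumptions. It shows the two key ingredients named above, sparse graph computation on patch tokens and cross-attention from a small latent array (so attention cost scales with the number of latents rather than quadratically with the number of patches).

```python
# Illustrative sketch (not the released implementation): one sparse
# message-passing step over patch tokens, then cross-attention from a small
# learned latent array onto those tokens, then a slide-level classifier.
import torch
import torch.nn as nn

class GraphPerceiverSketch(nn.Module):
    def __init__(self, feat_dim=768, latent_dim=256, n_latents=200, n_classes=2):
        super().__init__()
        self.proj = nn.Linear(feat_dim, latent_dim)
        self.latents = nn.Parameter(torch.randn(n_latents, latent_dim))
        self.cross_attn = nn.MultiheadAttention(latent_dim, num_heads=4, batch_first=True)
        self.head = nn.Linear(latent_dim, n_classes)

    def forward(self, x, edge_index):
        # x: (N, feat_dim) patch features; edge_index: (2, E) patch graph.
        h = self.proj(x)
        src, dst = edge_index
        # Sparse mean aggregation: average each node's neighbours into it.
        agg = torch.zeros_like(h).index_add_(0, dst, h[src])
        deg = torch.zeros(h.size(0), device=h.device).index_add_(
            0, dst, torch.ones(src.size(0), device=h.device)).clamp(min=1)
        h = h + agg / deg.unsqueeze(-1)
        # Cross-attention: cost is O(n_latents * N), not O(N^2) self-attention.
        q = self.latents.unsqueeze(0)               # (1, n_latents, latent_dim)
        out, _ = self.cross_attn(q, h.unsqueeze(0), h.unsqueeze(0))
        return self.head(out.mean(dim=1))           # (1, n_classes) slide logits
```

Because the latent array has a fixed size (e.g. 200 nodes), memory does not grow quadratically with slide size, which is what permits larger batch sizes on a single consumer GPU.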
- Pre-requisites and Installations
- Data Download and Preprocessing
- Data Tree Structure
- Pretrained Model Weights and Training Instructions
- Evaluation and Testing
- Explanatory Heatmaps
- Contact for Issues
- Acknowledgements, License, and Usage
Please follow this GitHub repository for updates.
- [ ] Remove dead code in repository
- [ ] Pre-requisites and installations (Conda env & Docker container)
- [ ] Add data download + preprocessing steps (Python file)
- [ ] Add data tree structure (for easy understanding)
- [ ] Add pretrained model weights + instructions for training & evaluation (Python files)
- [ ] Add code for K-NN evaluation (Jupyter notebook)
- [ ] Add code for visualization (Jupyter notebook)
- [ ] Explanatory heatmaps
- Contact for Issues
- Acknowledgements, License & Usage
Conda installation, and potentially a Docker container at some point.
Resections
- TCGA-[LUAD|LUSC]: To download tissue slide WSIs (formatted as .svs files) and the associated clinical metadata, please refer to the NIH Genomic Data Commons (GDC) Data Portal. WSIs for each cancer type can be downloaded using the GDC Data Transfer Tool.
- CPTAC-[LUAD|LSCC]: To download the WSIs (formatted as .svs files) from the discovery cohort and the associated clinical metadata, please refer to The Cancer Imaging Archive (TCIA) portal. WSIs from CPTAC can be downloaded using TCIA_Utils.
Biopsies
- UCL: Lung biopsy samples from University College London. To download the biopsy WSIs (formatted as .ndpi files) and the associated clinical metadata, please refer to the Image Data Resource repository, IDR 0082. WSIs from the repository can be downloaded using the Aspera protocol.
- Roswell: Lung biopsy samples from Roswell Park Comprehensive Cancer Center.
Example Directory

```
TCGA_ROOT_DIR/
├── TCGA_train.txt
├── TCGA_test.txt
├── TCGA_plot.txt
├── WSIs/
│   ├── slide_1.svs
│   ├── slide_2.svs
│   └── ...
├── ctranspath_pt_features/
│   ├── slide_1/
│   │   ├── adj_s_ei.pt
│   │   ├── adj_s.pt
│   │   ├── c_idx.txt
│   │   ├── edge_attr.pt
│   │   └── features.pt
│   └── slide_2/
│       └── ...
└── patches256/
    ├── slide_1/
    │   └── 20.0/
    │       ├── x_y.png
    │       └── ...
    └── slide_2/
        └── ...
CPTAC_ROOT_DIR/
├── CPTAC_test.txt
├── CPTAC_plot.txt
├── WSIs/
├── ctranspath_pt_features/
└── patches256/
    └── ...
```
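The per-slide files in the tree can be loaded straightforwardly with `torch.load`. The sketch below is an assumption about how the pieces fit together, based only on the file names in the tree; the commented tensor shapes are illustrative, not guaranteed by the repository.

```python
# Hedged sketch: load one slide's preprocessed graph from the layout above.
# File names come from the directory tree; tensor shapes are assumptions.
import os
import torch

def load_slide_graph(feature_dir, slide_id):
    slide_dir = os.path.join(feature_dir, slide_id)
    features = torch.load(os.path.join(slide_dir, "features.pt"))    # e.g. (N, d) patch embeddings
    edge_index = torch.load(os.path.join(slide_dir, "adj_s_ei.pt"))  # e.g. (2, E) patch adjacency
    edge_attr = torch.load(os.path.join(slide_dir, "edge_attr.pt"))  # e.g. (E, k) edge attributes
    return features, edge_index, edge_attr
```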
For preprocessing (patching, feature extraction, and graph construction), see `preprocessing/graph_construction.py`.
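The core of the graph-construction step can be sketched as connecting each patch to its spatial neighbours on the patch grid. This is an illustrative stand-in for the actual logic in `preprocessing/graph_construction.py` (the function name and the 8-neighbourhood choice are assumptions):

```python
# Illustrative sketch of spatial graph construction: connect each 256-px
# patch to its 8 grid neighbours (not the repository's exact algorithm).
import numpy as np

def grid_adjacency(coords, patch_size=256):
    # coords: (N, 2) array of patch top-left (x, y) pixel coordinates.
    grid = {(x // patch_size, y // patch_size): i for i, (x, y) in enumerate(coords)}
    edges = []
    for (gx, gy), i in grid.items():
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                if (dx, dy) == (0, 0):
                    continue  # no self-loops
                j = grid.get((gx + dx, gy + dy))
                if j is not None:
                    edges.append((i, j))
    return np.array(edges, dtype=np.int64).T  # (2, E) edge index
```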
You can train your model on a multi-centric dataset with the following k-fold cross-validation (k=5) scheme, where `--` marks the training folds, `**` the validation fold, and `##` the test fold:

```
[--|--|--|**|##]
[--|--|**|##|--]
[--|**|##|--|--]
[**|##|--|--|--]
[##|--|--|--|**]
```
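The rotation above can be generated programmatically: in each split, one fold is validation, the fold immediately to its right (wrapping around) is test, and the remaining three are training. A small sketch, with illustrative names:

```python
# Generate the rotating 5-fold splits shown above: the validation fold
# shifts left each split, the test fold is the one right after it (mod k).
def rotating_folds(k=5):
    splits = []
    for s in range(k):
        val = (k - 2 - s) % k          # split 0 -> fold 3, split 1 -> fold 2, ...
        test = (val + 1) % k           # fold right after validation, wrapping
        train = [f for f in range(k) if f not in (val, test)]
        splits.append({"train": train, "val": val, "test": test})
    return splits
```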
Models were trained for 30 epochs with a batch size of 8 using 5-fold cross-validation, and were evaluated per fold on an internal TCGA test set and on the CPTAC external test set.
GPU Hardware used for training: Nvidia GeForce RTX 2080ti - 11GB.
Note: Ideally, longer training with larger batch sizes would likely yield further gains in model performance.
- Links to download pretrained model weights.
| Arch | SSL Method | Dataset | Epochs | Cross-Attn-Nodes | Performance (Acc) | Download |
|---|---|---|---|---|---|---|
| Graph Perceiver Network | CTransPath | TCGA | 30 | 200 | N/A | N/A |
| Graph Perceiver Network | SimCLR-Lung | NLST-TMA | 30 | 200 | N/A | N/A |
- Instructions for training and evaluating the models (including Python files).
- TCGA (internal test set)
- CPTAC (external test set)
- K-NN evaluation: description of the K-NN evaluation process, with a link to (or embedded) Jupyter notebook for K-NN evaluation.
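Until the notebook is added, the K-NN evaluation idea can be sketched as follows. This is an assumption about the setup (labelling each test slide by majority vote over its k nearest training slides in cosine distance over slide-level embeddings); variable names are illustrative.

```python
# Hedged sketch of a K-NN evaluation on slide-level embeddings.
import numpy as np

def knn_accuracy(train_emb, train_labels, test_emb, test_labels, k=5):
    # L2-normalise so the dot product equals cosine similarity.
    tr = train_emb / np.linalg.norm(train_emb, axis=1, keepdims=True)
    te = test_emb / np.linalg.norm(test_emb, axis=1, keepdims=True)
    sims = te @ tr.T                             # (n_test, n_train)
    nn_idx = np.argsort(-sims, axis=1)[:, :k]    # k nearest training slides
    # Majority vote over the neighbours' labels.
    preds = np.array([np.bincount(train_labels[idx]).argmax() for idx in nn_idx])
    return float((preds == test_labels).mean())
```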
- Details about the visualization techniques used.
- Link or embedded Jupyter notebook for visualization.
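A common way to build such explanatory heatmaps is to scatter per-patch attention scores back onto the slide's patch grid and normalise for overlay on a thumbnail. The sketch below is an assumption about that step (function name, score source, and shapes are illustrative), not the repository's visualization code.

```python
# Illustrative sketch: map per-patch attention scores onto the WSI patch grid.
import numpy as np

def attention_heatmap(coords, scores, patch_size=256):
    # coords: (N, 2) patch (x, y) pixel coordinates; scores: (N,) attention.
    gx = coords[:, 0] // patch_size
    gy = coords[:, 1] // patch_size
    heat = np.zeros((gy.max() + 1, gx.max() + 1))
    heat[gy, gx] = scores
    # Normalise to [0, 1] for overlaying on the slide thumbnail.
    return (heat - heat.min()) / (heat.max() - heat.min() + 1e-8)
```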
- Please open new issue threads, or report urgent blockers directly to rushin.gindra@helmholtz-munich.de. Immediate responses to minor issues may not be available.
- Credits and acknowledgements.
- License information.
- Usage guidelines and restrictions.