
Scalability: 2000 Plasma samples #168

Draft
lucas-diedrich wants to merge 5 commits into main from hupo-psi-demo-1_plasma2000

lucas-diedrich (Collaborator) commented Feb 3, 2026

Demo for HUPO-PSI @vbrennsteiner


Summary: Here, we demonstrate the scalability of our pipeline by loading MS proteomics data at several feature levels (precursors > proteins > genes) from a DIANN .tsv PSM report of 2000 plasma samples into memory, using alphapepttools. Further investigation by @vbrennsteiner shows that I/O is significantly faster when the data is stored as serialized h5ad files rather than as .tsv.
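
For readers unfamiliar with the long-format report, here is a minimal sketch of what "loading at several feature levels" can look like. It uses plain pandas rather than the alphapepttools API, which is not shown in this PR description. The toy data, the `LEVELS` mapping, and the `to_matrix` helper are hypothetical, and the column names (`Run`, `Precursor.Id`, `Protein.Group`, `Genes`, `Precursor.Quantity`, `PG.MaxLFQ`, `Genes.MaxLFQ`) are assumed to match the report at hand.

```python
import io

import pandas as pd

# Toy stand-in for a long-format report: one row per (run, precursor).
# Column names are an assumption; adjust them to the actual report.
REPORT_TSV = (
    "Run\tPrecursor.Id\tProtein.Group\tGenes\tPrecursor.Quantity\tPG.MaxLFQ\tGenes.MaxLFQ\n"
    "s1\tPEPA2\tP02768\tALB\t100.0\t150.0\t150.0\n"
    "s1\tPEPB2\tP02768\tALB\t50.0\t150.0\t150.0\n"
    "s1\tPEPC3\tP01024\tC3\t20.0\t20.0\t20.0\n"
    "s2\tPEPA2\tP02768\tALB\t80.0\t120.0\t120.0\n"
    "s2\tPEPC3\tP01024\tC3\t30.0\t30.0\t30.0\n"
)

# Hypothetical mapping: feature level -> (feature id column, quantity column).
LEVELS = {
    "precursor": ("Precursor.Id", "Precursor.Quantity"),
    "protein": ("Protein.Group", "PG.MaxLFQ"),
    "gene": ("Genes", "Genes.MaxLFQ"),
}


def to_matrix(report: pd.DataFrame, level: str) -> pd.DataFrame:
    """Pivot the long-format report into a samples x features matrix."""
    feature_col, value_col = LEVELS[level]
    # Protein- and gene-level values repeat on every precursor row,
    # so keep a single row per (run, feature) before pivoting.
    unique = report.drop_duplicates(subset=["Run", feature_col])
    return unique.pivot(index="Run", columns=feature_col, values=value_col)


report = pd.read_csv(io.StringIO(REPORT_TSV), sep="\t")
matrices = {level: to_matrix(report, level) for level in LEVELS}

for level, matrix in matrices.items():
    print(level, matrix.shape)
```

Each resulting matrix could then be wrapped in an `anndata.AnnData` object and saved with `write_h5ad`, which is the serialized format compared against .tsv above. That step is left out here to keep the sketch free of extra dependencies.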

lucas-diedrich added the documentation label Feb 3, 2026

Labels

documentation Improvements or additions to documentation


2 participants