This library processes and transforms experimental data, with a focus on normalizing and analyzing reagent readouts from cytometry by time of flight (CyTOF).
It is intended to assist in the design of immune assay panels by providing a structured way to find the most informative combinations of cell types, stimuli, and reagent readouts: those that show a robust response, wide variance across a patient population, and low correlation with the other selected combinations.
filter_by_group(df, by_filter_columns)
: Filter dataframe rows matching specified column values
filter_by_group_negate(df, by_filter_columns)
: Filter dataframe rows NOT matching specified column values
filter_data(df, initial_filters)
: Filter data and remove NaN values
remove_outliers(df, by_grouping_columns, num_std_dev)
: Remove outliers based on standard deviation within groups
normalize_by_basal(df, basal_filters, normalization_join)
: Normalize values by subtracting baseline measurements
group_by_and_agg(df, group_by)
: Group data and calculate median and variance statistics
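The filtering and outlier-removal steps described above can be sketched in plain pandas. This is a minimal illustration, not the library's implementation: the helper names `filter_rows` and `drop_group_outliers`, the dictionary form of the filters, and the column names `cell_type`, `stimulus` and `value` are all assumptions made for the example.

```python
import pandas as pd


def filter_rows(df: pd.DataFrame, filters: dict, negate: bool = False) -> pd.DataFrame:
    """Keep rows where every column in `filters` equals its value (hypothetical helper).

    With negate=True, keep the complement instead.
    """
    mask = pd.Series(True, index=df.index)
    for column, value in filters.items():
        mask &= df[column] == value
    return df[~mask] if negate else df[mask]


def drop_group_outliers(df: pd.DataFrame, group_cols: list, value_col: str,
                        num_std_dev: float) -> pd.DataFrame:
    """Drop rows farther than num_std_dev standard deviations from their group mean."""
    grouped = df.groupby(group_cols)[value_col]
    mean = grouped.transform("mean")
    std = grouped.transform("std")
    # Single-row groups have an undefined (NaN) standard deviation; keep them.
    keep = std.isna() | ((df[value_col] - mean).abs() <= num_std_dev * std)
    return df[keep]


# Assumed toy data: eight T cell readouts (one extreme) and two B cell readouts.
readouts = pd.DataFrame({
    "cell_type": ["T cell"] * 8 + ["B cell"] * 2,
    "stimulus": ["IFNa"] * 8 + ["basal", "IFNa"],
    "value": [1.0, 1.1, 0.9, 1.0, 1.0, 1.0, 1.0, 10.0, 0.5, 0.7],
})

t_cells = filter_rows(readouts, {"cell_type": "T cell"})
not_t_cells = filter_rows(readouts, {"cell_type": "T cell"}, negate=True)
cleaned = drop_group_outliers(readouts, ["cell_type", "stimulus"], "value", num_std_dev=2)
```

Note that in a small group a single extreme value inflates the group's own standard deviation, so a loose threshold may fail to flag it. The threshold is worth choosing with the typical group size in mind.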
response_and_variance_transform() combines the above functions into a complete pipeline:
- Filters initial data
- Normalizes against baseline measurements
- Removes outliers
- Calculates group statistics
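The four pipeline steps listed above can be sketched end to end as follows. This is an illustrative sketch under stated assumptions, not the library's code: the function name `response_and_variance_sketch`, its parameters, the default threshold of 3 standard deviations, and the column names are hypothetical, and it assumes exactly one basal row per join key.

```python
import pandas as pd


def response_and_variance_sketch(df, basal_filters, join_cols, group_cols,
                                 value_col="value", num_std_dev=3.0):
    """Hypothetical end-to-end sketch: filter, normalize, remove outliers, aggregate."""
    # 1. Filter: drop rows with a missing readout.
    data = df.dropna(subset=[value_col])

    # 2. Normalize: subtract the matching basal readout, joined on join_cols.
    is_basal = pd.Series(True, index=data.index)
    for column, value in basal_filters.items():
        is_basal &= data[column] == value
    basal = data.loc[is_basal, join_cols + [value_col]].rename(
        columns={value_col: "basal_value"}
    )
    stimulated = data[~is_basal].merge(basal, on=join_cols, how="inner")
    stimulated["response"] = stimulated[value_col] - stimulated["basal_value"]

    # 3. Remove outliers: drop responses far from their group mean.
    grouped = stimulated.groupby(group_cols)["response"]
    mean, std = grouped.transform("mean"), grouped.transform("std")
    keep = std.isna() | ((stimulated["response"] - mean).abs() <= num_std_dev * std)
    stimulated = stimulated[keep]

    # 4. Aggregate: median and (sample) variance of the response per group.
    return (
        stimulated.groupby(group_cols)["response"]
        .agg(median="median", variance="var")
        .reset_index()
    )


# Assumed toy data: four patients, one cell type, basal vs. IFNa stimulation.
raw = pd.DataFrame({
    "patient": ["p1", "p1", "p2", "p2", "p3", "p3", "p4", "p4"],
    "cell_type": ["T cell"] * 8,
    "stimulus": ["basal", "IFNa"] * 4,
    "value": [1.0, 3.0, 2.0, 5.0, 1.5, 5.5, 1.0, float("nan")],
})

summary = response_and_variance_sketch(
    raw,
    basal_filters={"stimulus": "basal"},
    join_cols=["patient", "cell_type"],
    group_cols=["cell_type", "stimulus"],
)
print(summary)
```

Here patient p4 drops out because its stimulated readout is missing, and the remaining basal-subtracted responses are summarized per cell type and stimulus. A high median with a high variance across patients is the kind of combination the introduction describes as informative.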
You can run the pytest test suite with the following command:
make test
You can build a Docker image with the following command:
make docker-build
You can pull the latest image from Docker Hub with the following command:
docker pull ludflu/i3h-response-and-variance
This work came out of the Immune Atlas Hackathon Team at the Immune Health Hackathon 2025. Sponsored by:
- The Colton Consortium
- The Institute for Immunology and Immune Health (I3H)
- Penn Institute for Biomedical Informatics
Team members:
- Seljuq Haider
- Kelvin Koser
- Jen Shi
- Jim Snavely
- Kevin Wang
- Charles Zheng