Cell classification by learning known phenotypes
$ pip install cellgrid
Create a schema file.

Cellgrid trains a set of machine learning models arranged in a hierarchy, so that cell populations are classified following the same hierarchy. The hierarchy is described by a schema, defined as a JSON list in which each element contains:
- name. Name of the model.
- parent. Name of the parent model.
- model_class_name. Name of the base model class. The following options are supported:
  - xgb
  - random-forest
  - linear-regression
- markers. The markers that are used for training the model.

For example:

    [
      {
        "name": "all-events",
        "parent": null,
        "model_class_name": "random-forest",
        "markers": ["Ce140Di", "Ir191Di"]
      },
      {
        "name": "cells",
        "parent": "all-events",
        "model_class_name": "xgb",
        "markers": ["CD45", "HLA-ABC", "CD57", "CD19", "CD5"]
      },
      {
        "name": "CD4T",
        "parent": "cells",
        "model_class_name": "xgb",
        "markers": ["CD5", "CD4", "CD8a", "CD31", "CD25", "CD3e", "CD7"]
      }
    ]

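Because every element names its parent, a schema can be sanity-checked before training. The sketch below is not part of cellgrid: `check_schema` and `SUPPORTED_MODELS` are hypothetical names introduced here, and the checks only reflect the rules described above (known parents, a supported model class, at least one marker).

```python
import json

# Hypothetical helper, not part of cellgrid: checks a schema list for
# structural problems before it is passed to GridSchema.from_json.
SUPPORTED_MODELS = {"xgb", "random-forest", "linear-regression"}

def check_schema(entries):
    """Return a dict mapping each model name to the names of its child models."""
    names = [entry["name"] for entry in entries]
    if len(names) != len(set(names)):
        raise ValueError("model names must be unique")
    children = {name: [] for name in names}
    for entry in entries:
        if entry["model_class_name"] not in SUPPORTED_MODELS:
            raise ValueError(f"unsupported model class: {entry['model_class_name']}")
        if not entry["markers"]:
            raise ValueError(f"{entry['name']} lists no markers")
        parent = entry["parent"]
        if parent is not None:
            if parent not in children:
                raise ValueError(f"unknown parent: {parent}")
            children[parent].append(entry["name"])
    return children

schema_text = """
[
  {"name": "all-events", "parent": null, "model_class_name": "random-forest",
   "markers": ["Ce140Di", "Ir191Di"]},
  {"name": "cells", "parent": "all-events", "model_class_name": "xgb",
   "markers": ["CD45", "HLA-ABC", "CD57", "CD19", "CD5"]},
  {"name": "CD4T", "parent": "cells", "model_class_name": "xgb",
   "markers": ["CD5", "CD4", "CD8a", "CD31", "CD25", "CD3e", "CD7"]}
]
"""
children = check_schema(json.loads(schema_text))
print(children)
```

For the example schema this reports that "all-events" has the child "cells", which in turn has the child "CD4T".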
Train a GridClassifier.

    from cellgrid.preprocessing import transform
    from cellgrid.ensemble import GridSchema, GridClassifier

    # load schema from the json file
    schema = GridSchema.from_json(path_to_schema)

    # transform the data
    x_train = transform(x_train)

    # train the classifier
    clf = GridClassifier(schema)
    clf.fit(x_train, y_train)

Score. Return the F1 score of every model.

    x_test = transform(x_test)
    clf.score(x_test, y_test)

Predict.

    x = transform(x)
    y = clf.predict(x)

Save and load.

    from cellgrid.ensemble import save_model, load_model

    save_model(clf, path)
    clf = load_model(path)

GridClassifier(schema)
- schema: See the schema definition above.
fit(x_train, y_train)
Train the classifier.
- x_train: The single cell dataset.
- y_train: The labels in a hierarchical structure. An example:
layer1 | layer2 | layer3 |
---|---|---|
cells | B | Naive B |
cells | B | IgD+ Memory B |
cells | CD4T | Central memory CD4T |
non-cells | | |
cells | CD4T | |
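To make the shape of these labels concrete, the sketch below derives, for each label, the labels found directly beneath it. It is an illustration only: the README does not state how cellgrid stores `y_train` or marks a missing deeper layer, so the tuple rows, the use of `None`, and the helper name `child_labels` are all assumptions.

```python
from collections import defaultdict

# Rows mirror the table above; None marks a layer with no label (an assumption).
rows = [
    ("cells", "B", "Naive B"),
    ("cells", "B", "IgD+ Memory B"),
    ("cells", "CD4T", "Central memory CD4T"),
    ("non-cells", None, None),
    ("cells", "CD4T", None),
]

def child_labels(rows):
    """Map each label to the sorted list of labels found directly beneath it."""
    tree = defaultdict(set)
    for row in rows:
        # Walk adjacent layer pairs and stop at the first missing layer.
        for parent, child in zip(row, row[1:]):
            if child is None:
                break
            tree[parent].add(child)
    return {parent: sorted(kids) for parent, kids in tree.items()}

labels = child_labels(rows)
print(labels)
```

Rows that stop early, such as "non-cells" or the unlabelled "CD4T" row, contribute nothing below their last labelled layer.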
predict(x)
Predict the hierarchical labels for dataset x.
score(x_test, y_test)
Return F1 scores of every model.
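For reference, the F1 score of a single label is the harmonic mean of precision and recall. The sketch below computes it in plain Python for one label at a time. The function name `f1_for_label` is hypothetical, and how cellgrid averages across the labels of one model is not stated in this README, so no averaging is shown.

```python
def f1_for_label(y_true, y_pred, positive):
    """F1 score for one label, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Made-up labels for illustration only.
y_true = ["B", "B", "CD4T", "CD4T", "B"]
y_pred = ["B", "CD4T", "CD4T", "CD4T", "B"]

for label in ("B", "CD4T"):
    print(label, round(f1_for_label(y_true, y_pred, label), 3))
```

In this toy data, "B" has perfect precision but misses one event, while "CD4T" has perfect recall but one false positive, so both labels score 0.8.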
GridSchema.from_json(filepath=None)
Load the schema from a JSON file. See the schema definition above.
MIT
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.