CellGrid

Cell classification by learning known phenotypes

Install

$ pip install cellgrid

Get Started

Create a schema file.
CellGrid trains a set of machine learning models arranged in a hierarchy and classifies cell populations by following that same hierarchy. The hierarchy is described by a schema: a JSON file containing a list, in which each element has the following fields:

name. Name of the model.
parent. Name of the parent model, or null for the root model.
model_class_name. Name of the base model class. The following options are supported:
- xgb
- random-forest
- linear-regression
markers. The markers that are used for training the model.
For example:

[
    {
        "name": "all-events",
        "parent": null,
        "model_class_name": "random-forest",
        "markers": [
            "Ce140Di",
            "Ir191Di"
        ]
    },
    {
        "name": "cells",
        "parent": "all-events",
        "model_class_name": "xgb",
        "markers": [
            "CD45",
            "HLA-ABC",
            "CD57",
            "CD19",
            "CD5"
        ]
    },
    {
        "name": "CD4T",
        "parent": "cells",
        "model_class_name": "xgb",
        "markers": [
            "CD5",
            "CD4",
            "CD8a",
            "CD31",
            "CD25",
            "CD3e",
            "CD7"
        ]
    }
]
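
Before training, it can help to sanity-check a schema file. The following is a minimal sketch and not part of CellGrid: the `validate_schema` helper and its error messages are hypothetical, and it checks only the four fields and the three model class names listed above. CellGrid itself may validate a schema differently.

```python
import json

# Field names and model class names as listed in this README.
REQUIRED_KEYS = {"name", "parent", "model_class_name", "markers"}
SUPPORTED_MODEL_CLASSES = {"xgb", "random-forest", "linear-regression"}


def validate_schema(entries):
    """Check a parsed schema list; return (roots, children).

    Hypothetical helper for illustration only.
    `roots` lists the models whose parent is null; `children` maps each
    model name to the names of the models directly below it.
    """
    names = set()
    for entry in entries:
        missing = REQUIRED_KEYS - set(entry)
        if missing:
            raise ValueError(f"entry {entry.get('name')!r} lacks {sorted(missing)}")
        if entry["name"] in names:
            raise ValueError(f"duplicate model name {entry['name']!r}")
        names.add(entry["name"])
        if entry["model_class_name"] not in SUPPORTED_MODEL_CLASSES:
            raise ValueError(f"unsupported model class {entry['model_class_name']!r}")
        if not entry["markers"]:
            raise ValueError(f"model {entry['name']!r} has no markers")

    roots = []
    children = {entry["name"]: [] for entry in entries}
    for entry in entries:
        parent = entry["parent"]
        if parent is None:
            roots.append(entry["name"])
        elif parent not in names:
            raise ValueError(f"unknown parent {parent!r} for {entry['name']!r}")
        else:
            children[parent].append(entry["name"])
    if not roots:
        raise ValueError("schema has no root model (no entry with a null parent)")
    return roots, children


# The example schema above, condensed.
example = json.loads("""
[
    {"name": "all-events", "parent": null, "model_class_name": "random-forest",
     "markers": ["Ce140Di", "Ir191Di"]},
    {"name": "cells", "parent": "all-events", "model_class_name": "xgb",
     "markers": ["CD45", "HLA-ABC", "CD57", "CD19", "CD5"]},
    {"name": "CD4T", "parent": "cells", "model_class_name": "xgb",
     "markers": ["CD5", "CD4", "CD8a", "CD31", "CD25", "CD3e", "CD7"]}
]
""")
roots, children = validate_schema(example)
print(roots)     # ['all-events']
print(children)  # {'all-events': ['cells'], 'cells': ['CD4T'], 'CD4T': []}
```

Catching a misspelled parent name at this stage is cheaper than discovering it after a long training run.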

Train a GridClassifier.

from cellgrid.preprocessing import transform
from cellgrid.ensemble import GridSchema, GridClassifier

# Load the schema from the JSON file
schema = GridSchema.from_json(path_to_schema)
# Transform the data
x_train = transform(x_train)
# Train the classifier
clf = GridClassifier(schema)
clf.fit(x_train, y_train)

Score. Return the F1 score of every model.

x_test = transform(x_test)
clf.score(x_test, y_test)

Predict
```
x = transform(x)
y = clf.predict(x)
```

Save and load

from cellgrid.ensemble import save_model, load_model
 
save_model(clf, path)
clf = load_model(path)

API

GridClassifier

Constructor

GridClassifier(schema)

Arguments

schema: A GridSchema instance. See the schema definition under Get Started.

Methods

fit

fit(x_train, y_train)

Train the classifier.

Arguments

x_train: The single-cell dataset.
y_train: The hierarchical labels, with one column per level of the hierarchy. A row may stop early, leaving the deeper levels empty. For example:

layer1	layer2	layer3
cells	B	Naive B
cells	B	IgD+ Memory B
cells	CD4T	Central memory CD4T
non-cells
cells	CD4T
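
The table above is ragged: a row stops at the deepest level that is labelled. A minimal sketch of reading such tab-separated labels into fixed-width rows follows. The `parse_labels` helper and the use of `None` for missing levels are assumptions made for illustration; the container type that `fit` expects for `y_train` is not stated here.

```python
import csv
import io


def parse_labels(text):
    """Parse tab-separated hierarchical labels into (header, rows).

    Hypothetical helper for illustration only. Rows shorter than the
    header are padded with None so every row has one entry per level.
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    rows = []
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        if len(fields) > len(header):
            raise ValueError(f"row has more levels than the header: {fields}")
        padded = [field or None for field in fields]
        padded += [None] * (len(header) - len(padded))
        rows.append(tuple(padded))
    return header, rows


labels_text = (
    "layer1\tlayer2\tlayer3\n"
    "cells\tB\tNaive B\n"
    "cells\tB\tIgD+ Memory B\n"
    "cells\tCD4T\tCentral memory CD4T\n"
    "non-cells\n"
    "cells\tCD4T\n"
)
header, rows = parse_labels(labels_text)
print(rows[3])  # ('non-cells', None, None)
print(rows[4])  # ('cells', 'CD4T', None)
```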

predict

predict(x)

Predict the hierarchical labels for dataset x.
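
To illustrate what a hierarchical prediction looks like, the sketch below walks a toy hierarchy top-down for a single event. It is not CellGrid's implementation: `predict_hierarchy` is a hypothetical helper, the assumption that a label matching a model name hands the event on to that model is inferred from the example schema and label table, and the toy rules and thresholds are arbitrary and carry no biological meaning.

```python
def predict_hierarchy(event, models, root):
    """Collect one label per level by walking the hierarchy top-down.

    Hypothetical helper for illustration only. `models` maps a model name
    to a callable that returns a label for `event`. When the returned label
    is itself a model name, that model classifies the event next.
    """
    path = []
    visited = set()
    current = root
    while current in models:
        if current in visited:
            raise ValueError(f"cycle detected at model {current!r}")
        visited.add(current)
        label = models[current](event)
        path.append(label)
        current = label
    return path


# Toy stand-ins for trained models, named after the example schema.
toy_models = {
    "all-events": lambda e: "cells" if e["Ir191Di"] > 1.0 else "non-cells",
    "cells": lambda e: "B" if e["CD19"] > e["CD5"] else "CD4T",
    "CD4T": lambda e: "Central memory CD4T" if e["CD25"] < 0.5 else "other CD4T",
}

cell = {"Ir191Di": 2.0, "CD19": 0.1, "CD5": 1.5, "CD25": 0.2}
debris = {"Ir191Di": 0.0, "CD19": 0.0, "CD5": 0.0, "CD25": 0.0}
print(predict_hierarchy(cell, toy_models, "all-events"))
# ['cells', 'CD4T', 'Central memory CD4T']
print(predict_hierarchy(debris, toy_models, "all-events"))
# ['non-cells']
```

The second event stops after one level, which corresponds to a ragged label row such as the `non-cells` row in the `fit` example.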

score

score(x_test, y_test)

Return F1 scores of every model.
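
For reference, the F1 score is the harmonic mean of precision and recall. The sketch below computes it for a single class in plain Python. `f1_for_class` is a hypothetical helper, and how CellGrid averages F1 across the classes of each model is not stated here.

```python
def f1_for_class(y_true, y_pred, positive):
    """Return the F1 score of class `positive`, treating all others as negative.

    Hypothetical helper for illustration only.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


y_true = ["B", "B", "CD4T", "CD4T", "B"]
y_pred = ["B", "CD4T", "CD4T", "CD4T", "B"]
# Class "B": 2 true positives, 0 false positives, 1 false negative,
# so precision = 1, recall = 2/3, and F1 = 0.8.
f1_b = f1_for_class(y_true, y_pred, "B")
```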

GridSchema

Methods

from_json

from_json(filepath=None)

Load the schema from a JSON file. See the schema definition under Get Started.

License

MIT

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
