GitHub - bcebere/elastic-surv: Survival analysis for Big Data

elastic-surv

Survival analysis on Big Data

elastic-surv is a library for training risk estimation models on ElasticSearch backends. Potential use cases include user churn prediction and survival probability estimation.

🔑 Survival models include CoxPH, DeepHit, and LogisticHazard (via pycox).
🔥 ElasticSearch support using eland.
🌀 Automatic model selection using HyperBand.

Problem formulation

Risk estimation tasks require:

A set of covariates/features (X).
An outcome/event column (Y): 0 means right censoring, 1 means that the event occurred.
A time-to-event column (T): the duration until the event or the censoring occurred.

The output of a risk estimation task is a survival function: for N time horizons, it gives the probability of "survival" (the event not occurring) at each horizon.
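To make the (Y, T) encoding and the shape of the output concrete, here is a minimal sketch that estimates a survival curve at a few horizons from toy right-censored data. It uses the classical Kaplan-Meier estimator in plain Python, which is not one of the models elastic-surv trains and ignores the covariates X; the function name `kaplan_meier` and the toy values are hypothetical and chosen only for illustration.

```python
from collections import Counter


def kaplan_meier(durations, events, horizons):
    """Estimate S(t) = P(event has not occurred by t) at each horizon."""
    # Events per distinct time, and total exits (events + censorings) per time.
    deaths = Counter(t for t, e in zip(durations, events) if e == 1)
    exits = Counter(durations)

    at_risk = len(durations)
    survival = 1.0
    steps = []  # (time, survival estimate just after that time)
    for t in sorted(exits):
        if deaths[t]:
            survival *= 1.0 - deaths[t] / at_risk
        steps.append((t, survival))
        at_risk -= exits[t]  # everyone who exited at t leaves the risk set

    # The estimate is a step function: take the last step at or before h.
    result = []
    for h in horizons:
        value = 1.0
        for t, s in steps:
            if t > h:
                break
            value = s
        result.append(value)
    return result


# Toy data (illustrative only): T = duration, Y = 1 event / 0 right-censored.
T = [2, 3, 3, 5, 8, 8, 12]
Y = [1, 1, 0, 1, 0, 1, 0]
horizons = [1, 3, 6, 12]

curve = kaplan_meier(T, Y, horizons)
for h, s in zip(horizons, curve):
    print(f"P(survival at horizon {h}) = {s:.3f}")
```

The result is one probability per horizon, non-increasing over time. A model that uses covariates would produce one such curve per individual rather than a single curve for the whole population.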

Installation

For configuring the ELK stack, please follow the instructions here.

The library can be installed using

$ pip install .

Sample Usage

For each ElasticSearch data backend, we need to specify:

the es_index_pattern and the es_client for the ES connection.
which keys in the ES index hold the time-to-event and outcome data.
optionally, which features to include from the index.

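from elastic_surv.dataset import ESDataset
from elastic_surv.models import CoxPHModel

# Point the dataset at the ES index and name the survival columns.
dataset = ESDataset(
    es_index_pattern = 'churn-prediction',
    time_column = 'months_active',  # time-to-event (T)
    event_column = 'churned',       # outcome (Y): 1 = event, 0 = censored
    es_client = "localhost",
)

# Build the model from the dataset's features.
model = CoxPHModel(in_features = dataset.features())

model.train(dataset)
model.score(dataset)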

For this example, we use a local ES index, churn-prediction. It can be generated using the following snippet:

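from pysurvival.datasets import Dataset
import eland as ed

# Load the churn dataset from pysurvival as a pandas DataFrame.
raw_dataset = Dataset('churn').load()

# Upload it to the local 'churn-prediction' index, replacing any existing one.
ed.pandas_to_eland(
    raw_dataset,
    es_client='localhost',
    es_dest_index='churn-prediction',
    es_if_exists='replace',
    es_dropna=True,
    es_refresh=True,
)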

Tutorials

Tests

Install the testing dependencies using

$ pip install .[testing]

The tests can be executed using

$ pytest -vsx
