---
output:
  rmarkdown::github_document
bibliography: "inst/REFERENCES.bib"
---
<!-- README.md is generated from README.Rmd. Please edit that file -->

```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)
```

# `sherlock`

<!-- badges: start -->
[![R-CMD-check](https://github.com/Netflix/sherlock/workflows/R-CMD-check/badge.svg)](https://github.com/Netflix/sherlock/actions)
[![Coverage Status](https://img.shields.io/codecov/c/github/Netflix/sherlock/master.svg)](https://codecov.io/github/Netflix/sherlock?branch=master)
[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.5652010.svg)](https://doi.org/10.5281/zenodo.5652010)
<!-- badges: end -->

> Causal Machine Learning for Population Segment Discovery and Analysis

__Authors:__ [Nima Hejazi](https://nimahejazi.org) and [Wenjing
Zheng](https://www.linkedin.com/in/wenjing-zheng/)

---

## Causal Segmentation Analysis with `sherlock`

The `sherlock` R package implements an approach for population segmentation
analysis (or subgroup discovery) using recently developed techniques from causal
machine learning. Using data from randomized A/B experiments or observational
studies (quasi-experiments), `sherlock` takes as input a set of user-selected
candidate segment dimensions -- often, a subset of measured pre-treatment
covariates -- to discover particular segments of the study population based on
the estimated heterogeneity of their response to the treatment under
consideration. To quantify this treatment response heterogeneity, the
_conditional average treatment effect_ (CATE) is estimated using a
nonparametric, doubly robust framework [@vanderweele19; @vdL15; @Luedtke16a;
@Luedtke16b], incorporating state-of-the-art ensemble machine learning
[@vdl2007super; @coyle2021sl3] in the estimation procedure.
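
As a rough sketch of the estimand described above, in notation assumed here for
illustration (it is not taken from the package documentation): let $W$ denote
the pre-treatment covariates, $V \subseteq W$ the candidate segment dimensions,
$A \in \{0, 1\}$ the treatment, and $Y$ the outcome. Under the usual
identification assumptions (no unmeasured confounding and positivity), the CATE
and a standard doubly robust (AIPW-type) pseudo-outcome from this literature can
be written as:

$$
\begin{aligned}
\bar{Q}(a, w) &= \mathbb{E}[Y \mid A = a, W = w], \qquad
  g(a \mid w) = \mathbb{P}(A = a \mid W = w), \\
\text{CATE}(v) &= \mathbb{E}\left[\bar{Q}(1, W) - \bar{Q}(0, W) \mid V = v\right], \\
D(O) &= \frac{2A - 1}{g(A \mid W)} \left(Y - \bar{Q}(A, W)\right)
  + \bar{Q}(1, W) - \bar{Q}(0, W).
\end{aligned}
$$

Regressing $D(O)$ on $V$ recovers the CATE when either $\bar{Q}$ or $g$ is
estimated consistently, which is the sense in which such an estimator is doubly
robust. Whether `sherlock` uses exactly this form internally is not stated here;
the cited references give the precise estimators.
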

For background and details on using `sherlock`, see the [package
vignette](https://netflix.github.io/sherlock/articles/sherlock_quick_start_netflix.html)
and the [documentation site](https://netflix.github.io/sherlock/). An overview
of the statistical methodology is available in our [conference
manuscript](https://arxiv.org/abs/2111.01223) [@hejazi2021framework] from [CODE
@ MIT
2021](https://ide.mit.edu/events/2021-conference-on-digital-experimentation-mit-codemit/).

---

## Installation

Install the _most recent version_ from the `master` branch on GitHub via
[`remotes`](https://CRAN.R-project.org/package=remotes):

```{r gh-master-installation, eval = FALSE}
remotes::install_github("Netflix/sherlock")
```

<!--
Eventually, the package will make its way to [CRAN](https://CRAN.R-project.org).
At that point, a stable version may be installed via
```{r cran-stable-installation, eval = FALSE}
install.packages("sherlock")
```
-->

---

## Issues

If you encounter any bugs or have any specific feature requests, please [file an
issue](https://github.com/Netflix/sherlock/issues).

---

## Citation

After using the `sherlock` R package, please cite the following:

    @software{netflix2021sherlock,
      author = {Hejazi, Nima S and Zheng, Wenjing and {Netflix, Inc.}},
      title = {{sherlock}: Causal machine learning for segment discovery
               and analysis},
      year = {2021},
      note = {R package version 0.2.0},
      doi = {10.5281/zenodo.5652010},
      url = {https://github.com/Netflix/sherlock}
    }

    @article{hejazi2021framework,
      author = {Hejazi, Nima S and Zheng, Wenjing and Anand, Sathya},
      title = {A framework for causal segmentation analysis with machine
               learning in large-scale digital experiments},
      year = {2021},
      journal = {Conference on Digital Experimentation at {MIT}},
      volume = {(8\textsuperscript{th} annual)},
      publisher = {MIT Press},
      url = {https://arxiv.org/abs/2111.01223}
    }


---

## License

The contents of this repository are distributed under the Apache 2.0 license.
See file
[`LICENSE.md`](https://github.com/Netflix/sherlock/blob/master/LICENSE.md) for
details.

---

## References