README.Rmd

---
output:
  rmarkdown::github_document
bibliography: "inst/REFERENCES.bib"
---

<!-- README.md is generated from README.Rmd. Please edit that file -->

```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)
```

# `sherlock`

<!-- badges: start -->
[![R-CMD-check](https://github.com/Netflix/sherlock/workflows/R-CMD-check/badge.svg)](https://github.com/Netflix/sherlock/actions)
[![Coverage Status](https://img.shields.io/codecov/c/github/Netflix/sherlock/master.svg)](https://codecov.io/github/Netflix/sherlock?branch=master)
[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.5652010.svg)](https://doi.org/10.5281/zenodo.5652010)
<!-- badges: end -->

> Causal Machine Learning for Population Segment Discovery and Analysis

__Authors:__ [Nima Hejazi](https://nimahejazi.org) and [Wenjing
Zheng](https://www.linkedin.com/in/wenjing-zheng/)

---

## Causal Segmentation Analysis with `sherlock`

The `sherlock` R package implements an approach for population segmentation
analysis (or subgroup discovery) using recently developed techniques from causal
machine learning. Using data from randomized A/B experiments or observational
studies (quasi-experiments), `sherlock` takes as input a set of user-selected
candidate segment dimensions -- often, a subset of measured pre-treatment
covariates -- to discover particular segments of the study population based on
the estimated heterogeneity of their response to the treatment under
consideration. In order to quantify this treatment response heterogeneity, the
_conditional average treatment effect_ (CATE) is estimated using a
nonparametric, doubly robust framework [@vanderweele19; @vdL15; @Luedtke16a;
@Luedtke16b], incorporating state-of-the-art ensemble machine learning
[@vdl2007super; @coyle2021sl3] in the estimation procedure.

For background and details on using `sherlock`, see the [package
vignette](https://netflix.github.io/sherlock/articles/sherlock_quick_start_netflix.html)
and the [documentation site](https://netflix.github.io/sherlock/). An overview
of the statistical methodology is available in our [conference
manuscript](https://arxiv.org/abs/2111.01223) [@hejazi2021framework] from [CODE
@ MIT
2021](https://ide.mit.edu/events/2021-conference-on-digital-experimentation-mit-codemit/).

---

## Installation

Install the _most recent version_ from the `master` branch on GitHub via
[`remotes`](https://CRAN.R-project.org/package=remotes):

```{r gh-master-installation, eval = FALSE}
remotes::install_github("Netflix/sherlock")
```

<!--
Eventually, the package will make its way to [CRAN](https://CRAN.R-project.org).
At that point, a stable version may be installed via

```{r cran-stable-installation, eval = FALSE}
install.packages("sherlock")
```
-->

---

## Issues

If you encounter any bugs or have any specific feature requests, please [file an
issue](https://github.com/Netflix/sherlock/issues).

---

## Citation

After using the `sherlock` R package, please cite the following:

        @software{netflix2021sherlock,
          author={Hejazi, Nima S and Zheng, Wenjing and {Netflix, Inc.}},
          title = {{sherlock}: Causal machine learning for segment discovery
            and analysis},
          year  = {2021},
          note = {R package version 0.2.0},
          doi = {10.5281/zenodo.5652010},
          url = {https://github.com/Netflix/sherlock}
        }

        @article{hejazi2021framework,
          author = {Hejazi, Nima S and Zheng, Wenjing and Anand, Sathya},
          title = {A framework for causal segmentation analysis with machine
            learning in large-scale digital experiments},
          year = {2021},
          journal = {Conference on Digital Experimentation at {MIT}},
          volume = {(8\textsuperscript{th} annual)},
          publisher = {MIT Press},
          url = {https://arxiv.org/abs/2111.01223}
        }

---

## License

The contents of this repository are distributed under the Apache 2.0 license.
See file
[`LICENSE.md`](https://github.com/Netflix/sherlock/blob/master/LICENSE.md) for
details.

---

## References