---
output: github_document
editor_options:
  markdown:
    wrap: 72
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE, warning = FALSE, message = FALSE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# PhenotypeR <img src="man/figures/logo.png" align="right" height="180"/>
<!-- badges: start -->
[![CRAN
status](https://www.r-pkg.org/badges/version/PhenotypeR)](https://CRAN.R-project.org/package=PhenotypeR)
[![R-CMD-check](https://github.com/ohdsi/PhenotypeR/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/ohdsi/PhenotypeR/actions/workflows/R-CMD-check.yaml)
[![Lifecycle:Experimental](https://img.shields.io/badge/Lifecycle-Experimental-339999)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
<!-- badges: end -->
The PhenotypeR package helps us assess the research-readiness of a set
of cohorts we have defined. This assessment includes:

- ***Database diagnostics***, which help us to better understand the
  database in which the cohorts have been created. This includes
  information about the size of the data, the time period covered, and
  the number of people in the data as a whole. More granular
  information that may influence analytic decisions, such as the
  number of observation periods per person, is also described.\

- ***Codelist diagnostics***, which help to answer questions such as:
  which concepts from our codelist are used in the database? Which
  concepts led to individuals' entry into the cohort? Are there any
  concepts being used in the database that we didn't include in our
  codelist but perhaps should have?\

- ***Cohort diagnostics***, which help to answer questions such as: how
  many individuals did we include in our cohort, and how many were
  excluded because of our inclusion criteria? If we have multiple
  cohorts, is there overlap between them, and when do people enter one
  cohort relative to another? What is the incidence of cohort entry
  and what is the prevalence of the cohort in the database?\

- ***Matched diagnostics***, which compare our study cohorts to the
  overall population in the database. By matching people in the
  cohorts to people of a similar age and sex in the database, we can
  see how our cohorts differ from the general database population.\

- ***Population diagnostics***, which estimate the frequency of our
  study cohorts in the database in terms of their incidence rates and
  prevalence.

## Installation
You can install PhenotypeR from CRAN:
```{r, eval = FALSE}
install.packages("PhenotypeR")
```
Or you can install the development version from GitHub:
```{r, eval = FALSE}
# install.packages("remotes")
remotes::install_github("OHDSI/PhenotypeR")
```

## Example usage
To illustrate the functionality of PhenotypeR, let's create some cohorts
using the Eunomia Synpuf dataset. We'll first load the required packages
and create the cdm reference for the data.
```{r, message=FALSE, warning=FALSE}
library(dplyr)
library(CohortConstructor)
library(PhenotypeR)
```
```{r, message=FALSE, warning=FALSE}
# Connect to the database and create the cdm object
con <- DBI::dbConnect(duckdb::duckdb(),
                      CDMConnector::eunomiaDir("synpuf-1k", "5.3"))
cdm <- CDMConnector::cdmFromCon(con = con,
                                cdmName = "Eunomia Synpuf",
                                cdmSchema = "main",
                                writeSchema = "main",
                                achillesSchema = "main")
```
Note that we've included Achilles results in our cdm reference. Where
possible, we'll use these precomputed counts to speed up our analysis.
```{r, message=TRUE, warning=FALSE}
cdm
```
```{r, message=FALSE, warning=FALSE}
# Create the codelists (one named vector of concept IDs per cohort)
codes <- list("warfarin" = c(1310149, 40163554),
              "acetaminophen" = c(1125315, 1127078, 1127433, 40229134,
                                  40231925, 40162522, 19133768),
              "morphine" = c(1110410, 35605858, 40169988))

# Instantiate cohorts with CohortConstructor
cdm$my_cohort <- conceptCohort(cdm = cdm,
                               conceptSet = codes,
                               exit = "event_end_date",
                               overlap = "merge",
                               name = "my_cohort")
```
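Before running the diagnostics, it can be worth checking what was
instantiated. The sketch below assumes the omopgenerics package is
installed (it is not loaded explicitly in this README, so the calls are
namespaced) and uses its `settings()` and `cohortCount()` accessors on
the cohort table created above. Each codelist should appear as its own
cohort.

```{r, message=FALSE, warning=FALSE}
# One row per cohort: the cohort name comes from the codelist name
cohort_settings <- omopgenerics::settings(cdm$my_cohort)
cohort_settings

# Number of records and subjects in each cohort
cohort_counts <- omopgenerics::cohortCount(cdm$my_cohort)
cohort_counts
```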
We can easily run all the analyses explained above (**database
diagnostics**, **codelist diagnostics**, **cohort diagnostics**,
**matched diagnostics**, and **population diagnostics**) using
`phenotypeDiagnostics()`:
```{r, message = FALSE}
result <- phenotypeDiagnostics(cdm$my_cohort)
```
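The returned object can also be inspected or saved before it is passed
to the Shiny app. The following is a sketch rather than a prescribed
workflow: it assumes `result` is a summarised result as defined by the
omopgenerics package, and the file name and output folder are
illustrative choices, not package defaults.

```{r, eval = FALSE}
# Which analyses does the result contain? (one settings row per result set)
dplyr::count(omopgenerics::settings(result), result_type)

# Write the result to a csv file, suppressing counts below 10.
# The file name and location are illustrative choices.
omopgenerics::exportSummarisedResult(result,
                                     minCellCount = 10,
                                     fileName = "phenotyper_results.csv",
                                     path = tempdir())
```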
Once we have our results, we can quickly view them in an interactive
application. Here we'll apply a minimum cell count of 10 to our results
and save our Shiny app to a temporary directory, but you will likely
want to save the app to a local directory of your choice.
```{r, eval=FALSE}
shinyDiagnostics(result = result, minCellCount = 10, directory = tempdir())
```
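To make the effect of a minimum cell count concrete, here is a
conceptual sketch in base R using made-up counts. It only illustrates
the general idea of small-cell suppression, assuming that non-zero
counts below the threshold are the ones hidden. It is not the package's
implementation, and the object names are hypothetical.

```{r}
# Made-up counts for illustration only (not results from the example data)
counts <- c(warfarin = 120, acetaminophen = 7, morphine = 0)
min_cell_count <- 10

# Hide counts that are greater than zero but below the threshold
to_suppress <- counts > 0 & counts < min_cell_count
suppressed <- replace(counts, to_suppress, NA)
suppressed
```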
The Shiny app generated from the example cohort is available
[here](https://dpa-pde-oxford.shinyapps.io/Readme_PhenotypeR/).
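Once the diagnostics have been run, the database connection opened
earlier can be closed. A minimal sketch, assuming the `cdm` object
created above and CDMConnector's `cdmDisconnect()`:

```{r, eval = FALSE}
# Close the database connection behind the cdm reference
CDMConnector::cdmDisconnect(cdm)
```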
### More information
For more details on each of the analyses, please refer to the package
vignettes.