A collection of helper functions for working with UK Biobank data, focussing on the linked primary care EHR data.
Install from GitHub using:
devtools::install_github("philipdarke/ukbbhelpr", dependencies = TRUE)
The main functionality is summarised below. See the manual for more details.
Functions for use with data collected at UK Biobank assessment centre visits start with visit_. Functions for use with primary care EHR data start with ehr_.
Extracts observations/test results from linked EHR data (ehr_data) from records matching the provided read_codes. Data is extracted from the value1 field, except for data provider 2, where values are taken from value2 if value1 is empty. Units are taken from value3 for data provider 2 (units are unavailable for other data providers). NA, zero and duplicate values are dropped.
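The extraction rules above can be sketched in base R on a toy table. This is a minimal illustration, not the package's implementation: extract_values is a hypothetical name, the column names (eid, data_provider, event_dt, read_2, value1, value2, value3) are assumed to follow the UK Biobank primary care clinical table, only the read_2 column is matched, and "duplicate" is taken to mean a fully identical row.

```r
# Toy data in the shape assumed for the primary care clinical table.
ehr_data <- data.frame(
  eid = c(1, 1, 2, 2, 3, 3),
  data_provider = c(1, 1, 2, 2, 3, 3),
  event_dt = c("2010-01-05", "2010-01-05", "2011-03-02",
               "2011-04-09", "2012-06-30", "2012-06-30"),
  read_2 = c("22A..", "22A..", "22A..", "22A..", "22A..", "246.."),
  value1 = c("80.5", "80.5", "", "0", NA, "120"),
  value2 = c("", "", "72.1", "", "", ""),
  value3 = c("", "", "kg", "kg", "", ""),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of ukbbhelpr) applying the rules described above.
extract_values <- function(ehr_data, read_codes) {
  out <- ehr_data[ehr_data$read_2 %in% read_codes, ]
  is_p2 <- out$data_provider == 2
  empty_v1 <- is.na(out$value1) | out$value1 == ""
  # Use value1, falling back to value2 only for data provider 2
  raw <- ifelse(is_p2 & empty_v1, out$value2, out$value1)
  out$value <- suppressWarnings(as.numeric(raw))
  # Units are only available (in value3) for data provider 2
  out$unit <- ifelse(is_p2, out$value3, NA_character_)
  # Drop NA and zero values, then exact duplicate rows
  keep <- !is.na(out$value) & out$value != 0
  out <- out[keep, c("eid", "data_provider", "event_dt", "read_2", "value", "unit")]
  out <- unique(out)
  rownames(out) <- NULL
  out
}

weights <- extract_values(ehr_data, "22A..")
print(weights)
```

In this toy input, the repeated record for participant 1 collapses to one row, the provider 2 record with an empty value1 takes its value from value2, and the zero and NA values are dropped.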
Extracts all instances/arrays of one or more UK Biobank fields from visit_data. See https://biobank.ndph.ox.ac.uk/showcase/ to identify field codes.
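As a rough sketch of what selecting every instance/array of a field involves, the base R snippet below picks columns by name. It assumes columns are named field-instance.array (for example 21001-0.0), which depends on how the UK Biobank data was exported; extract_field is a hypothetical name and is not the package's function.

```r
# Toy visit data; the "field-instance.array" column naming is an assumption.
visit_data <- data.frame(
  eid = c(1, 2),
  `21001-0.0` = c(24.1, 30.2),
  `21001-1.0` = c(NA, 29.8),
  `31-0.0` = c(0, 1),
  check.names = FALSE
)

# Hypothetical helper: keep eid plus all instance/array columns of the field(s).
extract_field <- function(visit_data, field) {
  pattern <- paste0("^(", paste(field, collapse = "|"), ")-[0-9]+\\.[0-9]+$")
  cols <- grep(pattern, names(visit_data), value = TRUE)
  visit_data[, c("eid", cols), drop = FALSE]
}

bmi <- extract_field(visit_data, 21001)
print(names(bmi))
```

Anchoring the pattern at both ends keeps a request for field 21001 from matching other fields whose codes merely contain the same digits.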
Extracts self-reported non-cancer medical history from visit_data in a "long" format that is easier to work with than the "wide" format provided by UK Biobank.
Extracts self-reported cancer history from visit_data in a "long" format that is easier to work with than the "wide" format provided by UK Biobank.
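To illustrate the "wide" to "long" conversion, the sketch below stacks the instance/array columns of one field into one row per participant, instance and array. It is an assumption-laden example rather than the package's code: wide_to_long is a hypothetical name, field 20002 is used as the self-reported non-cancer illness field, the condition codes are illustrative, and the field-instance.array column naming is assumed.

```r
# Toy "wide" data: one column per instance/array of the field.
visit_data <- data.frame(
  eid = c(1, 2),
  `20002-0.0` = c(1065, 1111),
  `20002-0.1` = c(1220, NA),
  `20002-1.0` = c(1065, NA),
  check.names = FALSE
)

# Hypothetical helper: stack each column and record its instance and array.
wide_to_long <- function(visit_data, field) {
  pattern <- paste0("^", field, "-([0-9]+)\\.([0-9]+)$")
  cols <- grep(pattern, names(visit_data), value = TRUE)
  long <- do.call(rbind, lapply(cols, function(col) {
    data.frame(
      eid = visit_data$eid,
      instance = as.integer(sub(pattern, "\\1", col)),
      array = as.integer(sub(pattern, "\\2", col)),
      code = visit_data[[col]]
    )
  }))
  # Empty cells in the wide layout carry no information in the long layout
  long <- long[!is.na(long$code), ]
  long <- long[order(long$eid, long$instance, long$array), ]
  rownames(long) <- NULL
  long
}

conditions <- wide_to_long(visit_data, 20002)
print(conditions)
```

The long layout makes it straightforward to filter by code or instance without knowing in advance how many array columns a participant uses.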
Determines presence of a specified condition in the self-reported family history data (visit_data). If multiple history fields are provided (e.g. history of mother and father), presence of the condition in any field determines a positive family history.
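The "any field" rule can be sketched as follows. This is an illustration only: has_family_history is a hypothetical name, fields 20107 and 20110 are used as the father and mother illness fields, the condition code 9 is an illustrative value, and the field-instance.array column naming is assumed.

```r
# Toy family history data for two history fields.
visit_data <- data.frame(
  eid = c(1, 2, 3),
  `20107-0.0` = c(9, 1, NA),
  `20107-0.1` = c(NA, 8, NA),
  `20110-0.0` = c(1, 9, 2),
  check.names = FALSE
)

# Hypothetical helper: TRUE if the condition appears in any column of any field.
has_family_history <- function(visit_data, condition, fields) {
  pattern <- paste0("^(", paste(fields, collapse = "|"), ")-[0-9]+\\.[0-9]+$")
  cols <- grep(pattern, names(visit_data), value = TRUE)
  m <- as.matrix(visit_data[cols])
  # %in% treats NA as "no match", so missing answers count as negative
  hits <- matrix(m %in% condition, nrow = nrow(m))
  data.frame(eid = visit_data$eid, history = rowSums(hits) > 0)
}

both <- has_family_history(visit_data, condition = 9, fields = c(20107, 20110))
father_only <- has_family_history(visit_data, condition = 9, fields = 20107)
print(both)
```

Participant 2 is positive when both fields are searched but negative when only the first field is used, which shows how adding history fields widens the definition.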
Made available under the MIT Licence.