---
title: "Data Analysis Framework"
author: "Patrick Ward"
date: "12/14/2019"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
# Data Analysis Framework
> The objective is to provide an easy-to-follow framework for analysis that allows the analyst or PhD student to walk through their entire approach, step by step, with their colleagues or PhD supervisors.
> In each of the six steps, the researcher provides their code, the outputs of that code, and any comments or thoughts that may help the reader interpret the approach that was taken.
## Step 1: Research Question/Problem Statement
1) Research Question/Problem Statement #1
2) Research Question/Problem Statement #2
##### NOTES:
Provide any additional thoughts around the research question
- Hypothesis
- Potential issues or challenges
- Potential limitations
- Etc.
## Step 2: Data Collection/Measurement Strategy
1) What type of data is required?
- Data sources (database, websites, data collection, etc.)
- Structure of the data
- Data issues (missing data, messy data, etc.)
2) Collection/Measurement
- If data needs to be collected, what are the measurements being taken (clearly define the procedures and standardization)?
- Are the measurements valid and reliable (is there potentially a need to add a research step here and conduct your own validity/reliability study before proceeding)?
3) Data Cleaning
- What pre-processing steps were taken?
- Clearly walk through the data cleaning process.
- Is any data missing? If so, how much?
- Describe any imputation process for missing data.
- If any data was removed prior to analysis, explain why.
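
As an illustration of the data cleaning items above, the chunk below is a minimal sketch that uses the built-in `airquality` data set as a stand-in for your own data. It reports how much data is missing, then shows two possible ways of handling it: dropping incomplete rows or imputing the column median. Median imputation is only an assumption made for this example, not a recommendation; the appropriate method depends on why the data is missing.

```{r cleaning-example}
# Illustrative stand-in for your own data: airquality ships with base R
# and has missing values in the Ozone and Solar.R columns.
dat <- airquality

# How much data is missing? Count and percentage per column.
missing_n <- colSums(is.na(dat))
missing_pct <- round(100 * missing_n / nrow(dat), 1)
missing_summary <- data.frame(missing_n, missing_pct)
missing_summary

# Option A: remove incomplete rows and record how many were dropped,
# so the removal can be reported and justified.
dat_complete <- dat[complete.cases(dat), ]
rows_removed <- nrow(dat) - nrow(dat_complete)
rows_removed

# Option B (assumption for this sketch): simple median imputation
# applied to every numeric column.
dat_imputed <- dat
for (col in names(dat_imputed)) {
  if (is.numeric(dat_imputed[[col]])) {
    col_median <- median(dat_imputed[[col]], na.rm = TRUE)
    dat_imputed[[col]][is.na(dat_imputed[[col]])] <- col_median
  }
}

# Confirm that no missing values remain after imputation.
sum(is.na(dat_imputed))
```

Whichever option is chosen, the missing-data summary and the number of rows removed or imputed belong in the write-up.
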
## Step 3: Visualize & Summarize Data
- Once data has been collected and cleaned, provide an overview of the data using summary statistics and visuals.
- Offer interpretation of visuals that may help guide the model building process or generate discussion about any underlying trends in the data specific to the research question.
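
A minimal sketch of this step, using the built-in `mtcars` data set and base R graphics as placeholders for your own data and preferred plotting tools. The choice of variables (`mpg`, `wt`, `hp`, `cyl`) is purely illustrative.

```{r summarize-example}
# Illustrative stand-in for your cleaned data.
dat <- mtcars

# Summary statistics for a few variables of interest.
summary(dat[, c("mpg", "wt", "hp")])

# Group-level summary: mean mpg for each number of cylinders.
mpg_by_cyl <- aggregate(mpg ~ cyl, data = dat, FUN = mean)
mpg_by_cyl

# Distribution of the outcome variable.
hist(dat$mpg, main = "Distribution of mpg", xlab = "Miles per gallon")

# Outcome by group.
boxplot(mpg ~ cyl, data = dat,
        main = "mpg by number of cylinders",
        xlab = "Cylinders", ylab = "Miles per gallon")

# Relationship between the outcome and a candidate predictor.
plot(mpg ~ wt, data = dat,
     main = "mpg vs. weight",
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
```

Follow each output with a sentence or two of interpretation, for example whether a relationship looks linear or whether groups differ enough to matter for the model.
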
## Step 4: Model Development/Interpretation
- Iteratively build models (simple to complex).
- Interpret the results of each model to explain why a more complex model or different modelling strategy may be required.
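
One possible way to show the simple-to-complex progression is sketched below with linear models on the built-in `mtcars` data. The model family and the predictors are assumptions made for illustration; substitute whatever suits your research question.

```{r model-building-example}
# Model 1: a single predictor.
m1 <- lm(mpg ~ wt, data = mtcars)

# Model 2: add a second predictor.
m2 <- lm(mpg ~ wt + hp, data = mtcars)

# Model 3: allow the two predictors to interact.
m3 <- lm(mpg ~ wt * hp, data = mtcars)

# Interpret each model before moving to the next one.
summary(m1)
summary(m2)
summary(m3)

# Compare the nested models to judge whether the added complexity is justified.
anova(m1, m2, m3)
AIC(m1, m2, m3)
```

After each model, note what it explains, what it misses, and why that motivates the next, more complex model.
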
## Step 5: Model Evaluation
- Evaluate the final model(s), describing model errors, model accuracy, residuals, assumptions, etc.
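
The chunk below is a minimal sketch of such an evaluation for a linear model fitted to the built-in `mtcars` data. The model, the error metrics (RMSE and MAE), and the use of leave-one-out cross-validation are all assumptions chosen for illustration; other model types call for other diagnostics.

```{r evaluation-example}
# Illustrative final model.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# In-sample errors.
res <- residuals(fit)
rmse <- sqrt(mean(res^2))
mae <- mean(abs(res))
c(rmse = rmse, mae = mae)

# Standard diagnostic plots for checking linear model assumptions.
par(mfrow = c(2, 2))
plot(fit)
par(mfrow = c(1, 1))

# Formal check of residual normality.
shapiro.test(res)

# Leave-one-out cross-validation: refit without each row in turn
# and predict the held-out observation.
loo_err <- sapply(seq_len(nrow(mtcars)), function(i) {
  fit_i <- lm(mpg ~ wt + hp, data = mtcars[-i, ])
  mtcars$mpg[i] - predict(fit_i, newdata = mtcars[i, ])
})
loo_rmse <- sqrt(mean(loo_err^2))
loo_rmse
```

Comparing the in-sample RMSE with the cross-validated RMSE gives a rough sense of how well the model may generalize to new data.
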
## Step 6: Communication of Results
- Communicate the results of the final model(s) in a clear manner using visualizations and language that is understandable to the end user.
- Explain whether or not the research question has been answered.
- Clearly discuss any limitations of the analysis.
- Offer suggestions for future analysis or perhaps other data sets that may be incorporated to provide a more contextual answer to the research question.
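
As a sketch of how results might be presented to an end user, the chunk below produces a small coefficient table with confidence intervals and a plot of the fitted relationship with its uncertainty band. The model and the `mtcars` data are illustrative assumptions only.

```{r communication-example}
# Illustrative final model.
fit <- lm(mpg ~ wt, data = mtcars)

# Coefficient estimates with 95% confidence intervals.
coef_table <- cbind(estimate = coef(fit), confint(fit))
round(coef_table, 2)

# Predictions across the observed range of the predictor.
new_wt <- data.frame(wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 50))
pred <- predict(fit, newdata = new_wt, interval = "confidence")

# Observed data, fitted line, and 95% confidence band.
plot(mpg ~ wt, data = mtcars,
     main = "Fuel efficiency decreases with vehicle weight",
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
lines(new_wt$wt, pred[, "fit"], lwd = 2)
lines(new_wt$wt, pred[, "lwr"], lty = 2)
lines(new_wt$wt, pred[, "upr"], lty = 2)
```

A plot title that states the finding in plain language, as above, often communicates more to a non-technical reader than the coefficient table alone.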