-
Notifications
You must be signed in to change notification settings - Fork 0
/
Inferential Analysis of Tooth Growth.Rmd
161 lines (134 loc) · 7.13 KB
/
Inferential Analysis of Tooth Growth.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
---
title: "Inferential Analysis of Tooth Growth"
author: "Alex Fennell"
output: html_document
---
# Synopsis
The purpose of this analysis is to examine the Tooth Growth data set and perform
some basic inferential statistics on it. The first step of this analysis is an
exploratory data analysis. This will be followed by an inferential analysis to test
whether supplement delivery type and doseage levels influence odontoblast (tooth)
growth in guinea pigs. Code is contained in the appendix following the report.
```{r document options,echo=FALSE}
knitr::opts_chunk$set(echo=FALSE)
```
```{r libraries, message=FALSE,warning=FALSE}
library(dplyr)
library(ggplot2)
```
# Exploratory Data Analysis
## Reading in the data
The Tooth Growth data set is already available as it is part of the R datasets
package. Using the str function, the basic structure of the dataset can be seen.
```{r tooth data exploration}
TGdata<-ToothGrowth
str(TGdata)
```
### Summary Statistics
This shows the dataset with 3 variables, len-length of tooth growth,
supp-the supplement type, and dose-the numeric does in mg per day. The first thing
we can do is change dose to a factor variable and then look at various summary statistics
as a function of the two factor variables, dose and supplement type.
```{r initial data summary,message=FALSE,warning=FALSE}
# Need to change dose column into a factor rather than a numeric variable
TGdata$dose<-as.factor(TGdata$dose)
Summary_TGdata<-TGdata %>%
group_by(supp,dose)%>%
summarise(MEAN=mean(len),SD=sd(len),MAX=max(len),MIN=min(len),COUNT=n())
Summary_TGdata
```
This output shows that mean length of odontoblasts increases with dose across both
supplement types. The maximum and minimum values also increase as a funciton of
increasing dosage. The sample size is the same across all supplement type and
dose manipulations. It is difficult to tell if the mean odontoblast length differs
across supplement types, so in the next block of code, we will examine this.
```{r tooth data mean comparison}
suppcomp<-TGdata %>%
group_by(supp)%>%
summarise(MEAN=mean(len),SD=sd(len),MAX=max(len),MIN=min(len))
suppcomp
```
This shows that on average the mean odntoblast length for guinea pigs given
orange juice is longer than guinea pigs given ascorbic acid. In addition to this,
the maximum and minimum values for guinea pigs in the OJ group are not as extreme
as those in the VC group.
## Boxplot analysis
```{r boxplot analysis}
box1<-ggplot(TGdata,aes(x=dose,y=len,fill=supp))+
geom_boxplot(notch=FALSE)+
labs(title="Length of Odontoblasts as a Function of\n Supplement Type and Doseage",
x="Dose (mg/day)", y="Length",fill="Supplement \nType")+
theme(plot.title = element_text(hjust = 0.5))
box1
```
This boxplot better demonstrates the relationship between supplement type, dose,
and odontoblast length. In agreement with the previous outputs, it can be seen that
odontoblast length increases as the daily doseage increases across both supplement
types. The two supplement types do seem to differ in their efficacy at lower doseages.
The OJ supplement results in more odontoblast length than the VC Supplement at .5
and 1 mg/day doseges. At the 2mg/day dose, both groups have a similar median value,
but the OJ supplement has less spread than the VC supplement. Given these apparent
differences, I can now empirically test this using inferential methods
# Inferential Analysis
For the inferential analysis, I will be using indpendent samples t-tests. The
guinea pigs did not do every treatment condition, so a paired samples t-test would
not be appropriate. The mean difference in odontoblast length between supplement
type will be examined first. Then I will compare odontoblast length for doseage
.5mg vs. 1mg and 1mg vs. 2mg.
## Mean Difference by Supplement Type
Based on the EDA from earlier, it appears that OJ results in more tooth growth
than VC, thus I will conduct a one-sided t-test. This means that the null hypothesis
is that the mean difference in tooth growth between OJ and VC is 0. The alternative
hypothesis is that the mean difference in tooth growth between OJ and VC is greater
than 0. I am using greater than because I am hypothesizing that OJ results in
more tooth growth than VC
```{r mean difference supplement type}
t.test(len~supp,data=TGdata,alternative='greater')
```
### Supplement Type Results
The p-value from this one-sided t test is less than .05 which indicates that the
null hypothesis that the mean difference in odontoblast growth between
the OJ and VC group is equal to 0 is rejected. The output of the t test also provides the lower
interval of the confidence interval which does not contain 0 leading to the same
conclusion as above. Thus it seems that mean growth of odontoblasts is greater
when vitamin C is administered via orange juice rather than ascorbic acid.
## Mean Difference by Dose
Since there are 3 levels of administered dose, I will do two one-sided t tests
to determine if the length of odontoblasts increases with increasing doses. The null
hypotheses for these analyses are that the mean differences in odontoblast length
between .5 mg and 1 mg treatment groups (or 1 mg and 2 mg) is equal to zero. The
alternative hypotheses are that the average difference in odontoblast growth
between the 1 mg treatment and the .5 mg (or 2 mg and 1 mg) is greater than 0. To
test this the same t test function with be used but with the alternative argument
set to less.
```{r mean difference doseage levels .5-1}
#To test the .5 mg and 1 mg groups
with(TGdata,t.test(len[dose==.5],len[dose==1],alternative='less'))
```
### .5 mg vs 1 mg Results
The results from this analysis show that the null hypothesis is rejected as the
p-value is (much) less than .05. This is also reflected in the confidence interval
which does not contain zero. Thus guinea pigs that receive 1 mg of Vitamin C have
greater odontoblast growth than guinea pigs that receive .5 mg of Vitamin C.
```{r mean difference doseage levels 1-2}
#To test the 1 mg and 2 mg groups
with(TGdata,t.test(len[dose==1],len[dose==2],alternative='less'))
```
### 1 mg vs. 2 mg Results
Similar to the previous t test, the null hypothesis is rejected as the p-value is
(much) less than .05. Thus guinea pigs that receive 2 mg of Vitamin C have
greater odontoblast growth than guinea pigs that receive 1 mg of Vitamin C.
## Overall Conclusions
The results from all the t test analyses indicate that both the delivery method
and the doseage influence the length of odontoblasts in guinea pigs. Orange juice
results in more odontoblast growth than ascorbic acid, and larger doses of Vitamin
C result in more odontoblast growth. It should be noted that one-sided t tests
were used and unequal variance among the compared groups was assumed. There were
also three comparisons done with no multiple correction methods employed which will
result in an inflated type I error rate. Future investigations would benefit from
delving into the interaction between supplement type and doseage. Based on the
exploratory data analysis, it seems that the supplement type had an effect at lower
doses, but not at the highest dose.
### Appendix
```{r, ref.label=knitr::all_labels(),echo=TRUE,eval=FALSE}
```