\documentclass[12pt]{article}
\usepackage{fullpage,hyperref}\setlength{\parskip}{3mm}\setlength{\parindent}{0mm}
\begin{document}
\begin{center}\bfseries
Homework 4. Due by 11:59pm on Sunday 9/29.

Data and the reproducibility of research results
\end{center}
Read pages 8--11 of {\em On Being a Scientist} and \url{https://en.wikipedia.org/wiki/Data_sharing}. Write brief answers to the following questions by editing the tex file available at \url{https://github.com/ionides/810f24}, and submit the resulting pdf file via Canvas.
\begin{enumerate}
\item The National Institutes of Health (NIH) and the National Science Foundation (NSF) require data sharing for research that they fund (except where sharing would infringe on privacy or other individual rights). To what extent do you suspect their rules on data sharing are enforced?
YOUR ANSWER HERE
\item As an academic statistician, you may have the opportunity to work with private data that cannot be shared. What are the advantages and disadvantages of working with unshareable data?
YOUR ANSWER HERE
\item Advanced statistical methods often require sophisticated computational implementations. Should statistical researchers be expected to make their code publicly available (e.g., on GitHub) when they publish results generated using this code?
YOUR ANSWER HERE
\end{enumerate}
The remaining questions consider the following hypothetical case study:
Ben is a Statistics PhD student who has written computer code for a simulation study to test a new statistical theory and methodology that he is developing.
He plans to put the results in his thesis and to publish them in a journal paper.
The results of the simulations are usually consistent with his theoretical analysis.
However, sometimes the code crashes, particularly when investigating more extreme values of the parameter space.
Ben has checked and rechecked the code very carefully, and cannot find any error.
He concludes that there must be some unexplained numerical effect, perhaps related to occasional extremely large or small numbers.
Ben decides to report the results only in the region of the parameter space where the code never crashed.
\begin{enumerate}\setcounter{enumi}{3}
\item Is Ben's course of action a reasonable balance between his need to make progress on his thesis and his desire to report correct results?
YOUR ANSWER HERE
\item What are the `data' in this example? What is `reproducibility' in this context?
YOUR ANSWER HERE
\item Ben asks your opinion on how to proceed. What is your advice?
YOUR ANSWER HERE
\end{enumerate}
\end{document}