This repository was created to support the project:
It contains all the material relevant to the project:
- RepRes_analysis.Rmd
- The main script of the project, which contains all the code and
narrative text required to produce:
- RepRes_analysis.md
- The Markdown version of the report. Examining the analysis
from this file is not advised, because it cannot
present all the included code effectively (extra chunks
of HTML were used for various purposes).
In addition, the length of the report makes this file impractical to read.
- docs/Report.html
- The version of the report obtained by rendering RepRes_analysis.Rmd with the script render_____RepRes_analysis.R, which produces a bookdown version of the report powered by the rmdformats library. It is a visually appealing, fully functional and practical version of the report that includes all the code used for the analysis (folded by default).
- repdata_data_StormData.csv.bz2
- The compressed, unprocessed data on which this analysis is based. It contains data from the Storm Events Dataset, gathered and made publicly available by the U.S. National Oceanic and Atmospheric Administration (NOAA).
- supplemental_information
- Contains files with information on the Storm Events Dataset
that had a major impact on the analysis:
- supplemental_information/NATIONAL WEATHER SERVICE INSTRUCTION 10-1605 (AUGUST 17, 2007)
- General information and directions for the Storm Events Dataset.
- supplemental_information/NCDC Storm Events-FAQ Page.pdf
- Frequently asked questions about the Storm Events Dataset.
- supplemental_information/The-History-of-the-Storm-Events-Database.docx
- Detailed information on the history of the Storm Events Dataset.
- outputs
- Contains the outputs that were produced (as a side effect)
during the execution of the script:
- outputs/processed data
- Contains the table of processed data, obtained after
the unprocessed data was loaded into R and passed
through the data processing pipeline:
- outputs/processed data/table_with_the_processed_data.R
- outputs/harm_on_population_health
- Contains the outputs with results for the harm on
population health:
- outputs/harm_on_population_health/figures
- Contains Figure 1, an overview of the harm to
population health by weather event type:
- outputs/harm_on_population_health/figures/figure_1.png
- outputs/harm_on_population_health/results
- Contains the tables of results for the harm to
population health by weather event type, one
file for each of the perspectives examined in
the analysis:
- outputs/harm_on_population_health/results/summary_____harm_on_population_health______fatalities.R
- outputs/harm_on_population_health/results/summary_____harm_on_population_health______injuries.R
- outputs/harm_on_population_health/results/summary_____harm_on_population_health______casualties.R
- outputs/harm_on_economy
- Contains the outputs with results for the harm on economy:
- outputs/harm_on_economy/figures
- Contains Figure 2, an overview of the harm to the
economy by weather event type:
- outputs/harm_on_economy/figures/figure_2.png
- outputs/harm_on_economy/results
- Contains the tables of results for the harm to the
economy by weather event type, one file for each
of the perspectives examined in the analysis:
- outputs/harm_on_economy/results/summary_____harm_on_population_health______property_damage.R
- outputs/harm_on_economy/results/summary_____harm_on_population_health______crop_damage.R
- outputs/harm_on_economy/results/summary_____harm_on_population_health______economic_damage.R
- outputs/reproducibility_support
- Contains information that can assist any attempt to
reproduce the analysis.
- outputs/reproducibility_support/r_session
- Information on the R session and the options
that were active when the original analysis took
place:
- outputs/reproducibility_support/r_session/session_info.R
- outputs/reproducibility_support/r_session/r_options.R
- outputs/reproducibility_support/MD5_checksum
- Contains files with the MD5 checksums for the
unprocessed data, processed data, and the tables
with the results that were produced during the
original execution of the script:
- outputs/reproducibility_support/MD5_checksum/unprocessed_data_____MD5_checksum.txt
- outputs/reproducibility_support/MD5_checksum/processed_data_____MD5_checksum.txt
- outputs/reproducibility_support/MD5_checksum/results_____MD5_checksum.txt
- docs
- Contains the files used to create and populate the GitHub Page that showcases this project.
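
The MD5 checksums stored under `outputs/reproducibility_support/MD5_checksum` are meant to help confirm that a reproduction starts from the same data and arrives at the same tables. The following is only a minimal sketch of such a check, not code taken from the report: `verify_md5` is a hypothetical helper, and the demo file is made up so that the snippet runs on its own. It relies on `tools::md5sum()`, which ships with R.

```r
# Hypothetical helper (not part of the repository): compare the MD5
# checksum of a file with an expected checksum string.
verify_md5 <- function(path, expected) {
  actual <- unname(tools::md5sum(path))
  identical(tolower(actual), tolower(trimws(expected)))
}

# Self-contained demo on a temporary file with invented content.
demo_file <- file.path(tempdir(), "md5_demo.txt")
writeLines("storm data demo", demo_file)

# Stand-in for a checksum recorded during the original run.
reference <- unname(tools::md5sum(demo_file))

unchanged_ok <- verify_md5(demo_file, reference)  # file untouched
writeLines("modified content", demo_file)
changed_ok <- verify_md5(demo_file, reference)    # file altered

print(c(unchanged = unchanged_ok, changed = changed_ok))
```

With the real files, `expected` would be the value recorded in the corresponding checksum file. The layout of those text files is not described here, so reading them is left out of the sketch.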
This project was created for the 2nd peer-graded assignment of Course 5: Reproducible Research, from the Data Science Specialization by Johns Hopkins University, at Coursera.
The course is taught by:
- Jeff Leek, PhD
- Roger D. Peng, PhD
- Brian Caffo, PhD
As stated by the course instructors:
The basic goal of this assignment is to explore the NOAA Storm Database and answer some basic questions about severe weather events. You must use the database to answer the questions below and show the code for your entire analysis. Your analysis can consist of tables, figures, or other summaries. You may use any R package you want to support your analysis.
The assignment asks for two questions to be addressed:
Your data analysis must address the following questions:
Question 1: Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?
Question 2: Across the United States, which types of events have the greatest economic consequences?
based on the observations in the supplied dataset:
The data for this assignment come in the form of a comma-separated-value file compressed via the bzip2 algorithm to reduce its size.
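
In practice, a bzip2-compressed CSV does not need to be unpacked by hand: base R's `read.csv()` opens the path through a connection that detects the compression when reading. The snippet below is a minimal sketch of that behaviour, not the loading code used in the report. It builds a tiny stand-in file with invented rows so it can run without the real dataset, and the name `demo_storm_data.csv.bz2` is an assumption made for illustration.

```r
# Build a tiny bzip2-compressed CSV as a stand-in for the real data file.
# The rows below are invented for illustration only.
demo_path <- file.path(tempdir(), "demo_storm_data.csv.bz2")
con <- bzfile(demo_path, open = "w")
writeLines(c("EVTYPE,FATALITIES,INJURIES",
             "TORNADO,1,10",
             "FLOOD,0,2"), con)
close(con)

# read.csv() detects the bzip2 compression and decompresses on the fly.
storm_demo <- read.csv(demo_path, stringsAsFactors = FALSE)
print(storm_demo)

# For the real dataset, the call would presumably look like:
# storm_data <- read.csv("repdata_data_StormData.csv.bz2",
#                        stringsAsFactors = FALSE)
```

Reading the compressed file directly keeps the unprocessed data untouched on disk, which fits the reproducibility goal of the assignment.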
Some general guidelines and a tip were also provided:
Consider writing your report as if it were to be read by a government or municipal manager who might be responsible for preparing for severe weather events and will need to prioritize resources for different types of events. However, there is no need to make any specific recommendations in your report.
The events in the database start in the year 1950 and end in November 2011. In the earlier years of the database there are generally fewer events recorded, most likely due to a lack of good records. More recent years should be considered more complete.
A deliberately educational approach was adopted, aiming to produce a well-justified and self-explanatory product that can serve as a guide for beginners on how to build a basic pipeline that produces an analysis report from scratch.