This repository contains the files required for the course project of the Coursera Getting and Cleaning Data course.
The script presented in the file run_analysis.R does the following:
- Checks if the dataset exists in the current directory and downloads it otherwise
- Imports the `plyr` package
- Loads the necessary files to perform the analysis into R, using the function unz() to create the connection:
  - `activity_labels.txt`
  - `features.txt`
  - `X_test.txt`
  - `y_test.txt`
  - `subject_test.txt`
  - `X_train.txt`
  - `y_train.txt`
  - `subject_train.txt`
- Merges the data from the training and test files, using the function rbind()
- Creates a boolean vector marking the column names that correspond to the mean() and std() values, using the function grep to extract this information from the features vector
- Creates a dataframe corresponding to the subset of the specified columns
- Updates the corresponding names to the activities in the data
- Corrects the column names by making them lower case and adding the “subject” and “activity” columns
- Merges all data with the function cbind()
- Creates a new tidy dataset using the ddply() function and stores it in the variable `AveragesData`, which contains the average of each variable for each activity and each subject
- Writes the text file `tidy_dataset.txt`, which contains the resulting tidy dataset
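
The steps above can be illustrated with a minimal sketch on toy data. This is not the contents of run_analysis.R: the small in-memory data frames stand in for the files read through unz(), the feature names and values are made up for illustration, `grepl()` is used to obtain the logical vector directly, and the output is written to a temporary directory rather than the working directory. Apart from `AveragesData`, all variable names are assumptions.

```r
library(plyr)

# Toy stand-ins for features.txt and activity_labels.txt (illustrative values)
features <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-mad()-X")
activity_labels <- data.frame(id = 1:2, name = c("WALKING", "SITTING"))

# Toy stand-ins for the X_*, y_* and subject_* files
x_train <- data.frame(V1 = c(0.1, 0.3), V2 = c(1.0, 1.2), V3 = c(5, 6))
x_test  <- data.frame(V1 = c(0.5, 0.7), V2 = c(1.4, 1.6), V3 = c(7, 8))
y_train <- data.frame(V1 = c(1, 1))
y_test  <- data.frame(V1 = c(2, 2))
subject_train <- data.frame(V1 = c(1, 1))
subject_test  <- data.frame(V1 = c(2, 2))

# Merge the training and test rows
x_all       <- rbind(x_train, x_test)
y_all       <- rbind(y_train, y_test)
subject_all <- rbind(subject_train, subject_test)

# Logical vector marking the mean() and std() features
keep <- grepl("mean\\(\\)|std\\(\\)", features)

# Keep only those columns and give them lower-case feature names
x_sel <- x_all[, keep, drop = FALSE]
names(x_sel) <- tolower(features[keep])

# Replace the numeric activity ids with their descriptive names
activity <- activity_labels$name[match(y_all$V1, activity_labels$id)]

# Combine subject, activity and the selected measurements
# (subject is stored as a factor so that it is not averaged below)
all_data <- cbind(subject = factor(subject_all$V1), activity = activity, x_sel)

# Average every numeric variable for each subject and activity
AveragesData <- ddply(all_data, .(subject, activity), numcolwise(mean))

# Write the tidy dataset (to a temporary directory in this sketch)
out_file <- file.path(tempdir(), "tidy_dataset.txt")
write.table(AveragesData, out_file, row.names = FALSE)
```

The result has one row per subject and activity combination, with one column per selected measurement.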