This code book explains the variables and summaries calculated for the programming assignment, including any data transformations applied.
The site where the data was obtained:
The data for the project:
Except for two new variables added to the summarized data set (the tidy data file), all variable names and descriptions are the same as in the README.txt of the source data. The only exception is that, in the tidy data set, each variable is averaged for each Subject and Activity.
- Used the 'features' data to extract the column names required for the final data set, i.e. the mean and standard deviation (std) of the measurements.
- Cleaned up the feature names by removing the "()" to make the column names of the final data set more readable.
- After reading both the TEST and TRAIN versions of the data sets, removed the columns not required using the filtered 'features' list
- Added the Subjects and Activities as new columns to both the TEST and TRAIN data sets before merging them, as these were provided in separate files.
- Merged (basically appended) the TEST and TRAIN data frames by rows (instead of columns)
- Added a factor to replace the activity codes with labels denoting the activities in the merged data set.
- Grouped the merged data set by Subject and then by Activity, and finally summarized each group by taking the average (mean) of the measurements for that Subject and Activity.
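
The steps above can be sketched roughly as follows. This is not the script used for the assignment; it is a minimal Python illustration using only the standard library, with toy in-memory values standing in for the source files. The function names (`select_features`, `build_rows`, `label_activities`, `summarize`), the rule of matching the literal text `mean()` and `std()`, and all sample values are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def select_features(feature_names):
    """Keep mean/std features; return (column index, cleaned name) pairs."""
    selected = []
    for idx, name in enumerate(feature_names):
        # Assumption: a feature qualifies if its name contains "mean()" or "std()".
        if "mean()" in name or "std()" in name:
            selected.append((idx, name.replace("()", "")))
    return selected


def build_rows(measurements, subjects, activity_codes, selected):
    """Drop unneeded columns and attach Subject and Activity to each row."""
    rows = []
    for values, subject, code in zip(measurements, subjects, activity_codes):
        row = {"Subject": subject, "Activity": code}
        for idx, name in selected:
            row[name] = values[idx]
        rows.append(row)
    return rows


def label_activities(rows, activity_labels):
    """Replace numeric activity codes with descriptive labels."""
    return [{**row, "Activity": activity_labels[row["Activity"]]} for row in rows]


def summarize(rows):
    """Average every measurement for each (Subject, Activity) pair."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["Subject"], row["Activity"])].append(row)
    tidy = []
    for (subject, activity), members in sorted(groups.items()):
        summary = {"Subject": subject, "Activity": activity}
        for name in members[0]:
            if name not in ("Subject", "Activity"):
                summary[name] = mean(m[name] for m in members)
        tidy.append(summary)
    return tidy


# Toy inputs (hypothetical values, not taken from the source data).
features = ["tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-max()-X"]
labels = {1: "WALKING", 2: "WALKING_UPSTAIRS"}

selected = select_features(features)
train = build_rows([[1.0, 0.5, 9.0], [3.0, 1.5, 9.0]], [1, 1], [1, 1], selected)
test = build_rows([[2.0, 0.25, 9.0]], [2], [2], selected)

merged = label_activities(train + test, labels)  # append by rows
tidy = summarize(merged)
for row in tidy:
    print(row)
```

Each row of `tidy` holds one Subject and Activity combination, with every retained measurement replaced by its average over that group.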