Analysis on Crime Data set using PCA, Linear Regression and Cross Validation

Akhileshkumarkc/CrimeDataset

Analysis on Crime Dataset.

Dataset : http://archive.ics.uci.edu/ml/datasets/Communities+and+Crime

This project uses the R programming language to apply PCA and linear regression with 10-fold cross-validation (k = 10) to predict "Violent Crimes per Population".

The Crime dataset contains 127 attributes, with ViolentCrimesPerPop as the target attribute. The analysis is carried out in the following steps:

  • Step 1: Read the dataset.
  • Step 2: Clean the dataset.
  • Step 3: Perform PCA.
  • Step 4: Apply linear regression with cross-validation (k = 10) using glm.net.
  • Step 5: Calculate the MSE.
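
Steps 1 to 3 might look roughly like the following. This is a minimal sketch and not the repository's actual code. It assumes that the UCI file has no header row and marks missing values with "?", and that cleaning means dropping every column that contains a missing value. The six inline rows are made up so the block can run without a download, and only four illustrative columns are named.

```r
# Hypothetical miniature of the data file: no header row, "?" marks a missing value.
raw_text <- "0.19,0.33,?,0.20
0.00,0.16,0.12,0.67
0.00,0.42,?,0.43
0.04,0.77,0.21,0.12
0.01,0.55,0.02,0.03
0.02,0.28,0.33,0.14"

# Step 1: read the data, treating "?" as NA.
crime <- read.csv(text = raw_text, header = FALSE, na.strings = "?")
names(crime) <- c("population", "householdsize", "LemasSwornFT",
                  "ViolentCrimesPerPop")

# Step 2: clean the data. Assumed strategy: keep only columns without NAs.
complete_cols <- colSums(is.na(crime)) == 0
crime_clean <- crime[, complete_cols, drop = FALSE]

# Step 3: PCA on the predictors only (the target is left out).
predictors <- crime_clean[, names(crime_clean) != "ViolentCrimesPerPop",
                          drop = FALSE]
pca <- prcomp(predictors, center = TRUE, scale. = TRUE)

# Share of the total variance captured by each principal component.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained, 3))

# Component scores: these would be the inputs to the regression in step 4.
scores <- pca$x
```

On the real file, `read.csv()` would take the path to the downloaded data instead of `text = raw_text`. Imputing missing values rather than dropping columns is an equally plausible cleaning choice.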

Results

  • MSE on test data: 0.05366028
  • MSE on train data: 0.05791931
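
Train and test MSE figures like these could be produced by a 10-fold cross-validation loop such as the one below. This is a sketch under stated assumptions: it swaps in base R's `lm()` for the glm.net step named above, uses synthetic data in place of the PCA scores, and averages the MSE over folds, which may differ from how the reported numbers were obtained. If the glmnet package is what was used, its `cv.glmnet(x, y, nfolds = 10)` covers similar ground.

```r
set.seed(42)  # reproducible synthetic data and fold assignment

# Synthetic stand-in for the PCA scores and the target (not the real crime data).
n <- 200
scores <- data.frame(PC1 = rnorm(n), PC2 = rnorm(n), PC3 = rnorm(n))
scores$y <- 0.3 * scores$PC1 - 0.2 * scores$PC2 + rnorm(n, sd = 0.2)

# Mean squared error between observed and predicted values.
mse <- function(actual, predicted) mean((actual - predicted)^2)

# Randomly assign each row to one of k = 10 equally sized folds.
k <- 10
fold <- sample(rep(seq_len(k), length.out = n))

train_mse <- numeric(k)
test_mse <- numeric(k)
for (i in seq_len(k)) {
  train <- scores[fold != i, ]  # nine folds for fitting
  test <- scores[fold == i, ]   # one held-out fold for evaluation
  fit <- lm(y ~ ., data = train)
  train_mse[i] <- mse(train$y, predict(fit, newdata = train))
  test_mse[i] <- mse(test$y, predict(fit, newdata = test))
}

cat("Mean train MSE:", mean(train_mse), "\n")
cat("Mean test MSE: ", mean(test_mse), "\n")
```

Each row is held out exactly once, so the mean test MSE estimates the error on unseen data, while the mean train MSE shows how well the model fits the data it was trained on.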

The Weka tool was also used to compare different analysis methods.
