This project addresses the problem of early detection of Parkinson's disease using machine learning techniques. Tracking symptom progression often uses the Unified Parkinson's Disease Rating Scale (UPDRS), which requires the patient's presence in the clinic and time-consuming physical examinations by trained medical staff. This makes early detection of the disease difficult. Recent studies, however, have proposed remote replication of UPDRS as an alternative approach: a range of biomedical voice measurements is captured by a telemonitoring device installed in the patient's home, and these measurements are used to track symptom progression remotely.
The data set was created by Athanasios Tsanas and Max Little of the University of Oxford, in collaboration with 10 medical centers in the US and Intel Corporation, who developed the telemonitoring device to record the speech signals. It is available from the UCI repository at https://archive.ics.uci.edu/ml/datasets/Parkinsons+Telemonitoring . The data set is composed of a range of biomedical voice measurements from 42 people with early-stage Parkinson's disease recruited to a six-month trial of a telemonitoring device. It has 5875 cases represented in 20 features.
The features are described below:
- age - Subject age
- sex - Subject gender '0' - male, '1' - female
- test_time - Time since recruitment into the trial. The integer part is the number of days since recruitment
- motor_UPDRS - Clinician's motor UPDRS score, linearly interpolated
- total_UPDRS - Clinician's total UPDRS score, linearly interpolated
- Jitter(%), Jitter(Abs), Jitter:RAP, Jitter:PPQ5, Jitter:DDP - Several measures of variation in fundamental frequency
- Shimmer, Shimmer(dB), Shimmer:APQ3, Shimmer:APQ5, Shimmer:APQ11, Shimmer:DDA - Several measures of variation in amplitude
- NHR, HNR - Two measures of ratio of noise to tonal components in the voice
- RPDE - A nonlinear dynamical complexity measure
- DFA - Signal fractal scaling exponent
- PPE - A nonlinear measure of fundamental frequency variation
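To make the feature list concrete, here is a minimal sketch of reading the data into named columns and checking that the listed features are present. It assumes the downloaded file is comma-separated with a header row whose names match the list above. The function name `load_columns` and the file name in the usage comment are hypothetical, not part of the project.

```python
import csv
import io

# Column names taken from the feature list above. That the file header
# uses exactly these names is an assumption.
EXPECTED_COLUMNS = [
    "age", "sex", "test_time", "motor_UPDRS", "total_UPDRS",
    "Jitter(%)", "Jitter(Abs)", "Jitter:RAP", "Jitter:PPQ5", "Jitter:DDP",
    "Shimmer", "Shimmer(dB)", "Shimmer:APQ3", "Shimmer:APQ5",
    "Shimmer:APQ11", "Shimmer:DDA",
    "NHR", "HNR", "RPDE", "DFA", "PPE",
]


def load_columns(csv_text):
    """Parse CSV text into a dict mapping column name to a list of floats."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    columns = {name: [] for name in header}
    for row in reader:
        for name, value in row.items():
            columns[name].append(float(value))
    return columns


# Usage (the file name is an assumption about the local download):
# with open("parkinsons_updrs.data", newline="") as f:
#     columns = load_columns(f.read())
```

Extra columns in the file, such as a subject identifier, are kept as they are. Only the absence of a listed feature raises an error.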
Given this medical data set, the objectives of this project are to:
- Predict the total_UPDRS score given voice measurements as features
- Identify the important features that influence the total_UPDRS score
- Measure the strength of the relationship between the total_UPDRS score and the other features in the data set
- Discover whether the disease is age specific
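As a rough first pass at the relational-strength and age questions, one option is to rank every feature by the absolute value of its Pearson correlation with total_UPDRS. The sketch below assumes the data is already held as a dictionary mapping column name to a list of floats. That layout and the helper names `pearson` and `rank_by_correlation` are illustrative assumptions, not the project's actual method. Pearson correlation captures only linear association, so a low score does not rule out a nonlinear relationship.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / (spread_x * spread_y)


def rank_by_correlation(columns, target="total_UPDRS"):
    """Return (feature, r) pairs sorted by |r| against the target, strongest first."""
    y = columns[target]
    scores = {
        name: pearson(values, y)
        for name, values in columns.items()
        if name != target
    }
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Tiny made-up example, not real measurements:
toy = {
    "total_UPDRS": [1.0, 2.0, 3.0, 4.0],
    "age": [2.0, 4.0, 6.0, 8.0],
    "PPE": [1.0, 0.0, 1.0, 0.0],
}
ranking = rank_by_correlation(toy)
```

The position of `age` in the ranking gives an initial hint for the age question. When ranking voice features as predictors, motor_UPDRS should probably be dropped first, since it is a clinical score rather than a voice measurement.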