In this project, we analyzed infant mortality data from the Centers for Disease Control and Prevention (CDC) and designed a framework to:
- Evaluate the risk of infant death
- Provide references to similar past pregnancy cases to help doctors make informed decisions.
Data Link: CDC Data
Methods Used:
- Spark for Data Processing
- Dimensionality Reduction to reduce the number of features
- Similarity Search and Clustering to find similar pregnancy cases from the past
- Machine Learning to evaluate the risk of infant death for new cases
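
As a rough illustration of how the similarity-search and risk-evaluation steps fit together, here is a minimal sketch in plain Python. It is not the project's Spark pipeline: the feature names, the toy records, the outcome labels, and the helper functions `standardize` and `nearest_cases` are all hypothetical, and a simple nearest-neighbor vote stands in for whatever model the project actually trained.

```python
import math

def standardize(rows):
    """Scale each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column is constant, to avoid dividing by zero.
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    scaled = [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]
    return scaled, means, stds

def nearest_cases(query, cases, k):
    """Return the indices of the k cases closest to query (Euclidean distance)."""
    order = sorted(range(len(cases)), key=lambda i: math.dist(query, cases[i]))
    return order[:k]

# Hypothetical toy records: [mother_age, gestation_weeks, birth_weight_g]
past_cases = [
    [24, 39, 3300],
    [31, 40, 3500],
    [19, 30, 1400],
    [36, 29, 1200],
    [28, 38, 3100],
    [22, 31, 1500],
]
# Made-up labels: 1 = infant death recorded, 0 = infant survived
outcomes = [0, 0, 1, 1, 0, 0]

scaled_cases, means, stds = standardize(past_cases)

# Scale the new case with the statistics of the past cases.
new_case = [21, 30, 1350]
scaled_new = [(x - m) / s for x, m, s in zip(new_case, means, stds)]

# Similar past cases to show the doctor, plus a crude risk estimate:
# the share of those neighbors that ended in infant death.
neighbors = nearest_cases(scaled_new, scaled_cases, k=3)
risk = sum(outcomes[i] for i in neighbors) / len(neighbors)

print("similar past cases:", [past_cases[i] for i in neighbors])
print("estimated risk:", risk)
```

Features are standardized first so that birth weight, which has the largest numeric range, does not dominate the distance. On the real dataset, the dimensionality-reduction step would come before the neighbor search.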
For complete details, refer to the Report.