Examining 33 causes of death in 206 countries over the course of 29 years.

danicaboe/causes_of_death

Causes of Death

33 causes of death. 206 countries. 29 years.

Death is universal and draws immense fear, pain, curiosity and fascination. When I first came across this dataset, questions were easy to come by.

  • How did the number of deaths change over time? Did they decline, as expected with more advanced health care, or did they increase? Did any remain the same throughout the years? If so, was it the same everywhere or just certain countries/regions?
  • Are there countries that have higher death counts than the others for most of the causes, and by a wide margin?
  • Where did each cause of death originate? Did it travel? Did it remain in one place? Did any of them come close to disappearing in all countries?
  • What proportion of the population did each cause account for in each country? **I am merging a dataset with population data in order to calculate this.**
  • Which causes are contagious and which are not? Of those that are not contagious, do the numbers remain relatively stable across all countries? Are they more common in some and less common in others?
  • Which cause resulted in the most deaths across all countries and years? Which resulted in the fewest?

There are so many hypotheses that can be gleaned from this dataset for further exploration. I will be using pandas and Python to explore my data. I believe one of the best ways to understand this data is through an interactive map, so I will be creating one with Tableau to share when completed. As I continue to explore the data, I hope to find better questions and to seek out more data to layer on top of the baseline I have set, in order to add context and humanity to the numbers. There is only so much we can ask of this dataset, and many other variables would need to be examined to understand it further. Some of my questions can't be answered with this dataset alone because significantly more information is needed.
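As a rough illustration of how two of the questions above (which cause accounts for the most deaths, and how counts change over time) might be approached in pandas, here is a minimal sketch. The column names (`Country`, `Year`, and one column per cause) and all of the numbers are hypothetical placeholders, not values from the actual dataset.

```python
import pandas as pd

# Toy rows with invented numbers, only to show the shape of the analysis.
# The layout (one row per country and year, one column per cause) is an assumption.
deaths = pd.DataFrame(
    {
        "Country": ["A", "A", "B", "B"],
        "Year": [1990, 1991, 1990, 1991],
        "Malaria": [10, 8, 30, 25],
        "Drowning": [5, 5, 2, 3],
    }
)

# Hypothetical list of cause columns; the real file would have 33 of them.
cause_columns = ["Malaria", "Drowning"]

# Total deaths per cause over all countries and years, largest first.
totals_by_cause = deaths[cause_columns].sum().sort_values(ascending=False)

# Deaths per cause summed over all countries for each year,
# to see whether a cause rises or falls over time.
yearly_totals = deaths.groupby("Year")[cause_columns].sum()

print(totals_by_cause)
print(yearly_totals)
```

With the real data, the same two steps would apply once the cause columns are identified; only the column list and the loading step would change.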

My underlying questions are:
  • Why are some causes of death more common in specific countries? What other factors are at play? Poverty? Lack of clean water? Lack of health care and medication? Etc.
  • What role did the governments of these countries play in mitigating or exacerbating the cause of death? Or did they play no role?
  • Was the cause of death a result of a basic human right being unavailable?
  • What could change in each country to reduce the impact of its leading cause of death?

I originally found this dataset on Kaggle. The cleaned data in my ipynb file is downloaded from Kaggle. The uncleaned data originated from Our World in Data. I began with the cleaned data, intending to connect to an AWS database and use SQL queries and Python to analyze it. Instead of using AWS, I set up a local PostgreSQL database and copied the data from the uncleaned CSV file into a table in it. I will now go through the uncleaned data to explore and clean it using PostgreSQL. I found another dataset on Kaggle that provides the population per country from 1955 to 2020. I created a PostgreSQL table for it and will join the population data with the causes of death data to explore what proportion of each country's population was affected by each cause of death.
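The join described above is planned in PostgreSQL. As a hedged illustration of the same idea in pandas, the sketch below merges a deaths table with a population table on country and year and computes a rate per 100,000 people. The column names and all values are hypothetical placeholders, and this is not the project's actual query.

```python
import pandas as pd

# Invented toy values; column names are assumptions, not the real schema.
deaths = pd.DataFrame(
    {
        "Country": ["A", "A", "B"],
        "Year": [1990, 1991, 1990],
        "Malaria": [10, 8, 30],
    }
)
population = pd.DataFrame(
    {
        "Country": ["A", "A", "B"],
        "Year": [1990, 1991, 1990],
        "Population": [1000, 2000, 3000],
    }
)

# A left join keeps every deaths row; rows with no matching population
# entry would get NaN in "Population" instead of being dropped.
# validate raises an error if a (Country, Year) pair is duplicated on either side.
merged = deaths.merge(
    population,
    on=["Country", "Year"],
    how="left",
    validate="one_to_one",
)

# Deaths per 100,000 people, so countries of different sizes can be compared.
merged["Malaria_per_100k"] = merged["Malaria"] * 100_000 / merged["Population"]

print(merged)
```

A left join is used here because the population dataset may not contain every country and year present in the deaths data; checking for missing `Population` values after the merge would show where the two sources disagree on country names or coverage.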
