
Explore the world of healthcare data analysis with Python 🐍 and SQL 🗄️. This project delves into three key medical datasets 🏥, uncovering trends and insights that can revolutionize healthcare decision-making and resource optimization 📊.


zeyamosharraf/Medical_dataset_analysis


Medical Dataset Analysis: Python, SQL, and Insights

Project Overview

The "Medical Dataset Analysis: Python, SQL, and Insights" project is a comprehensive exploration of healthcare data analysis using Python, SQL, and data visualization techniques. The project focuses on three critical datasets: "hospitalization_details," "medical_examinations," and "names," which are interconnected and provide a holistic view of patient health profiles, hospitalization charges, and other relevant information.

Tools Used

  1. Python
  2. Pandas
  3. SQL
  4. Jupyter Notebook

Project Goals

The primary goals of this project include:

  1. Deriving insights from medical datasets
  2. Identifying trends in hospitalization charges and patient health profiles
  3. Analyzing the impact of variables such as BMI and smoking on healthcare costs
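The third goal can be illustrated with a small pandas sketch. It is not the project's notebook code: the data is made up, and the column names `bmi`, `smoker`, and `charges` are assumptions about how the merged dataset might be laid out. The BMI thresholds of 30 and 35 are borrowed from the cut-offs mentioned in the Results section.

```python
import pandas as pd

# Toy data for illustration only; column names are assumed, not taken from the project.
df = pd.DataFrame({
    "bmi": [22.0, 31.5, 36.2, 27.8, 40.1, 24.3],
    "smoker": ["no", "yes", "yes", "no", "no", "yes"],
    "charges": [4000.0, 21000.0, 39000.0, 6500.0, 12000.0, 18000.0],
})

# Bucket BMI into bands: (0, 30], (30, 35], (35, inf).
df["bmi_band"] = pd.cut(
    df["bmi"],
    bins=[0, 30, 35, float("inf")],
    labels=["<=30", "30-35", ">35"],
)

# Average charges for smokers vs. non-smokers.
avg_by_smoker = df.groupby("smoker")["charges"].mean()

# Average charges per BMI band.
avg_by_band = df.groupby("bmi_band", observed=True)["charges"].mean()

# Linear correlation between BMI and charges (a rough first look, not a causal claim).
bmi_charge_corr = df["bmi"].corr(df["charges"])

print(avg_by_smoker)
print(avg_by_band)
print(f"BMI/charges correlation: {bmi_charge_corr:.2f}")
```

Grouped means like these only describe the data; separating the effect of BMI from that of smoking would need a model that accounts for both variables at once.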

Key Findings

Through our analysis, we discovered several key insights, including:

  1. High charges associated with certain medical conditions and surgeries
  2. Variation in charges based on hospital tier and city tier
  3. The impact of BMI on healthcare costs
  4. Trends in hospitalization charges over the years

Challenges Faced

During the project, we encountered challenges such as:

  1. Handling null values and duplicates in the datasets
  2. Joining and merging multiple datasets for comprehensive analysis
  3. Ensuring data accuracy and consistency throughout the analysis

To overcome these challenges, we applied data cleaning and preprocessing techniques and used SQL for the more complex queries.
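The cleaning and merging steps described above might look roughly like the following pandas sketch. The three frames mirror the dataset names used in this project, but the rows are invented and the column names (`customer_id`, `year`, `charges`, `bmi`, `smoker`, `name`) as well as the `clean` helper are hypothetical.

```python
import pandas as pd

# Invented sample rows; the real files have more columns and different names.
hosp = pd.DataFrame({
    "customer_id": ["C1", "C2", "C2", "C3", None],
    "year": [2000, 2001, 2001, 1995, 1999],
    "charges": [1200.0, 15000.0, 15000.0, None, 800.0],
})
exams = pd.DataFrame({
    "customer_id": ["C1", "C2", "C3"],
    "bmi": [24.1, 36.5, 31.0],
    "smoker": ["no", "yes", "no"],
})
names = pd.DataFrame({
    "customer_id": ["C1", "C2", "C3"],
    "name": ["Ann", "Bob", "Cy"],
})


def clean(df, required):
    """Drop exact duplicate rows, then rows missing any required column (hypothetical helper)."""
    return df.drop_duplicates().dropna(subset=required).reset_index(drop=True)


hosp_clean = clean(hosp, ["customer_id", "charges"])

# Join on the shared key; validate= raises if the right side has duplicate keys,
# which guards against accidental row multiplication during the merge.
merged = (
    hosp_clean
    .merge(exams, on="customer_id", how="inner", validate="many_to_one")
    .merge(names, on="customer_id", how="inner", validate="many_to_one")
)

print(merged)
```

Dropping rows with missing values is only one option; depending on the column, imputing a value may be preferable, and the source does not say which approach the project took.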

Results

  1. Average Hospital Charges: The average hospital charge across all records is $13,564.60.
  2. High Charges Analysis: Identified customers with charges exceeding $700.
  3. High BMI Patients Analysis: Listed customers with BMI over 35 and their corresponding charges.
  4. Customers with Major Surgeries: Listed customers who have undergone major surgeries.
  5. Average Charges by Hospital Tier in 2000: Calculated the average charges per hospital tier for the year 2000.
  6. Smoking Patients with Transplants Analysis: Retrieved customers who are smokers and have undergone transplants.
  7. Patients with Major Surgeries or Cancer History: Identified customers with a history of major surgeries or cancer.
  8. Customer with Most Major Surgeries: Identified the customer with the highest number of major surgeries.
  9. Customers with Major Surgeries and City Tiers: Compiled a list of customers who have undergone major surgeries and their respective city tiers.
  10. Average BMI by City Tier in 1995: Calculated the average BMI for each city tier level in the year 1995.
  11. High BMI Customers with Health Issues: Extracted customers with health issues and a BMI greater than 30.
  12. Customers with Highest Charges and City Tier by Year: Identified the customer with the highest total charges for each year and displayed their corresponding city tier.
  13. Top 3 Customers with Highest Average Yearly Charges: Identified the top 3 customers with the highest average yearly charges.
  14. Ranking Customers by Total Charges: Ranked customers based on their total charges over the years in descending order.
  15. Identifying Peak Year for Hospitalizations: Identified the year with the highest number of hospitalizations.
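A few of the results above correspond to fairly standard SQL patterns. The sketch below runs them against an in-memory SQLite database from Python. It is an illustration under assumptions, not the project's actual queries: the table layout, column names, and rows are made up, and the ranking query uses a window function, which needs SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed schema and invented rows, for illustration only.
conn.executescript("""
CREATE TABLE hospitalization_details (
    customer_id   TEXT,
    year          INTEGER,
    charges       REAL,
    hospital_tier TEXT
);
INSERT INTO hospitalization_details VALUES
    ('C1', 2000, 1000.0, 'tier-1'),
    ('C2', 2000, 3000.0, 'tier-1'),
    ('C3', 2000,  500.0, 'tier-2'),
    ('C1', 2001, 2500.0, 'tier-2'),
    ('C2', 2001,  600.0, 'tier-1'),
    ('C3', 2001,  400.0, 'tier-2'),
    ('C1', 2001,  200.0, 'tier-1');
""")

# Result 1: average charge across all records.
avg_charges = conn.execute(
    "SELECT AVG(charges) FROM hospitalization_details"
).fetchone()[0]

# Result 5: average charge per hospital tier for a single year.
tier_avg = dict(conn.execute(
    """
    SELECT hospital_tier, AVG(charges)
    FROM hospitalization_details
    WHERE year = ?
    GROUP BY hospital_tier
    """,
    (2000,),
).fetchall())

# Result 14: rank customers by total charges, highest first.
ranking = conn.execute(
    """
    SELECT customer_id, total, RANK() OVER (ORDER BY total DESC) AS rnk
    FROM (
        SELECT customer_id, SUM(charges) AS total
        FROM hospitalization_details
        GROUP BY customer_id
    )
    ORDER BY rnk
    """
).fetchall()

# Result 15: year with the most hospitalization records.
peak_year = conn.execute(
    """
    SELECT year, COUNT(*) AS n
    FROM hospitalization_details
    GROUP BY year
    ORDER BY n DESC
    LIMIT 1
    """
).fetchone()

print(avg_charges, tier_avg, ranking, peak_year)
conn.close()
```

The year is passed as a bound parameter rather than formatted into the query string, so the same query can be reused for other years, such as the 1995 breakdown in result 10.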

Conclusion

The "Medical Dataset Analysis: Python, SQL, and Insights" project has been a journey of exploration and discovery into the world of healthcare data. Through meticulous data cleaning, powerful SQL queries, and insightful analysis, we've uncovered valuable trends and patterns in medical datasets that can revolutionize healthcare decision-making.

Our analysis has revealed insights into hospitalization charges, BMI distribution, smoking habits, and more, providing a deeper understanding of healthcare costs and patient profiles. These insights have the potential to drive data-powered improvements in healthcare delivery, resource allocation, and patient care strategies.

As we conclude this project, we're reminded of the transformative power of data analysis in healthcare. By harnessing the tools and techniques of Python, SQL, and data visualization, we've taken a step towards a future where data-driven insights lead to better healthcare outcomes for all.
