Our team set out to explore healthcare data. We extracted pharmaceutical spending data from around the world to determine how much each country spends on off-the-shelf pharmaceuticals. Data was extracted from CSV and JSON sources, transformed using Python, Pandas, and SQL, and then loaded into PostgreSQL. SQLAlchemy was used with Flask to serve the results in a pharmaceutical web application.
This web application can be used by pharmaceutical companies to promote products globally.
The Pharmaceutical and Population datasets were extracted from the following CSV and JSON sources, and exploratory data analysis (EDA) was performed on both.
- Pharmaceutical_Spending
  - Format: CSV
  - Size: 240 KB
- Population
  - Format: JSON
  - Size: 1 MB
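
As a rough illustration of the extraction and EDA step, the sketch below reads one CSV source and one JSON source into Pandas dataframes and prints a few basic summaries. The column names (`LOCATION`, `TIME`, `PC_HEALTHXP`, `TOTAL_SPEND`, `Country Code`, `Year`, `Value`) and the sample rows are made-up placeholders, not the actual contents of the project files; in practice the file paths would be passed to `pd.read_csv` and `pd.read_json` instead of in-memory strings.

```python
import io

import pandas as pd

# Placeholder CSV text standing in for the Pharmaceutical_Spending file.
# Column names and values are hypothetical.
csv_text = """LOCATION,TIME,PC_HEALTHXP,TOTAL_SPEND
AAA,2018,12.5,50.0
BBB,2018,4.0,80.0
"""

# Placeholder JSON text standing in for the Population file (hypothetical).
json_text = """[
  {"Country Code": "AAA", "Country Name": "Country A", "Year": 2018, "Value": 110},
  {"Country Code": "BBB", "Country Name": "Country B", "Year": 2018, "Value": 200}
]"""

# Extract: load each source into a dataframe.
spending_df = pd.read_csv(io.StringIO(csv_text))
population_df = pd.read_json(io.StringIO(json_text))

# Basic EDA: dimensions, column types, and missing values per column.
print(spending_df.shape)
print(population_df.dtypes)
print(spending_df.isna().sum())
```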
Data was transformed and cleaned using Python, Pandas and SQL. Transformations include:
- Filtering the population dataframe to the year 2018, the most recent year for which data was available.
- Filtering based on ‘% of Pharmaceutical spending’
- Changing column datatypes
- Renaming columns
- Retrieving required columns and dropping unwanted columns to load into PostgreSQL
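
The transformations listed above could look roughly like the following in Pandas. This is a minimal sketch, not the project's actual code: the dataframes are built inline with invented rows, the source and target column names are assumptions, and the `MIN_SHARE` threshold is a hypothetical stand-in for whatever rule was applied to '% of Pharmaceutical spending'.

```python
import pandas as pd

# Invented sample data; the real column names may differ.
population_df = pd.DataFrame({
    "Country Code": ["AAA", "AAA", "BBB"],
    "Country Name": ["Country A", "Country A", "Country B"],
    "Year": ["2017", "2018", "2018"],   # stored as text in this example
    "Value": [100.0, 110.0, 200.0],
})
spending_df = pd.DataFrame({
    "LOCATION": ["AAA", "BBB"],
    "TIME": [2018, 2018],
    "PC_HEALTHXP": [12.5, 4.0],         # assumed '% of pharmaceutical spending' column
    "TOTAL_SPEND": [50.0, 80.0],
})

# Change column datatypes: make the year numeric before filtering on it.
population_df["Year"] = population_df["Year"].astype(int)

# Filter to 2018, rename columns, keep only the columns needed for loading.
population_clean = (
    population_df[population_df["Year"] == 2018]
    .rename(columns={
        "Country Code": "country_code",
        "Country Name": "country_name",
        "Value": "population",
    })[["country_code", "country_name", "population"]]
    .astype({"population": "int64"})
    .reset_index(drop=True)
)

# Filter on the spending share. MIN_SHARE is a hypothetical threshold.
MIN_SHARE = 5.0
spending_clean = (
    spending_df[spending_df["PC_HEALTHXP"] >= MIN_SHARE]
    .rename(columns={
        "LOCATION": "country_code",
        "PC_HEALTHXP": "pct_health_spend",
        "TOTAL_SPEND": "total_spend",
    })[["country_code", "pct_health_spend", "total_spend"]]
    .reset_index(drop=True)
)

print(population_clean)
print(spending_clean)
```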
Due to the relational nature of the population and pharmaceutical spending data, we decided to use PostgreSQL. Pandas was used to load dataframes into the database. Tables were merged on ‘country code’ using SQL in PostgreSQL. SQLAlchemy was then used with Flask to deploy results to an HTML page.
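
The load-and-merge step can be sketched as follows. To keep the example self-contained, it uses an in-memory SQLite database from the standard library as a stand-in for PostgreSQL; with PostgreSQL, the connection would instead be a SQLAlchemy engine created from the project's own connection URL. The table names, column names, sample rows, and the choice of an inner join are all assumptions made for illustration.

```python
import sqlite3

import pandas as pd

# Invented cleaned dataframes; names and values are hypothetical.
population = pd.DataFrame({
    "country_code": ["AAA", "BBB"],
    "country_name": ["Country A", "Country B"],
    "population": [110, 200],
})
pharma_spending = pd.DataFrame({
    "country_code": ["AAA", "CCC"],
    "pct_health_spend": [12.5, 9.0],
    "total_spend": [50.0, 30.0],
})

# SQLite stands in for PostgreSQL so the sketch runs without a server.
conn = sqlite3.connect(":memory:")

# Load: write each dataframe to its own table.
population.to_sql("population", conn, index=False, if_exists="replace")
pharma_spending.to_sql("pharma_spending", conn, index=False, if_exists="replace")

# Merge: join the two tables on the country code using SQL.
query = """
SELECT p.country_code,
       p.country_name,
       p.population,
       s.pct_health_spend,
       s.total_spend
FROM population AS p
INNER JOIN pharma_spending AS s
        ON p.country_code = s.country_code
ORDER BY p.country_code
"""
merged = pd.read_sql_query(query, conn)
conn.close()

print(merged)
```

With an inner join, only countries present in both tables survive the merge; a left join would be the alternative if every population row should be kept.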
A Flask application was created to display data for the following routes:
- /
- /population
- /pharma_spending
- /population_pharma_spending
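
A minimal sketch of how these four routes might be wired up in Flask is shown below. It is not the project's `app.py`: the rows are hard-coded placeholders rather than SQLAlchemy query results, and the routes return JSON for brevity where the real application renders HTML. The final lines exercise the routes with Flask's built-in test client instead of starting a server.

```python
from flask import Flask, jsonify

app = Flask(__name__)

# Hypothetical rows standing in for results queried from PostgreSQL.
POPULATION = [
    {"country_code": "AAA", "country_name": "Country A", "population": 110},
    {"country_code": "BBB", "country_name": "Country B", "population": 200},
]
PHARMA_SPENDING = [
    {"country_code": "AAA", "pct_health_spend": 12.5, "total_spend": 50.0},
]


@app.route("/")
def index():
    # Landing route: list the available data routes.
    return jsonify({"routes": [
        "/population",
        "/pharma_spending",
        "/population_pharma_spending",
    ]})


@app.route("/population")
def population():
    return jsonify(POPULATION)


@app.route("/pharma_spending")
def pharma_spending():
    return jsonify(PHARMA_SPENDING)


@app.route("/population_pharma_spending")
def population_pharma_spending():
    # Combine the two datasets on country_code (assumed join key name).
    spending_by_code = {row["country_code"]: row for row in PHARMA_SPENDING}
    merged = [
        {**pop, **spending_by_code[pop["country_code"]]}
        for pop in POPULATION
        if pop["country_code"] in spending_by_code
    ]
    return jsonify(merged)


# Exercise the routes without starting a server.
client = app.test_client()
print(client.get("/").get_json())
print(client.get("/population_pharma_spending").get_json())
```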
- Ensure Flask is installed in the virtual environment: 'python -m pip install flask'.
- Clone this repository to run on your local machine.
- In the virtual environment, navigate to the 'Pharmaceutical App' folder.
- In the virtual environment, run the app with the command 'python app.py'.
- To open your default browser to the rendered page, Ctrl+click the http://127.0.0.1:5000/ URL in the terminal.
- On the webpage, click 'Routes' to view and explore the Population, Pharmaceutical Spending, and combined Population and Pharmaceutical Spending data.