This repo collects the projects I completed during my final year at UPenn, where we were expected to apply the data lifecycle and statistical concepts at each step of analysis, visualization, and Shiny app integration.
Objectives:
- Analyze and visualize health data related to risk factors using statistical methods.
- Investigate correlations between various health indicators and risk factors.
Learnings:
- Applied basic data analysis techniques to interpret health-related datasets.
- Utilized statistical methods to identify significant relationships and trends.
- Gained experience in data visualization and preliminary analysis.
Objectives:
- Perform data transformation and manipulation tasks on a dataset to prepare it for analysis.
- Visualize the transformed data to uncover patterns and insights.
Learnings:
- Mastered data wrangling techniques including reshaping and aggregating data.
- Used visualization tools to effectively communicate insights from transformed data.
- Enhanced skills in data preprocessing and exploratory data analysis.
Objectives:
- Conduct regression analysis to examine the relationships between dependent and independent variables.
- Interpret the results to understand how different variables influence the outcome.
Learnings:
- Applied regression techniques to analyze data and draw conclusions about variable relationships.
- Interpreted regression outputs to derive meaningful insights.
- Developed a deeper understanding of statistical modeling and its applications.
Objectives:
- Develop an interactive Shiny app to visualize and compare risk factors across different Philadelphia zip codes.
- Enable users to select zip codes and visualize comparative data on various risk indices.
Learnings:
- Mastered Shiny's reactive programming to filter and display data dynamically.
- Designed a user-friendly interface with dropdown menus and interactive plots.
- Gained experience in data visualization and interactive web application development using R.
Objectives:
- Clean and categorize Clery crime log data to identify and analyze patterns in theft, DUI, and drug offenses.
- Create dummy variables and extract relevant information for further analysis.
Learnings:
- Applied data cleaning techniques to preprocess and standardize crime data.
- Utilized regular expressions for data extraction and classification.
- Managed and transformed complex text data for insightful analysis.
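
As a rough illustration of the regex-based classification and dummy-variable step described above, here is a minimal sketch. The project itself was done in R; this Python version is only illustrative. The keyword patterns and the `classify` helper are hypothetical, since the actual rules applied to the Clery log are not shown in this README.

```python
import re

# Hypothetical keyword patterns; the real project may have used different rules.
PATTERNS = {
    "theft": re.compile(r"\b(theft|larceny|burglary|robbery)\b", re.IGNORECASE),
    "dui": re.compile(r"\b(dui|driving under the influence)\b", re.IGNORECASE),
    "drug": re.compile(r"\b(drug|narcotic|marijuana)s?\b", re.IGNORECASE),
}


def classify(description):
    """Return 0/1 dummy variables, one per offense category.

    A category is flagged 1 when its pattern matches anywhere in the
    free-text incident description, so one record can belong to several
    categories at once.
    """
    return {
        name: int(bool(pattern.search(description)))
        for name, pattern in PATTERNS.items()
    }


if __name__ == "__main__":
    # Made-up descriptions, not rows from the actual crime log.
    for text in ["Theft from building", "DUI arrest", "Vandalism"]:
        print(text, classify(text))
```

Keeping one compiled pattern per category makes it easy to add or adjust categories without touching the classification logic.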
Objectives:
- Generate word clouds from ANES 2016 survey responses to visualize what respondents liked and disliked about Donald Trump and Hillary Clinton.
- Compare and analyze sentiments expressed towards the political figures.
Learnings:
- Leveraged text mining and word cloud generation to visualize survey responses.
- Cleaned and preprocessed textual data to enhance clarity and relevance in visualizations.
- Interpreted visual data to derive insights into public opinions and sentiments.
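
A word cloud is driven by term frequencies, so the preprocessing described above can be sketched as a cleaning and counting step. The project used R; this Python version is only an illustration, and the stopword list, the tokenizing rule, and the `word_frequencies` helper are assumptions rather than the project's actual code.

```python
import re
from collections import Counter

# Small hypothetical stopword list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "is", "of", "to", "i", "he", "she", "his", "her"}


def word_frequencies(responses):
    """Count cleaned tokens across a list of free-text responses.

    Text is lowercased, split into runs of letters and apostrophes,
    and stopwords are dropped before counting.
    """
    counts = Counter()
    for text in responses:
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts


if __name__ == "__main__":
    # Made-up responses, not actual ANES data.
    sample = ["The economy and the jobs", "Jobs, jobs, JOBS!"]
    print(word_frequencies(sample).most_common(3))
```

The resulting counts are what a word cloud library would scale font sizes by.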
Objectives:
- Use SQL to explore and analyze a large dataset of occupational employment statistics.
- Execute queries to extract, filter, and visualize key data metrics.
Learnings:
- Gained proficiency in SQL and database management with SQLite.
- Conducted data aggregation and filtering to derive meaningful insights.
- Created visualizations to compare median annual salaries across various occupation titles.
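
The aggregation and filtering described above might look roughly like the following query against SQLite. This is a sketch under assumptions: the table name `oes`, the column names `occ_title`, `area`, and `a_median`, and the `avg_median_by_title` helper are hypothetical, and the rows are toy placeholders rather than real employment statistics. The project ran its queries from R; Python's built-in `sqlite3` module is used here only for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory SQLite database with toy placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE oes (occ_title TEXT, area TEXT, a_median REAL)")
    conn.executemany(
        "INSERT INTO oes VALUES (?, ?, ?)",
        [
            ("Title A", "Area 1", 100.0),
            ("Title A", "Area 2", 300.0),
            ("Title B", "Area 1", 50.0),
            ("Title C", "Area 1", None),  # missing salary, excluded below
        ],
    )
    return conn


def avg_median_by_title(conn, min_salary):
    """Average the median salary per occupation title, keeping only
    titles whose average is at least min_salary, highest first."""
    query = """
        SELECT occ_title, AVG(a_median) AS avg_median
        FROM oes
        WHERE a_median IS NOT NULL
        GROUP BY occ_title
        HAVING AVG(a_median) >= ?
        ORDER BY avg_median DESC
    """
    return conn.execute(query, (min_salary,)).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print(avg_median_by_title(conn, 60))
```

Passing the threshold as a `?` parameter rather than formatting it into the SQL string keeps the query safe to reuse with user-supplied values.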
Objectives:
- Develop a Shiny app to analyze and visualize the impact of various risk factors on safety in Philadelphia.
- Allow users to select risk factors and view the results of a regression analysis.
Learnings:
- Implemented a regression model within a Shiny app to assess risk factors dynamically.
- Designed a polished user interface with interactive elements and data visualizations.
- Utilized `stargazer` to present regression results effectively, enhancing user comprehension.