Exploring Socioeconomic Indicators, Schools, and Crime
View Notebook @ https://github.com/Davidsonity/Chicago-Data-Insights/blob/main/notebook.ipynb
This project involves analyzing three datasets related to Chicago: Socioeconomic Indicators in Chicago, Chicago Public Schools, and Chicago Crime Data. The project begins by establishing a connection to the database using the ibm_db_sa module. Once the connection is established, SQL queries are executed to solve a series of problems.
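As a rough sketch of the connection step, the snippet below builds a SQLAlchemy-style connection URL for the ibm_db_sa dialect. The helper name build_db2_url, the host, port, database name, and credentials are all placeholders chosen for illustration; they are not taken from the notebook.

```python
from urllib.parse import quote_plus


def build_db2_url(user, password, host, port, database):
    """Build a SQLAlchemy URL for the ibm_db_sa dialect (hypothetical helper)."""
    # quote_plus escapes characters such as '@' or '/' that would otherwise
    # be misread as URL delimiters.
    return (
        f"ibm_db_sa://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Placeholder values: substitute your own DB2 service credentials.
url = build_db2_url("my_user", "p@ss/word", "db2.example.com", 50000, "BLUDB")
print(url)

# With sqlalchemy and ibm_db_sa installed, the URL would then be used like this:
#   from sqlalchemy import create_engine
#   engine = create_engine(url)
```

Escaping the credentials keeps the URL valid when a password contains reserved characters. Some DB2 services also require extra options such as SSL settings, which are not covered here.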
The repository contains the following files:
- README.md: This file provides an overview of the project, instructions for setup, and details about the datasets and analysis.
- notebook.ipynb: This Jupyter Notebook file contains the code and analysis for the project. It includes SQL queries to load the datasets into an IBM DB2 database and perform various data analysis tasks.
The datasets used in this project are available on the city of Chicago's Data Portal. They can be downloaded as CSV files from the following links:
To replicate the analysis and run SQL queries on the datasets, follow these steps:
- Ensure you have access to an IBM DB2 database.
- Download the three datasets mentioned above.
- Import the datasets into the DB2 database as separate tables using the provided CSV files.
- Open the notebook.ipynb file in a Jupyter Notebook environment.
- Connect to the DB2 database using the appropriate credentials.
- Execute the SQL queries provided in the notebook to perform the desired analysis.
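The import step above can be sketched as follows. To keep the example runnable anywhere, it uses Python's built-in sqlite3 as a stand-in for DB2 and a two-row made-up CSV in place of a downloaded file. The table name CENSUS_DATA and the column names are assumptions, not confirmed by this README.

```python
import csv
import io
import sqlite3

# Made-up sample standing in for a downloaded CSV file (not real data).
sample_csv = io.StringIO(
    "COMMUNITY_AREA_NUMBER,COMMUNITY_AREA_NAME,PER_CAPITA_INCOME\n"
    "1,Area A,10500\n"
    "2,Area B,23000\n"
)

# In-memory SQLite is used here only as a stand-in for the DB2 database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CENSUS_DATA ("
    "COMMUNITY_AREA_NUMBER INTEGER, "
    "COMMUNITY_AREA_NAME TEXT, "
    "PER_CAPITA_INCOME INTEGER)"
)

# Convert each CSV record into a typed tuple matching the table columns.
rows = [
    (
        int(record["COMMUNITY_AREA_NUMBER"]),
        record["COMMUNITY_AREA_NAME"],
        int(record["PER_CAPITA_INCOME"]),
    )
    for record in csv.DictReader(sample_csv)
]
conn.executemany("INSERT INTO CENSUS_DATA VALUES (?, ?, ?)", rows)
conn.commit()

# Sanity check: the table should hold one row per CSV record.
row_count = conn.execute("SELECT COUNT(*) FROM CENSUS_DATA").fetchone()[0]
print(row_count)
```

Against a real DB2 instance, the same pattern would apply with the actual CSV files and a DB2 connection in place of the SQLite one.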
The notebook.ipynb file contains the code and SQL queries necessary to perform the analysis on the datasets. It provides a step-by-step guide for loading the datasets into the DB2 database and conducting various data analysis tasks. Here is a brief summary of the problems it addresses:
- Problem 1: Finding the total number of crimes recorded in the CRIME table.
- Problem 2: Listing community areas with per capita income less than 11000.
- Problem 3: Listing all case numbers for crimes involving minors.
- Problem 4: Listing all kidnapping crimes involving a child.
- Problem 5: Determining the kinds of crimes recorded at schools.
- Problem 6: Listing the average safety score for each type of school.
- Problem 7: Listing the 5 community areas with the highest percentage of households below the poverty line.
- Problem 8: Identifying the most crime-prone community area.
- Problem 9: Using a sub-query to find the name of the community area with the highest hardship index.
- Problem 10: Using a sub-query to determine the name of the community area with the highest number of crimes.
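To illustrate the kind of queries listed above, here is a minimal sketch covering Problems 1, 2, 9, and 10. It runs against an in-memory SQLite database with a few made-up rows rather than DB2 and the real datasets. The CENSUS_DATA table and every column name are assumptions, and the notebook's actual queries may differ.

```python
import sqlite3

# In-memory SQLite with toy rows; a stand-in for the DB2 tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CENSUS_DATA (
    COMMUNITY_AREA_NUMBER INTEGER,
    COMMUNITY_AREA_NAME TEXT,
    PER_CAPITA_INCOME INTEGER,
    HARDSHIP_INDEX INTEGER
);
CREATE TABLE CRIME (
    CASE_NUMBER TEXT,
    PRIMARY_TYPE TEXT,
    COMMUNITY_AREA_NUMBER INTEGER
);
INSERT INTO CENSUS_DATA VALUES (1, 'Area A', 10500, 90), (2, 'Area B', 23000, 40);
INSERT INTO CRIME VALUES ('X001', 'THEFT', 1), ('X002', 'KIDNAPPING', 2), ('X003', 'THEFT', 2);
""")

# Problem 1: total number of crimes recorded.
total_crimes = conn.execute("SELECT COUNT(*) FROM CRIME").fetchone()[0]

# Problem 2: community areas with per capita income below 11000.
low_income = [
    name
    for (name,) in conn.execute(
        "SELECT COMMUNITY_AREA_NAME FROM CENSUS_DATA WHERE PER_CAPITA_INCOME < 11000"
    )
]

# Problem 9: sub-query for the area with the highest hardship index.
highest_hardship = conn.execute(
    "SELECT COMMUNITY_AREA_NAME FROM CENSUS_DATA "
    "WHERE HARDSHIP_INDEX = (SELECT MAX(HARDSHIP_INDEX) FROM CENSUS_DATA)"
).fetchone()[0]

# Problem 10: sub-query for the area with the most crimes.
# LIMIT 1 is used for SQLite; on DB2 this is often written as
# FETCH FIRST 1 ROWS ONLY.
most_crimes = conn.execute(
    "SELECT COMMUNITY_AREA_NAME FROM CENSUS_DATA "
    "WHERE COMMUNITY_AREA_NUMBER = ("
    "  SELECT COMMUNITY_AREA_NUMBER FROM CRIME "
    "  GROUP BY COMMUNITY_AREA_NUMBER ORDER BY COUNT(*) DESC LIMIT 1)"
).fetchone()[0]

print(total_crimes, low_income, highest_hardship, most_crimes)
```

The sub-query form keeps the aggregate (maximum hardship index, or the area with the largest crime count) separate from the lookup of the community area name, which mirrors how Problems 9 and 10 are phrased.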
The following dependencies are required to run the project:
- Python 3
- Jupyter Notebook
- ibm_db_sa
- pandas
- sqlalchemy
Please refer to the requirements.txt file for the complete list of dependencies.
To get started with the project, follow these steps:
- Clone the repository
- Install the dependencies
- Set up the database connection details in the Jupyter Notebook.
- Open notebook.ipynb and run the cells to execute the SQL queries and explore the data.
By analyzing the Socioeconomic Indicators, Chicago Public Schools, and Chicago Crime Data, this project aims to provide insights into various aspects of Chicago's communities, education system, and crime rates. The use of SQL queries on an IBM DB2 database allows for efficient analysis and data-driven decision-making.
Please refer to the provided datasets, set up the DB2 database accordingly, and execute the SQL queries in the notebook.ipynb file to explore and analyze the data.