We are helping Maria, the chief data scientist for a city school district, analyze data on student funding and students' standardized test scores. We have access to every student's math and reading test scores, as well as various information about the schools they attend. The data comes in two CSV files: school_complete.csv and students_complete.csv.
This analysis will assist the School Board and Superintendent in making decisions regarding the school budgets and priorities.
- Our task is to aggregate the data and showcase trends and school performance.
- Additionally, the school board has notified Maria that students_complete.csv shows evidence of academic dishonesty: reading and math grades for Thomas High School ninth graders appear to have been altered. We therefore need to replace the math and reading scores of Thomas High School ninth graders with NaNs while keeping the rest of the data intact. Once that is done, we repeat the school district analysis and write a report describing how these changes affected the overall analysis.
- Average Math Score: down by 0.1
- % Passing Math: down by 0.2
- % Passing Reading: down by 0.3
- % Overall Passing: down by 0.1
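
The replacement step and a before/after comparison of summary metrics like those listed above can be sketched as follows. This is a minimal illustration, not the code used for the analysis: the column names (`school_name`, `grade`, `math_score`, `reading_score`), the grade label `"9th"`, the passing threshold of 70, the toy data, and the choice to compute percentages only over students with valid scores are all assumptions.

```python
import numpy as np
import pandas as pd

# Toy stand-in for students_complete.csv; column names and values are assumed.
students = pd.DataFrame({
    "school_name": ["Thomas High School"] * 3 + ["Example High School"] * 2,
    "grade": ["9th", "9th", "10th", "9th", "12th"],
    "math_score": [95, 60, 80, 72, 65],
    "reading_score": [99, 85, 75, 68, 90],
})

SCORE_COLS = ["math_score", "reading_score"]
PASSING_SCORE = 70  # assumed passing threshold


def summary_metrics(df):
    """Average math score and passing rates over students with valid scores."""
    scored = df.dropna(subset=SCORE_COLS)
    pass_math = scored["math_score"] >= PASSING_SCORE
    pass_reading = scored["reading_score"] >= PASSING_SCORE
    return pd.Series({
        "Average Math Score": scored["math_score"].mean(),
        "% Passing Math": pass_math.mean() * 100,
        "% Passing Reading": pass_reading.mean() * 100,
        "% Overall Passing": (pass_math & pass_reading).mean() * 100,
    })


before = summary_metrics(students)

# Work on a copy so the original data stays intact.
cleaned = students.copy()
# Integer columns cannot hold NaN, so cast the score columns to float first.
cleaned[SCORE_COLS] = cleaned[SCORE_COLS].astype(float)

# Blank out both scores for Thomas High School ninth graders only.
mask = (cleaned["school_name"] == "Thomas High School") & (cleaned["grade"] == "9th")
cleaned.loc[mask, SCORE_COLS] = np.nan

after = summary_metrics(cleaned)
print((after - before).round(1))  # change in each metric
```

Dropping the rows with missing scores before computing the percentages keeps the blanked-out students out of the denominator. Dividing by the full student count instead would understate the passing rates.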
After the modifications, the results for schools other than Thomas High School were not affected.
Replacing the ninth-grade scores affected the math and reading scores by grade, scores by school spending, scores by school size, and scores by school type only minimally.
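
One way to check which grouped results actually move is to compute the same aggregation before and after the replacement and list the cells that differ. The sketch below does this for average math score by school and grade. The data, the column names, and the helper `math_by_school_and_grade` are hypothetical, chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Toy data; column names and values are assumed for illustration.
students = pd.DataFrame({
    "school_name": ["Thomas High School", "Thomas High School", "Thomas High School",
                    "Example High School", "Example High School"],
    "grade": ["9th", "10th", "10th", "9th", "10th"],
    "math_score": [95.0, 80.0, 70.0, 72.0, 65.0],
})

# Replace the Thomas High School ninth-grade scores with NaN on a copy.
cleaned = students.copy()
mask = (cleaned["school_name"] == "Thomas High School") & (cleaned["grade"] == "9th")
cleaned.loc[mask, "math_score"] = np.nan


def math_by_school_and_grade(df):
    """Average math score per school (rows) and grade (columns); NaNs are skipped."""
    return df.groupby(["school_name", "grade"])["math_score"].mean().unstack()


before = math_by_school_and_grade(students)
after = math_by_school_and_grade(cleaned)

# A cell counts as changed if the values differ and are not both NaN.
differs = (before != after) & ~(before.isna() & after.isna())
flags = differs.stack()
changed_cells = flags[flags].index.tolist()
print(changed_cells)
```

Because `mean` skips NaN values, groups that contain none of the replaced rows keep exactly the same averages, and broader groups that mix those rows with many others shift only slightly.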
As shown above, the differences in results before and after modifying the Thomas High School test scores are insignificant. Therefore, either analysis can be used by the School Board to support its decisions regarding school budgets and priorities.