This project analyzes data from the National Park Service about biodiversity and endangered species in different parks.
The goals of this project are to analyze the conservation statuses of endangered species, investigate patterns in the types of species that become endangered, and visualize the data to identify trends and distributions of endangered species across different national parks.
The project poses the following questions:
- What is the distribution of conservation status for species?
- Are certain types of species more likely to be endangered?
- Are the differences in conservation status between types of species statistically significant?
- Which animal is most prevalent, and what is its distribution among parks?
The project follows these steps:
- analyze the data;
- clean up the datasets;
- visualize the data using graphs and charts;
- seek to answer the questions;
- draw conclusions based on the analysis.
There are two datasets:
- Species data - contains information about different species, including their conservation status.
- Observations data - contains records of species observed in various national parks, including the date and location of the observation.
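As a rough illustration of how the two datasets relate, the sketch below builds two tiny stand-in tables and joins them on a shared species key. The column names (`category`, `scientific_name`, `conservation_status`, `park_name`, `observations`) and all rows are assumptions made for this example, not the actual schema or contents of the project files.

```python
import pandas as pd

# Made-up rows for illustration; the column names are assumed, not the real schema.
species = pd.DataFrame({
    "category": ["Mammal", "Bird", "Fish"],
    "scientific_name": ["Canis lupus", "Haliaeetus leucocephalus", "Salmo trutta"],
    "conservation_status": ["Endangered", "In Recovery", None],
})

observations = pd.DataFrame({
    "scientific_name": [
        "Canis lupus", "Canis lupus", "Salmo trutta", "Haliaeetus leucocephalus",
    ],
    "park_name": [
        "Yellowstone National Park", "Yosemite National Park",
        "Yellowstone National Park", "Yosemite National Park",
    ],
    "observations": [60, 35, 120, 80],
})

# Attach species attributes to every observation record.
# validate="many_to_one" raises if a species appears twice in the species table.
merged = observations.merge(
    species, on="scientific_name", how="left", validate="many_to_one"
)

# Total sightings per park and species category.
per_park = merged.groupby(["park_name", "category"])["observations"].sum()
print(per_park)
```

A left join keeps every observation row even when a species has no match in the species table, which makes gaps between the two datasets visible as missing values instead of silently dropping records.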
In this section, we will employ descriptive statistics and data visualization methods to gain a deeper understanding of the data. Statistical tests will also be conducted to determine whether the observed differences between species are statistically significant. Some of the key metrics that will be calculated include:
- Frequency distributions
- Counts
- Relationships between species
- Conservation status of species
- Observations of species in parks
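The metrics above could be computed along the following lines. This is a minimal sketch on made-up rows: the column names, the `is_protected` flag (treating any recorded status as "protected"), and the choice of a chi-square test of independence are all assumptions for illustration, not the project's confirmed method.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Made-up rows for illustration; column names are assumed.
species = pd.DataFrame({
    "category": ["Mammal"] * 6 + ["Bird"] * 6,
    "conservation_status": [
        "Endangered", "Threatened", None, None, "Endangered", None,
        None, None, "Threatened", None, None, None,
    ],
})

# Frequency distribution of conservation status, including missing values.
status_counts = species["conservation_status"].value_counts(dropna=False)
print(status_counts)

# Assumed definition: a species with any recorded status counts as protected.
species["is_protected"] = species["conservation_status"].notna()

# Contingency table of species category against the protected flag.
table = pd.crosstab(species["category"], species["is_protected"])
print(table)

# Chi-square test of independence between category and protection.
chi2, p_value, dof, expected = chi2_contingency(table)
print(f"chi2={chi2:.3f}, p={p_value:.3f}, dof={dof}")
```

A small p-value would suggest that protection rates differ between categories. With counts as small as in this toy table the chi-square approximation is unreliable, so an exact test such as `scipy.stats.fisher_exact` may be a better fit for sparse tables.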