nngidi/Penguins-Exploratory-Data-Analysis-EDA-

Penguins Exploratory Data Analysis (EDA)

Overview

This project is an Exploratory Data Analysis (EDA) of the Palmer Penguins dataset, which records measurements for several penguin species: bill length, bill depth, flipper length, and body mass, along with the species of each penguin observed. The aim is to clean the data, compute basic summary statistics, and visualize the distributions of the key features and the correlations among them.

Dataset

This project uses the publicly available Palmer Penguins dataset, which can be downloaded from the Palmer Penguins GitHub repository. The dataset includes the following columns:

  • species: The species of the penguin (e.g., Adelie, Chinstrap, Gentoo).
  • island: The island where the penguin was observed.
  • bill_length_mm: The length of the penguin's bill in millimeters.
  • bill_depth_mm: The depth of the penguin's bill in millimeters.
  • flipper_length_mm: The length of the penguin's flipper in millimeters.
  • body_mass_g: The body mass of the penguin in grams.
  • sex: The sex of the penguin (male or female).
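The cleaning and summary steps described in the overview could look roughly like the sketch below. It is not taken from the project notebook: the rows in `SAMPLE_CSV` are made-up placeholder values in the column layout listed above, and `load_and_clean` is a hypothetical helper. It assumes that rows with missing measurements or missing sex are simply dropped, which may differ from what the notebook does.

```python
import io

import pandas as pd

# Hypothetical placeholder rows in the column layout listed above
# (illustrative values only, not real measurements from the dataset).
SAMPLE_CSV = """species,island,bill_length_mm,bill_depth_mm,flipper_length_mm,body_mass_g,sex
Adelie,Torgersen,39.0,18.5,181,3750,male
Adelie,Torgersen,,,,,
Adelie,Dream,37.5,17.9,186,3450,female
Chinstrap,Dream,46.5,17.9,192,3500,female
Gentoo,Biscoe,46.1,13.2,211,4500,female
Gentoo,Biscoe,50.0,16.3,230,5700,male
"""

REQUIRED_COLS = [
    "bill_length_mm",
    "bill_depth_mm",
    "flipper_length_mm",
    "body_mass_g",
    "sex",
]


def load_and_clean(source):
    """Read a penguins CSV and drop rows missing any measurement or sex."""
    df = pd.read_csv(source)
    return df.dropna(subset=REQUIRED_COLS).reset_index(drop=True)


penguins = load_and_clean(io.StringIO(SAMPLE_CSV))

# Basic summary statistics for the numeric columns.
summary = penguins.describe()
print(summary)

# Mean body mass per species.
mean_mass = penguins.groupby("species")["body_mass_g"].mean()
print(mean_mass)
```

To run this against the real data, pass the path of the downloaded CSV file to `load_and_clean` in place of the in-memory sample.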

Tools Used

This project utilizes the following tools and libraries:

  • Python: Programming language used for data analysis.
  • Pandas: Library for data manipulation and analysis.
  • Matplotlib: Library for creating visualizations.
  • Seaborn: Statistical data visualization library based on Matplotlib.
  • Jupyter Notebook: Interactive notebook environment for writing and executing code.
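As a rough illustration of how these libraries fit together, the sketch below computes a correlation matrix with Pandas and draws a scatter plot and a heatmap with Seaborn on Matplotlib axes. The data frame holds made-up placeholder values, and `correlation_matrix` and `plot_overview` are hypothetical helper names rather than functions from the project notebook.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical placeholder rows (illustrative values only).
penguins = pd.DataFrame(
    {
        "species": ["Adelie", "Adelie", "Chinstrap", "Gentoo", "Gentoo"],
        "bill_length_mm": [39.0, 37.5, 46.5, 46.1, 50.0],
        "bill_depth_mm": [18.5, 17.9, 17.9, 13.2, 16.3],
        "flipper_length_mm": [181.0, 186.0, 192.0, 211.0, 230.0],
        "body_mass_g": [3750.0, 3450.0, 3500.0, 4500.0, 5700.0],
    }
)

NUMERIC_COLS = [
    "bill_length_mm",
    "bill_depth_mm",
    "flipper_length_mm",
    "body_mass_g",
]


def correlation_matrix(df):
    """Pairwise Pearson correlations between the numeric measurements."""
    return df[NUMERIC_COLS].corr()


def plot_overview(df):
    """Draw a species-colored scatter plot next to a correlation heatmap."""
    fig, (ax_scatter, ax_heat) = plt.subplots(1, 2, figsize=(11, 4))
    sns.scatterplot(
        data=df,
        x="flipper_length_mm",
        y="body_mass_g",
        hue="species",
        ax=ax_scatter,
    )
    sns.heatmap(
        correlation_matrix(df),
        annot=True,
        cmap="coolwarm",
        vmin=-1,
        vmax=1,
        ax=ax_heat,
    )
    fig.tight_layout()
    return fig


corr = correlation_matrix(penguins)
fig = plot_overview(penguins)
```

In a Jupyter notebook the returned figure renders inline; in a plain script, call `plt.show()` or `fig.savefig(...)` afterwards.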

Project Structure

Getting Started

  1. Clone the Repository:
    git clone https://github.com/nngidi/penguins-eda.git
    cd penguins-eda

  2. Install the Dependencies:
    pip install pandas matplotlib seaborn

  3. Open the Notebook:
    jupyter notebook eda_penguins.ipynb