This project is one of the core assignments in my school units. It aims to predict the age of abalone (large, slow-growing marine snails) from physical measurements and weights, using a dataset of ~4,000 harvested specimens. The goal is to find a non-invasive method of estimating age, based on exploratory data analysis, without needing to dissect the animal.
Abalone are large, slow-growing marine snails. In many parts of the world, they support an economically significant fishery, both as a commercial operation and as a traditionally important food source for many cultures.
Abalone are harvested from the wild rather than farmed, so the sustainability of this slow-growing resource is an important issue. A key challenge is determining the age of a specimen. The rigorous approach is to harvest and dissect the specimen to count growth rings in the shell. A reliable non-fatal means of estimating specimen age is therefore highly desirable.
This dataset, provided as abalone_growth.csv, collects measurements on around 4000 harvested specimens, and can be used to identify reliable predictors of sample age.
This dataset contains measurements of abalone, a type of marine mollusk, and is commonly used for age prediction and regression analysis. The data was collected from the Kaggle repository and Macquarie School data bank.
The dataset includes physical characteristics (size and weight measurements) of abalone.
The target variable is Rings, which represents the number of growth rings.
The 9 columns (8 physical features plus the target) are described below:
| Feature Name | Description |
|---|---|
| Sex | Categorical variable: M (Male), F (Female), I (Infant) |
| Length | Longest shell measurement (in mm) |
| Diameter | Perpendicular to length (in mm) |
| Height | With meat in shell (in mm) |
| Whole weight | Weight of whole abalone (in grams) |
| Shucked weight | Weight of meat (i.e., edible portion) |
| Viscera weight | Gut weight after bleeding (non-edible organs) |
| Shell weight | After being dried (in grams) |
| Rings | Number of growth rings → Age = Rings + 1.5 (approximate in years) |
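
The ring-to-age rule in the last row can be written as a one-line helper. This is a minimal sketch: the function name `rings_to_age` and the sample ring counts are made up for illustration and are not taken from the notebook or the dataset.

```python
def rings_to_age(rings: int) -> float:
    """Approximate age in years from a ring count (Age = Rings + 1.5)."""
    if rings < 0:
        raise ValueError("ring count cannot be negative")
    return rings + 1.5


# Illustrative ring counts only (not rows from abalone_growth.csv).
sample_rings = [7, 9, 15]
ages = [rings_to_age(r) for r in sample_rings]
print(ages)  # [8.5, 10.5, 16.5]
```

Because age is just a constant offset from `Rings`, predicting one is equivalent to predicting the other.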
- Clean the raw dataset
- Explore correlations between physical features and abalone age
- Apply basic machine learning models to predict age
- Formulate the linear relationship between age and rings
- Linear regression
- Data cleaning
- Age prediction formula
- Numerical analysis
- Multi-variable plots
- `abalone_asm.ipynb`: Jupyter notebook with EDA and modeling
- `abalone_asm.py`: Python script version
- `abalone_growth.csv`: The entire dataset
- `abalone_asm.pdf` / `abalone_asm.html`: Exported reports