Task: Classify the person in the picture into one of 18 categories, which combine gender, age, and how well the mask is worn.
- Predict the correct class for the person in each image.
- Dataset structure: each folder contains 7 images of a single person under different conditions: 5 images with a mask, 1 image without a mask, and 1 image with a mask worn incorrectly.
- Total # of image samples: 2700 folders * 7 images = 18,900 images
- Label information: 18 classes
class | Mask | Gender | Age |
---|---|---|---|
0 | Wear | Male | < 30 |
1 | Wear | Male | >= 30 & < 60 |
2 | Wear | Male | >= 60 |
3 | Wear | Female | < 30 |
4 | Wear | Female | >= 30 & < 60 |
5 | Wear | Female | >= 60 |
6 | Incorrect | Male | < 30 |
7 | Incorrect | Male | >= 30 & < 60 |
8 | Incorrect | Male | >= 60 |
9 | Incorrect | Female | < 30 |
10 | Incorrect | Female | >= 30 & < 60 |
11 | Incorrect | Female | >= 60 |
12 | Not wear | Male | < 30 |
13 | Not wear | Male | >= 30 & < 60 |
14 | Not wear | Male | >= 60 |
15 | Not wear | Female | < 30 |
16 | Not wear | Female | >= 30 & < 60 |
17 | Not wear | Female | >= 60 |
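Because the table is the fixed product of mask status × gender × age group, the class id can also be computed directly rather than looked up. A minimal sketch, where the helper name and the integer encodings are our own assumptions:

```python
def encode_label(mask: int, gender: int, age: float) -> int:
    # Assumed encodings: mask 0 = wear, 1 = incorrect, 2 = not wear;
    # gender 0 = male, 1 = female.
    age_group = 0 if age < 30 else (1 if age < 60 else 2)
    return mask * 6 + gender * 3 + age_group  # class ids 0..17, matching the table
```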
- Class distribution: please refer to the EDA Jupyter notebook.
- Several noisy labels.
- Main challenges: a comparatively small dataset and imbalanced attribute distributions, especially over age, lead to overall imbalance across the final 18 classes.
- Metric: F1 score
- Handle class imbalance; in particular, the age boundary between [30, 60) and [60, ∞) is AMBIGUOUS.
- (Data sampling) Oversampling & undersampling using the open-source ImbalancedDatasetSampler and PyTorch's WeightedRandomSampler (a sketch follows this list).
- (Loss) Weighting by the classes' distribution: weighted cross-entropy, Focal Loss, LDAM Loss.
- (Loss) A loss that reflects the F1 metric: F1 Loss.
- (Augmentation) Choose empirically helpful transformations (e.g., ColorJitter, HorizontalFlip, CenterCrop) and avoid harmful ones (e.g., VerticalFlip).
- (Augmentation) Remove backgrounds and crop so that the model can focus more on the face.
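A minimal sketch of the sampling idea above, using PyTorch's WeightedRandomSampler with inverse class-frequency weights; the function name and the weighting scheme are our assumptions:

```python
import torch
from torch.utils.data import DataLoader, WeightedRandomSampler

def make_weighted_loader(dataset, labels, batch_size=64):
    # labels: one of the 18 class ids per sample, aligned with `dataset`
    labels = torch.as_tensor(labels)
    class_counts = torch.bincount(labels, minlength=18).clamp(min=1).float()
    sample_weights = 1.0 / class_counts[labels]  # rarer class -> larger weight
    sampler = WeightedRandomSampler(sample_weights,
                                    num_samples=len(labels),
                                    replacement=True)
    return DataLoader(dataset, batch_size=batch_size, sampler=sampler)
```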
- Small dataset? Easy to overfit.
- (Model) The EfficientNet family: consider the lighter variants.
- (Tuning policy) Full fine-tuning: empirically works better than freezing the feature-extractor layers.
- (Modified version) Stacking an additional FC layer: relying solely on the pretrained feature extractor and a single final classifier could not fit our downstream task, since all the pretrained models come from natural images (ImageNet).
F1 score: 0.7432, Accuracy: 79.1905
- Model: EfficientNet-b3 with an additional FC layer followed by dropout (0.7); a sketch of this setup follows the configuration below.
- Augmentation: torchvision transforms
- Train:
```python
train_transform = transforms.Compose([
    transforms.Resize(args.img_resize),
    transforms.CenterCrop(args.img_crop),
    transforms.ColorJitter(brightness=0.5, contrast=0.5, saturation=0.5, hue=0.5),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])
```
- Val:
```python
val_transform = transforms.Compose([
    transforms.Resize(args.img_resize),
    transforms.CenterCrop(args.img_crop),
    transforms.ToTensor(),
])
```
- Loss: cross-entropy
- Optimizer: Adam
- Learning rate: 1e-4
- Scheduler: multiply the learning rate by 0.995 every epoch
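A minimal sketch of this configuration. The `efficientnet_pytorch` package choice and the hidden width (512) are assumptions; the notes above only fix b3, the extra FC layer, dropout 0.7, Adam with lr 1e-4, and the 0.995-per-epoch decay:

```python
import torch
import torch.nn as nn
from efficientnet_pytorch import EfficientNet  # assumed package choice

model = EfficientNet.from_pretrained('efficientnet-b3')
in_features = model._fc.in_features
model._fc = nn.Sequential(        # replace the single classifier with a stacked head
    nn.Linear(in_features, 512),  # hidden width 512 is a hypothetical choice
    nn.ReLU(),
    nn.Dropout(p=0.7),            # dropout(0.7) as reported above
    nn.Linear(512, 18),           # 18 target classes
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
# "0.995 @ every epoch" read as exponential LR decay
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.995)
```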
- WeightedRandomSampler, ImbalancedDatasetSampler: no improvement. Sometimes similar to focal loss: more predictions on the minority classes, but the F1 score did not improve.
- Losses (Focal Loss, LDAM Loss, F1 Loss, weighted cross-entropy, label-smoothing loss): no improvement. (But with a smaller lr they might have worked; we couldn't try. A focal-loss sketch follows this list.)
- Multi-task training with a classification head estimating the class and a regression head estimating age: oops. Without a weight on the regression loss, the scores were terrible; multiplying it by 0.001 worked better, but the model missed the elderly classes even more.
- Various earlier models: ResNet18, ResNet50, and DenseNet performed worse, and they have more parameters than the (small-to-medium) EfficientNet.
- Multi dropout: no improvement.. T_T
- Out-of-fold & TTA (original + flip) & ensemble: about to get results! (Lowering the learning rate works better.)
- Different initialization (e.g., Xavier) at the final layer: ?!?!? Worse.. (need to check which initialization method PyTorch applies by default)
- Varying batch size, lr, and optimizer (e.g., AdamW, momentum SGD with Nesterov): only minor improvements, so we didn't do extensive hyperparameter sweeping.
- Training a GAN to make synthetic data!! Kind of.. ended up with ghost-like images. Keep going JS!
- Erasing backgrounds and sharpening edges: took too much time.
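A minimal sketch of the focal loss variant we tried; gamma and the optional class weights here are assumptions, not our exact settings:

```python
import torch
import torch.nn.functional as F

class FocalLoss(torch.nn.Module):
    """Cross-entropy scaled by (1 - p_t)^gamma to focus on hard examples."""
    def __init__(self, gamma=2.0, weight=None):
        super().__init__()
        self.gamma = gamma
        self.weight = weight  # optional per-class weights, e.g. inverse frequency

    def forward(self, logits, target):
        log_p = F.log_softmax(logits, dim=-1)
        ce = F.nll_loss(log_p, target, weight=self.weight, reduction='none')
        p_t = log_p.gather(1, target.unsqueeze(1)).squeeze(1).exp()  # true-class prob
        return ((1.0 - p_t) ** self.gamma * ce).mean()
```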
- Stratified 5-fold
- By modifying the baseline code, we split the folders into k folds ourselves and validated on them, instead of using the sklearn packages (a sketch follows).
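A minimal sketch of such a folder-level split; the names are assumptions. The point is to stratify by a per-folder key so that all 7 images of one person land in the same fold:

```python
import random
from collections import defaultdict

def stratified_folder_kfold(folder_labels, k=5, seed=42):
    # folder_labels: {folder_path: stratification key, e.g. (gender, age group)}
    buckets = defaultdict(list)
    for folder, key in folder_labels.items():
        buckets[key].append(folder)
    folds = [[] for _ in range(k)]
    rng = random.Random(seed)
    for key in sorted(buckets):          # deterministic order across runs
        folders = buckets[key]
        rng.shuffle(folders)
        for i, folder in enumerate(folders):
            folds[i % k].append(folder)  # round-robin keeps each key balanced
    return folds
```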
```bash
pip install -r requirements.txt
```
- For a single model:
```bash
sh ./KSY/scripts/run_exp.sh
```
- For k-fold with TTA and ensembling:
```bash
sh ./KSY/scripts/run_kfold.sh
```
- Please see the argparse descriptions in `train.py` and `k_fold.py`.
- Avoid hard-coding when sharing code, ESPECIALLY directories and paths.
- Should we have merged the baseline code earlier?
Roles (per member):
- data preprocessing, modeling (model, loss, ensemble), tuning
- data preprocessing, tuning
- data preprocessing, modeling, wandb managing
- modeling, tuning
- tuning
- ImbalancedDatasetSampler: https://github.com/ufoym/imbalanced-dataset-sampler