dfavenfre/Olivetti-Faces-PyTorch

Olivetti-Faces

This repository contains a custom Convolutional Neural Network (CNN) implemented in PyTorch for image classification on the Olivetti-Faces dataset. The dataset contains 400 grayscale images of 40 unique individuals (10 images per individual), which makes it well suited to small-scale face recognition and classification experiments.

Dataset: Olivetti-Faces

The Olivetti-Faces dataset includes:

  • Number of Classes: 40 individuals
  • Total Images: 400 (10 images per individual)
  • Image Size: 64x64 pixels (grayscale)
  • Challenge: The dataset contains variations in facial expressions, lighting conditions, and angles.
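
As an illustration of the shapes involved, here is a minimal sketch of preparing such data for a PyTorch CNN. The helper name `to_cnn_tensors` is hypothetical, and a random array stands in for the real images; this is not necessarily how the repository loads the data.

```python
import numpy as np
import torch


def to_cnn_tensors(images: np.ndarray, labels: np.ndarray):
    """Convert (N, 64, 64) grayscale images and (N,) labels to CNN-ready tensors.

    Hypothetical helper, not part of the repository.
    """
    # Add a channel axis: (N, 64, 64) -> (N, 1, 64, 64), as nn.Conv2d expects.
    x = torch.from_numpy(images.astype(np.float32)).unsqueeze(1)
    # Class indices must be int64 for nn.CrossEntropyLoss.
    y = torch.from_numpy(labels.astype(np.int64))
    return x, y


# Stand-in data with the dataset's shape: 40 individuals x 10 images each.
rng = np.random.default_rng(0)
images = rng.random((400, 64, 64))
labels = np.repeat(np.arange(40), 10)

x, y = to_cnn_tensors(images, labels)
print(x.shape, y.shape)  # torch.Size([400, 1, 64, 64]) torch.Size([400])
```

The single channel dimension corresponds to `input_dim=1` for grayscale input, and the 40 classes to `output_dim=40`.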

Model Architecture

The CNN model is built from scratch in PyTorch and follows a standard architecture: three convolutional layers with ReLU activations, a single max-pooling layer, and two fully connected layers. The model was trained on an NVIDIA A100 GPU (CUDA-enabled) in Google Colab, and training took approximately 3.5 hours.

Key model layers:

  • Three Convolutional Layers with increasing hidden dimensions.
  • Max Pooling Layer for spatial downsampling.
  • Fully Connected Layers to produce class predictions.
  • ReLU Activations to introduce non-linearity.
from typing import Tuple

import torch
import torch.nn as nn


class NNModel(nn.Module):

    def __init__(
        self,
        input_dim: int,
        hidden_dim: int,
        output_dim: int,
        kernel_size: int,
        stride: Tuple[int, int],
        pooling_size: int,
        device: torch.device
    ):
        super(NNModel, self).__init__()

        self.conv1 = nn.Conv2d(
            in_channels=input_dim,
            out_channels=hidden_dim,
            kernel_size=kernel_size,
            stride=stride,
            padding=1,
            device=device,
            dtype=torch.float32
        )

        self.conv2 = nn.Conv2d(
            in_channels=hidden_dim,
            out_channels=hidden_dim,
            kernel_size=kernel_size,
            stride=stride,
            padding=1,
            device=device,
            dtype=torch.float32
        )

        self.conv3 = nn.Conv2d(
            in_channels=hidden_dim,
            out_channels=hidden_dim,
            kernel_size=kernel_size,
            stride=stride,
            padding=1,
            device=device,
            dtype=torch.float32
        )

        self.maxpooling = nn.MaxPool2d(
            kernel_size=pooling_size,
            stride=pooling_size
        )

        self.flat = nn.Flatten()
        self.relu = nn.ReLU()

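        # in_features assumes 64x64 inputs whose size the three convolutions
        # preserve (e.g. kernel_size=3, stride=1, padding=1), halved once by
        # pooling_size=2: hidden_dim channels * 32 * 32.
        self.dense_layer = nn.Linear(
            in_features=hidden_dim * 32 * 32,
            out_features=hidden_dim,
            device=device,
            dtype=torch.float32
        )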

        self.output_layer = nn.Linear(
            in_features=hidden_dim,
            out_features=output_dim,
            device=device,
            dtype=torch.float32
        )


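    def initialize_weights(self):
        # Xavier-uniform weights and zero biases for conv and linear layers.
        # Not called in __init__; call it explicitly after construction.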
        for layer in self.children():
            if isinstance(layer, nn.Conv2d):
                nn.init.xavier_uniform_(layer.weight)
                if layer.bias is not None:
                    nn.init.zeros_(layer.bias)
            elif isinstance(layer, nn.Linear):
                nn.init.xavier_uniform_(layer.weight)
                nn.init.zeros_(layer.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.conv1(x)
        x = self.relu(x)
        x = self.conv2(x)
        x = self.relu(x)
        x = self.conv3(x)
        x = self.relu(x)
        x = self.maxpooling(x)
        x = self.flat(x)
        x = self.dense_layer(x)
        x = self.relu(x)
        x = self.output_layer(x)

        return x
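
The hard-coded `hidden_dim * 32 * 32` in `dense_layer` only matches certain hyperparameters. The sketch below works through the spatial-size arithmetic using the standard convolution and pooling output-size formulas. The values `kernel_size=3`, `stride=1` and `pooling_size=2` are assumptions, since the README does not state which values were used, and the helpers `conv_out` and `pool_out` are hypothetical names.

```python
def conv_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Output length along one spatial axis of a convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size: int, pool: int) -> int:
    """Output length of max pooling with kernel_size == stride == pool."""
    return (size - pool) // pool + 1


# Assumed hyperparameters; the README does not state the values used.
kernel_size, stride, pooling_size = 3, 1, 2

size = 64  # Olivetti images are 64x64
for _ in range(3):  # conv1, conv2, conv3, each with padding=1
    size = conv_out(size, kernel_size, stride, padding=1)
size = pool_out(size, pooling_size)

print(size)  # 32, matching in_features = hidden_dim * 32 * 32
```

With a different kernel size the flattened size changes. For example, `kernel_size=5` gives 62, 60 and 58 after the three convolutions and 29 after pooling, so `in_features` would have to become `hidden_dim * 29 * 29`.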

Preprocessing

The dataset was first split 90% / 10% into an inner and an outer set. Of the inner set, 75% was used for training and the remaining 25% for validation. The outer 10% was held out for a later evaluation, to check whether performance on unseen data matches the validation results.
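
A minimal sketch of such a nested split, using a plain random permutation of indices. The seed and the lack of stratification by individual are assumptions; the repository's actual split may differ.

```python
import numpy as np

n_images = 400
rng = np.random.default_rng(42)  # arbitrary seed, assumed for reproducibility
indices = rng.permutation(n_images)

# Outer split: hold out 10% for the final evaluation.
n_outer = round(0.10 * n_images)
outer_idx, inner_idx = indices[:n_outer], indices[n_outer:]

# Inner split: 75% training, 25% validation.
n_train = round(0.75 * len(inner_idx))
train_idx, val_idx = inner_idx[:n_train], inner_idx[n_train:]

print(len(train_idx), len(val_idx), len(outer_idx))  # 270 90 40
```

If the split is exact, this gives 270 training, 90 validation and 40 held-out images.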

Training Results

The model was trained for a total of about 3.5 hours. Training progress was monitored with Weights and Biases, which generated the graphs below. See the linked Weights and Biases report for the full training results.

  • Epochs: 256
  • Learning Rate: 1e-5
  • Validation Accuracy: 0.9333
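
For orientation, a minimal training-loop sketch using the learning rate above. The README only reports the epoch count and learning rate, so the Adam optimizer, the cross-entropy loss, the batch size and the stand-in data and model are all assumptions made so that the example runs on its own.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Stand-in data and model so the sketch runs on its own; in practice these
# would be the training split and the NNModel defined above.
x = torch.rand(80, 1, 64, 64)
y = torch.randint(0, 40, (80,))
loader = DataLoader(TensorDataset(x, y), batch_size=16, shuffle=True)
model = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 40))

criterion = nn.CrossEntropyLoss()  # assumed loss; expects raw logits
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)  # assumed optimizer

num_epochs = 2  # kept small here; the README reports 256
epoch_losses = []
for epoch in range(num_epochs):
    model.train()
    running = 0.0
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()
        # Weight each batch loss by its size to get a per-sample average.
        running += loss.item() * xb.size(0)
    epoch_losses.append(running / len(loader.dataset))
    print(f"epoch {epoch + 1}: loss {epoch_losses[-1]:.4f}")
```

Because `NNModel.forward` returns the raw output of `output_layer` with no softmax, a loss that takes logits, such as `nn.CrossEntropyLoss`, is a natural fit.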

Figures: Training Loss, Validation Loss, Validation Accuracy, Hyperparameter Importance

Download Training Results As PDF

Olivetti Faces Dataset CNN Model Training With PyTorch.pdf

Evaluation on Unseen Data

[Figure: evaluation on unseen data]
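
A minimal sketch of how accuracy on the held-out set could be computed. The `accuracy` helper is hypothetical, and the random data and untrained linear model are stand-ins, so the printed number carries no meaning here.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


@torch.no_grad()
def accuracy(model: nn.Module, loader: DataLoader) -> float:
    """Fraction of samples whose arg-max logit matches the label."""
    model.eval()
    correct, total = 0, 0
    for xb, yb in loader:
        preds = model(xb).argmax(dim=1)
        correct += (preds == yb).sum().item()
        total += yb.size(0)
    return correct / total


# Stand-ins for the held-out set (10% of 400 images would be 40) and for
# a trained model.
torch.manual_seed(0)
x_outer = torch.rand(40, 1, 64, 64)
y_outer = torch.randint(0, 40, (40,))
outer_loader = DataLoader(TensorDataset(x_outer, y_outer), batch_size=8)
model = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 40))

acc = accuracy(model, outer_loader)
print(f"held-out accuracy: {acc:.4f}")
```

Running the same function on the validation loader gives a figure directly comparable to the reported validation accuracy.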