This repository contains our work for an AI challenge with Headmind Partners, carried out as part of the third year at CentraleSupélec.

ErwanDavidCode/CNN_embedding_matching

ChallengeAI w/ HeadmindPartners

This repository contains our work for the ChallengeIA, part of our third year at CentraleSupélec in the AI track.

This challenge is carried out in partnership with Headmind Partners.

Guidelines

Data:

The Dior dataset consists of two folders and a CSV file:

data/
├── DAM/
│   ├── 01BB01A2102X4847.jpeg
│   ├── ...
├── test_image_headmind/
│   ├── image-20210928-102713-12d2869d.jpeg
│   ├── ...
└── product_list.csv

  • The "DAM" folder contains all the reference JPEGs for each item (2,766 items). The name of each JPEG corresponds to its MMC referenced in the CSV. Each image is 256x256 pixels.

  • The second folder, "test_image_headmind", contains the test images (80 test images). All items in these images are referenced in DAM and the CSV file. The size of these images varies. The images are not annotated. The file name follows the naming convention of the camera.

  • The "product_list" CSV file includes the unique MMC code for each item as well as the Product_BusinessUnitDesc specifying the class of the item (Bags, Shoes, etc).
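
To make the layout above concrete, here is a minimal sketch of how the reference images could be indexed by MMC code. It is not taken from the repository: the function name `load_catalog`, the column name `MMC`, and the comma delimiter are assumptions for illustration; only `Product_BusinessUnitDesc` and the folder names come from the description above.

```python
import csv
from pathlib import Path


def load_catalog(data_dir):
    """Map each MMC code to its reference image path and product class.

    Assumes a comma-separated product_list.csv with an "MMC" column
    (hypothetical name) and a "Product_BusinessUnitDesc" column.
    """
    data_dir = Path(data_dir)
    catalog = {}
    with open(data_dir / "product_list.csv", newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            mmc = row["MMC"]  # assumed column name
            image_path = data_dir / "DAM" / f"{mmc}.jpeg"
            if image_path.exists():  # skip rows that have no reference image
                catalog[mmc] = {
                    "image": image_path,
                    "category": row["Product_BusinessUnitDesc"],
                }
    return catalog
```

Calling `load_catalog("data")` would then return one entry per item that has both a CSV row and a JPEG in `DAM/`, which is a convenient starting point for computing reference embeddings.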

Objective:

The goal of the project is to retrieve the reference of an item from a photo of it. Therefore, the visual characteristics of the objects must be used to identify the item.

Example: given the image ./test_image_headmind/IMG_6880.jpg, the model should return the image ./DAM/BOBYR1UXR42FR.jpeg.

Method

For this challenge, we implemented a pipeline that retrieves the reference of a photographed luxury item, using only the image as input.

The methods used in this project are:

  • Image processing and data formatting (background removal, cropping, resizing)
  • Data augmentation (flip, rotation, color)
  • Transfer learning, based on ResNet-50, DINOv2, CLIP, ...
  • Fine-tuning
  • Pipeline benchmarking
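
One way to read the steps above is as nearest-neighbour retrieval over image embeddings: every reference image is embedded once with a pretrained backbone, and a test photo is matched to the references whose embeddings are closest. The sketch below shows only the matching step, under the assumption that embeddings are already available as plain lists of floats. The names `cosine_similarity` and `rank_references` are hypothetical and not taken from the repository.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def rank_references(query_embedding, reference_embeddings, k=5):
    """Return the k reference codes most similar to the query, best first.

    reference_embeddings maps a product code (e.g. an MMC) to its embedding.
    """
    scored = sorted(
        reference_embeddings.items(),
        key=lambda item: cosine_similarity(query_embedding, item[1]),
        reverse=True,
    )
    return [code for code, _ in scored[:k]]


# Toy two-dimensional embeddings, for illustration only.
references = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]}
print(rank_references([0.9, 0.1], references, k=2))
```

In a real run the embeddings would come from the backbone (for example the pooled features of a ResNet-50), and a vectorized library would replace the pure-Python loops; the ranking logic stays the same.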

Pipeline

Results

The best-performing configuration uses ResNet-50 with horizontal-flip data augmentation and the cosine metric. The model is evaluated on top-1, top-3 and top-5 accuracy: a prediction counts as correct at top k when the exact expected product reference appears among the model's k highest-ranked proposals.

| Top 1 Accuracy | Top 3 Accuracy | Top 5 Accuracy |
|----------------|----------------|----------------|
| 45%            | 60%            | 69%            |
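
For reference, the top-k accuracy described above can be computed as in the following sketch. The function name `top_k_accuracy` and the toy data are illustrative assumptions, not the repository's evaluation code or its results.

```python
def top_k_accuracy(ranked_predictions, expected, k):
    """Fraction of queries whose expected code is among the first k predictions.

    ranked_predictions: one list of product codes per query, best match first.
    expected: the true product code for each query, in the same order.
    """
    if not expected:
        raise ValueError("expected must contain at least one label")
    if len(ranked_predictions) != len(expected):
        raise ValueError("one ranked list is required per expected label")
    hits = sum(
        1 for ranked, truth in zip(ranked_predictions, expected)
        if truth in ranked[:k]
    )
    return hits / len(expected)


# Toy example with three queries whose true reference is always "A".
predictions = [["A", "B", "C"], ["B", "A", "C"], ["C", "B", "A"]]
truth = ["A", "A", "A"]
for k in (1, 2, 3):
    print(k, top_k_accuracy(predictions, truth, k))
```

Because a hit at top 1 is also a hit at top 3 and top 5, the three figures can only increase with k, which matches the table.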

We designed our solution with a product-oriented vision, keeping in mind the final utility for the client. This product can typically be integrated into a mobile application so that the client can obtain the exact reference of a scanned product in-store using their phone, or so that the seller can perform inventory by scanning items in the shop.

Presentation

The presentation used for our defense can be found at this link
