Smart Product Pricing Challenge - ML Challenge 2025

This repository contains the solution for the ML Challenge 2025, a competition focused on predicting e-commerce product prices from their catalog descriptions and images. The solution uses a two-stage, multimodal deep learning approach.


🚀 Final Result

  • Best Validation SMAPE Score: 73.9450% (lower is better)
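The challenge's exact scoring code is not restated in this repository; the standard percentage form of SMAPE, which the score above appears to follow, can be sketched as:

```python
import numpy as np

def smape(y_true, y_pred):
    """Symmetric Mean Absolute Percentage Error, in percent (0 = perfect)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    denom = np.abs(y_true) + np.abs(y_pred)
    # Guard against 0/0 when both the target and the prediction are zero.
    denom = np.where(denom == 0, 1.0, denom)
    return 100.0 * np.mean(2.0 * np.abs(y_pred - y_true) / denom)
```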

🧠 Methodology Overview

To handle the large dataset and the complexity of training multimodal models within a hackathon timeline, a Two-Stage Pre-computation Strategy was implemented. This approach decouples the slow feature extraction from the fast model training, allowing for rapid iteration.

Stage 1: Pre-computing Embeddings

The first stage involves using large, pre-trained deep learning models as feature extractors. These models are run only once to process the entire dataset and save the resulting high-dimensional feature vectors (embeddings).

  • Text Feature Extraction: A pre-trained distilbert-base-uncased model from the Hugging Face transformers library was used to convert product descriptions (catalog_content) into 768-dimensional text embeddings.
  • Image Feature Extraction: A pre-trained efficientnet_b0 model from the timm library was used to convert product images into 1280-dimensional image embeddings.

This process was executed in a memory-efficient manner by processing data in batches and saving each batch's embeddings directly to disk, preventing RAM crashes in the Colab environment.
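The batch-and-save pattern described above can be sketched with a small helper. The helper itself is generic; the encoder callables and the on-disk naming scheme shown here are illustrative assumptions, not the notebook's exact code.

```python
import os
import numpy as np

def save_embeddings_in_batches(encode_fn, items, out_dir, batch_size=64):
    """Encode `items` in chunks and write each chunk straight to disk,
    so only one batch of embeddings is ever held in RAM at a time."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        emb = np.asarray(encode_fn(batch))  # shape: (len(batch), dim)
        path = os.path.join(out_dir, f"batch_{start // batch_size:05d}.npy")
        np.save(path, emb)
        paths.append(path)
    return paths

# Illustrative usage (not run here): `encode_fn` could wrap DistilBERT, e.g.
#   tokens = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
#   emb = model(**tokens).last_hidden_state[:, 0]  # first-token embedding
# or an EfficientNet-B0 feature extractor from timm, e.g.
#   timm.create_model("efficientnet_b0", pretrained=True, num_classes=0)
```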

Stage 2: Training a Lightweight Regression Head

The second stage involves training a small, fast neural network on the pre-computed embeddings.

  • Input: The text and image embeddings from Stage 1 are concatenated to form a single 2048-dimensional feature vector for each product (768 text + 1280 image).
  • Model: A simple feed-forward neural network (Regression Head) with two hidden layers, BatchNorm, and Dropout was trained to map these features to the final price prediction.
  • Speed: Training completes in just a few minutes on a GPU, which allows extensive experimentation with hyperparameters such as learning rate and model architecture.
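The regression head described above can be sketched as follows. The README only specifies two hidden layers with BatchNorm and Dropout, so the hidden sizes (512, 128) and dropout rate here are assumptions:

```python
import torch
import torch.nn as nn

class RegressionHead(nn.Module):
    """Small MLP mapping concatenated (text + image) embeddings to a price."""

    def __init__(self, in_dim=768 + 1280, p_drop=0.3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 512), nn.BatchNorm1d(512), nn.ReLU(), nn.Dropout(p_drop),
            nn.Linear(512, 128), nn.BatchNorm1d(128), nn.ReLU(), nn.Dropout(p_drop),
            nn.Linear(128, 1),
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)

# The input is the concatenation of the two pre-computed embeddings:
#   x = torch.cat([text_emb, image_emb], dim=1)  # shape: (batch, 2048)
```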

📁 Repository Structure

.
├── ML_Challenge_2025/
│   ├── dataset/
│   │   ├── train.csv           # Training data
│   │   ├── test.csv            # Test data
│   │   └── images/             # Downloaded product images
│   │
│   ├── embeddings_batched/
│   │   ├── train_text/         # Saved text embeddings for training set
│   │   ├── train_image/        # Saved image embeddings for training set
│   │   ├── test_text/          # Saved text embeddings for test set
│   │   └── test_image/         # Saved image embeddings for test set
│   │
│   ├── ML_Challange.ipynb      # Main Colab notebook with all code
│   ├── fast_regression_model.pth # Saved weights of the trained model
│   └── test_out.csv            # Final submission file
│
└── README.md                   # You are here

⚙️ How to Run

This project was developed in Google Colab using a GPU runtime.

  1. Setup Google Drive:

    • Create a folder named ML_Challenge_2025 in your Google Drive.
    • Inside it, create a dataset folder and upload train.csv and test.csv.
    • Run an image downloader script to populate the dataset/images/ folder.
  2. Part 1 - Generate Embeddings:

    • Open the ML_Challange.ipynb notebook in Google Colab and set the runtime to GPU.
    • Run the "Part 1" code cells. This will process all text and images and save the embeddings into the embeddings_batched folder in your Drive. (Note: This is the slow part).
  3. Part 2 - Train and Predict:

    • Once Part 1 is complete, run the "Part 2" code cells in the same notebook.
    • This will load the saved embeddings, train the fast regression model, and generate the final test_out.csv file in your project directory.
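The image downloader mentioned in step 1 is not included in the repository; a minimal sketch follows. The `(product_id, url)` row format and `.jpg` extension are assumptions about the dataset, and the `fetch` parameter exists only so the network call can be swapped out or tested.

```python
import os
import urllib.request

def download_images(rows, out_dir, fetch=None):
    """Download each (product_id, url) pair into out_dir as <product_id>.jpg.
    Skips files that already exist and broken links, returns the count saved."""
    if fetch is None:
        fetch = lambda url: urllib.request.urlopen(url, timeout=10).read()
    os.makedirs(out_dir, exist_ok=True)
    saved = 0
    for product_id, url in rows:
        path = os.path.join(out_dir, f"{product_id}.jpg")
        if os.path.exists(path):
            continue
        try:
            data = fetch(url)
        except Exception:
            continue  # skip broken links rather than abort the whole run
        with open(path, "wb") as f:
            f.write(data)
        saved += 1
    return saved
```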

🛠️ Key Libraries Used

  • PyTorch: Core deep learning framework.
  • Transformers (Hugging Face): For loading the DistilBERT text model.
  • timm (PyTorch Image Models): For loading the EfficientNet-B0 image model.
  • Pandas: For data manipulation.
  • Scikit-learn: For data splitting.
  • Pillow: For image processing.
