This repository contains optimized implementations of Convolutional Neural Network (CNN), Transformer, and Vision Transformer (ViT) models.
This project focuses on optimizing Vision Transformer training with GPU acceleration and multi-threading techniques. It provides implementations of popular deep learning models, including Convolutional Neural Networks (CNN), the Transformer, and a customized version of the Vision Transformer (ViT) tailored for improved performance.
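As an illustration of those two optimizations, the minimal PyTorch sketch below moves the model and data onto a CUDA device and parallelizes data loading with a multi-worker DataLoader. The dataset, model, batch size, and worker count are illustrative placeholders, not values taken from this repository's scripts.

```python
# Minimal sketch of the two optimizations this project targets:
# (1) running the model and batches on a CUDA device,
# (2) parallelized data loading via DataLoader worker processes.
import torch
from torch.utils.data import DataLoader
import torchvision
import torchvision.transforms as transforms

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True,
    transform=transforms.ToTensor(),
)

# num_workers > 0 loads batches in parallel worker processes;
# pin_memory speeds up host-to-GPU transfers when CUDA is used.
train_loader = DataLoader(
    train_set, batch_size=128, shuffle=True,
    num_workers=4, pin_memory=True,
)

# Placeholder model; the repository's own scripts define CNN/Transformer/ViT models.
model = torchvision.models.resnet18(num_classes=10).to(device)

for images, labels in train_loader:
    images, labels = images.to(device), labels.to(device)
    # forward/backward pass would go here in a full training loop
    break
```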
- CNN: Implementation of Convolutional Neural Networks.
- Transformer: Implementation of the Transformer model.
- ViT: Customized version of the Vision Transformer (ViT) model, based on the vision-transformers-cifar10 repository.
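The customized ViT itself is not reproduced here; as a reference point, the sketch below shows only the standard patch-embedding step that ViT variants (including the vision-transformers-cifar10 baseline) apply to 32x32 CIFAR-10 images. The patch size and embedding dimension are illustrative values, not this repository's settings.

```python
# Illustrative ViT patch embedding for 32x32 CIFAR-10 images.
# patch_size=4 and embed_dim=192 are example values only.
import torch
import torch.nn as nn

class PatchEmbed(nn.Module):
    def __init__(self, img_size=32, patch_size=4, in_chans=3, embed_dim=192):
        super().__init__()
        self.num_patches = (img_size // patch_size) ** 2  # 64 patches here
        # A strided convolution splits the image into patches and projects
        # each patch to the embedding dimension in one step.
        self.proj = nn.Conv2d(in_chans, embed_dim,
                              kernel_size=patch_size, stride=patch_size)

    def forward(self, x):
        x = self.proj(x)                     # (B, embed_dim, 8, 8)
        return x.flatten(2).transpose(1, 2)  # (B, 64, embed_dim)

tokens = PatchEmbed()(torch.randn(2, 3, 32, 32))
print(tokens.shape)  # torch.Size([2, 64, 192])
```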
- Python (>=3.6)
- Anaconda 3
- PyTorch
- CUDA-enabled GPU (for GPU acceleration)
- Clone this repository:
git clone https://github.com/jonledet/vision-transformer.git
- Create and activate a new Anaconda environment:
conda create --name your-env-name python=3.6
conda activate your-env-name
- Install dependencies:
pip install -r requirements.txt
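Because a CUDA-enabled GPU is listed as a requirement, a quick sanity check before running the training scripts can confirm that the installed PyTorch build actually sees the GPU. This snippet is a suggestion, not part of the repository:

```python
# Check that PyTorch can see the CUDA GPU required for GPU-accelerated training.
import torch

if torch.cuda.is_available():
    print("CUDA device:", torch.cuda.get_device_name(0))
else:
    print("No CUDA device visible; training will not be GPU-accelerated.")
```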
- To run the models, execute the corresponding Python script:
python cnn.py
python transformer.py
python vit.py
- The Vision Transformer (ViT) model is based on the work from the vision-transformers-cifar10 repository.
- Thanks to Jonathan Ledet (https://github.com/jonledet) and his Vision Transformer repository: https://github.com/jonledet/vision-transformer
This project is licensed under the MIT License.