Welcome to the Light Vision Transformer (LightViT) repository! This project contains a from-scratch implementation of the Light Vision Transformer, aimed at efficient image recognition on devices with limited computational resources.
Light Vision Transformer (LightViT) is a streamlined and efficient adaptation of the Vision Transformer (ViT) architecture. It is designed to deliver high performance with reduced computational overhead, making it ideal for deployment on mobile and embedded systems.
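To give a rough sense of where the computational overhead of a ViT-style model comes from, the sketch below estimates how many patch tokens an image produces and how the attention matrix products scale with that count. It is an illustration only and is not taken from `vision_transformer.py`; the function names and the example sizes (224-pixel images, 16- and 32-pixel patches, embedding width 192) are assumptions chosen for the example.

```python
def num_patches(image_size: int, patch_size: int) -> int:
    """Number of non-overlapping square patches in a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    side = image_size // patch_size
    return side * side


def attention_matmul_macs(num_tokens: int, embed_dim: int) -> int:
    """Approximate multiply-adds for the two attention matrix products
    (Q @ K^T and attention @ V) in one layer, summed over all heads.
    Projection layers and softmax are deliberately ignored here."""
    return 2 * num_tokens * num_tokens * embed_dim


if __name__ == "__main__":
    # Hypothetical settings, not the repository's defaults.
    for patch in (16, 32):
        tokens = num_patches(224, patch) + 1  # +1 for a class token
        macs = attention_matmul_macs(tokens, embed_dim=192)
        print(f"patch={patch}: tokens={tokens}, attention MACs per layer={macs}")
```

Because the cost grows with the square of the token count, larger patches or smaller inputs reduce attention cost sharply, which is one of the usual levers when targeting mobile and embedded hardware.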
- Lightweight Architecture: Optimized for minimal parameter count and reduced computational complexity.
- Efficient Attention Mechanisms: Implements optimized attention mechanisms to maintain high performance while reducing overhead.
- Flexible and Modular Codebase: Easy to customize and extend for various image recognition tasks.
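As a way to reason about parameter count, the following sketch counts the learnable parameters in one standard ViT encoder block (multi-head self-attention, a two-layer MLP, and two LayerNorms, all with biases). This is the textbook ViT block, assumed here for illustration; LightViT's actual block layout may differ, and `encoder_block_params` and its arguments are hypothetical names rather than part of this repository.

```python
def encoder_block_params(embed_dim: int, mlp_ratio: float = 4.0) -> int:
    """Learnable parameters in one standard ViT encoder block.

    Assumes a fused QKV projection, an output projection, a two-layer MLP
    with hidden width embed_dim * mlp_ratio, and two LayerNorms.
    """
    d = embed_dim
    hidden = int(d * mlp_ratio)

    qkv = 3 * d * d + 3 * d          # fused query/key/value projection
    proj = d * d + d                 # attention output projection
    mlp = (d * hidden + hidden) + (hidden * d + d)  # two linear layers
    norms = 2 * (2 * d)              # scale and shift for each LayerNorm

    return qkv + proj + mlp + norms


if __name__ == "__main__":
    # Hypothetical widths for comparison, not the repository's settings.
    for width in (192, 384, 768):
        print(f"embed_dim={width}: {encoder_block_params(width):,} params per block")
```

The count is dominated by terms quadratic in the embedding width, so halving the width cuts each block to roughly a quarter of its size; reducing depth or the MLP ratio scales the total linearly.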
- vision_transformer.py: The core implementation of the Light Vision Transformer model.
- vision_transformer_colab.ipynb: A Jupyter notebook for training the LightViT model on Google Colab.