google-efficientnet

Google EfficientNet for Image Classification

Predicting image classes can now be achieved without extensive training, thanks to pre-trained models such as EfficientNet and the ecosystem built around transformer-based models.

What are Transformer Models?

Introduced in the groundbreaking "Attention is All You Need" paper by Vaswani et al., transformers leverage attention mechanisms to capture complex patterns and dependencies in sequential data. Initially designed for NLP tasks, the success of transformers has inspired their application to other domains, including computer vision.

(Figure: transformer architecture)

In the context of image classification, transformers use self-attention to process an image as a sequence of patches, breaking the image into manageable pieces. This lets the model attend to relevant regions and to the relationships between patches, capturing intricate spatial patterns effectively.
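As a rough sketch of the patch idea: a 224×224 RGB image split into 16×16 patches yields a sequence of 196 patch vectors. The sizes below are illustrative (real vision transformers additionally learn an embedding for each patch):

```python
import numpy as np

# Toy 224x224 RGB "image" split into non-overlapping 16x16 patches,
# the way a ViT-style transformer sequences an image before attention.
image = np.zeros((224, 224, 3))
P = 16
patches = (
    image.reshape(224 // P, P, 224 // P, P, 3)
    .swapaxes(1, 2)               # group the two patch-grid axes together
    .reshape(-1, P * P * 3)       # flatten each patch into one vector
)
print(patches.shape)  # (196, 768): a 14x14 grid of patches, 768 values each
```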

Pre-trained EfficientNet Models

EfficientNet is itself a convolutional neural network rather than a transformer, but like transformer models it benefits from transfer learning, which lets us leverage its power without starting from scratch. Transfer learning means reusing models pre-trained on large-scale datasets: Google and Hugging Face offer pre-trained EfficientNet checkpoints that can be fine-tuned on specific image classification tasks even with relatively small datasets.
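The core of that fine-tuning pattern is to freeze the pre-trained weights and train only a new classification head sized for your own labels. A minimal sketch in plain PyTorch, where the tiny `nn.Sequential` backbone is a stand-in for the real pre-trained EfficientNet:

```python
import torch.nn as nn

# Stand-in for a pre-trained backbone; in practice you would load
# EfficientNetForImageClassification.from_pretrained(...) instead.
backbone = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)
head = nn.Linear(8, 5)  # 5 = number of classes in your own dataset

# Freeze the pre-trained weights; only the new head will be updated.
for p in backbone.parameters():
    p.requires_grad = False

model = nn.Sequential(backbone, head)
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(trainable)  # only the head's weight and bias remain trainable
```

An optimizer built over `model.parameters()` will then only update the head, which is why fine-tuning works even on small datasets.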


Image Classification using EfficientNet

Installing the Transformers Library

Since Google Colaboratory does not have the Transformers library pre-installed, we need to install it first:

!pip install -q datasets transformers

Loading the EfficientNet Model

We load the pre-trained model and its image processor from Hugging Face's model hub and run a forward pass on an input image:

```python
from PIL import Image
import torch
from transformers import AutoImageProcessor, EfficientNetForImageClassification

image = Image.open("path/to/image.jpg")  # placeholder path: use your own image

image_processor = AutoImageProcessor.from_pretrained("google/efficientnet-b7")
model = EfficientNetForImageClassification.from_pretrained("google/efficientnet-b7")
inputs = image_processor(image, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
```

The model returns raw logits, one per class; the predicted class is the index with the highest logit, which `model.config.id2label` maps to a human-readable label.
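A minimal sketch of that decoding step with toy values (the three-class `id2label` mapping below is hypothetical; the real model exposes `model.config.id2label` over the ImageNet classes):

```python
import torch

# Toy logits standing in for model(**inputs).logits
logits = torch.tensor([[0.1, 2.5, -0.3]])
id2label = {0: "cat", 1: "dog", 2: "bird"}  # hypothetical class mapping

predicted_idx = logits.argmax(-1).item()  # index of the highest logit
print(id2label[predicted_idx])  # dog
```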

Reference Links
