Original VinVL visual backbone with simplified APIs for extracting features, boxes, and object detections in a few lines of code. This repo is based on microsoft/scene_graph_benchmark; please refer to that repo for further information about the benchmark.
pip install git+https://github.com/Mahmood-Anaam/VinVL.git
Or clone the repository and install VinVL in editable mode (the leading ! and % are Jupyter/Colab notebook syntax; drop them in a regular shell):
!git clone https://github.com/Mahmood-Anaam/VinVL.git
%cd VinVL
!pip install -e .
Create a conda environment for GPU:
conda env create -f environment.yml
conda activate sg_benchmark
!git clone https://github.com/Mahmood-Anaam/VinVL.git
%cd VinVL
!pip install -e .
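After installation, a quick import check can catch a broken environment before running the extractor. The following is a minimal sketch, not part of the repo: the helper name `is_installed` is hypothetical, the package name `vinvl` is taken from the import statement used in this README, and `torch` is listed on the assumption that the underlying benchmark code depends on PyTorch.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if a top-level package can be found on the import path."""
    return importlib.util.find_spec(package) is not None


# "vinvl" comes from the import used in this README; "torch" is an assumed dependency.
for name in ("vinvl", "torch"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If either package is reported as missing, re-run the install step inside the activated environment.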
from PIL import Image
import requests
from vinvl.scene_graph_benchmark.wrappers import VinVLVisualBackbone
img_url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(img_url, stream=True).raw)
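# Build the extractor (assumption: the default constructor arguments are sufficient)
feature_extractor = VinVLVisualBackbone()
# The input can be a file path, URL, PIL.Image, numpy array, or tensor
image_features = feature_extractor(image)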
# Returns List[dict]: one dict of extracted features per image, with keys
# "boxes", "classes", "scores", "img_feats", "spatial_features"
# Batch input: pass a list of images (input types can be mixed)
batch = [
    "http://images.cocodataset.org/val2017/000000039769.jpg",
    Image.open(requests.get("https://farm1.staticflickr.com/26/53573290_1d167223e8_z.jpg", stream=True).raw),
]
batch_features = feature_extractor(batch)
for feature in batch_features:
    print("\n", feature["classes"])
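Since each result is a dict of parallel per-detection entries, a common next step is to keep only confident detections. The sketch below is illustrative rather than part of the VinVL API: `filter_detections` and the `min_score` threshold are hypothetical names, and it assumes every value in the dict is indexable by detection position and has the same length as "scores". The sample dict is made-up data that only mimics the documented keys.

```python
def filter_detections(feature, min_score=0.5):
    """Keep only the detections whose score is at least min_score.

    Assumes every value in `feature` has one entry per detection,
    in the same order as feature["scores"].
    """
    keep = [i for i, score in enumerate(feature["scores"]) if float(score) >= min_score]
    return {key: [values[i] for i in keep] for key, values in feature.items()}


# Made-up example data using a subset of the documented keys.
example = {
    "boxes": [[0, 0, 10, 10], [5, 5, 20, 20], [1, 2, 3, 4]],
    "classes": ["cat", "remote", "couch"],
    "scores": [0.9, 0.3, 0.6],
}

filtered = filter_detections(example, min_score=0.5)
print(filtered["classes"])
```

With real output, the same call would be applied to each element of batch_features. The filtered values come back as plain Python lists, so convert them to arrays or tensors again if downstream code expects that.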