This project analyses images from the Conceptual 12M dataset using bottom-up-attention and computes the similarity percentage between the labels predicted by bottom-up-attention and the labels from the dataset. The main goal is to show how precisely Faster R-CNN with ResNet-101 can find objects and their attributes in the given dataset.
To run the project, open Main_Script.ipynb and follow the descriptions in the notebook.
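
To give a rough idea of what a similarity percentage between predicted labels and dataset labels can look like, here is a minimal sketch. It is not the code from Main_Script.ipynb: the function name `label_similarity_percentage`, the word-overlap matching rule, and the sample labels and caption are all assumptions made for illustration. It assumes the dataset side is a free-text caption and counts a predicted label as matched when every word of the label occurs in that caption.

```python
import re

# Hypothetical helper, not taken from the notebook: lowercase word tokenizer.
_WORD = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return _WORD.findall(text.lower())


def label_similarity_percentage(predicted_labels, caption):
    """Return the percentage of unique predicted labels found in the caption.

    A label counts as found when all of its words occur in the caption.
    This matching rule is an assumption; the project may use another one.
    """
    caption_words = set(tokenize(caption))
    # De-duplicate labels and drop any that contain no words at all.
    unique_labels = {tuple(tokenize(label)) for label in predicted_labels}
    unique_labels.discard(())
    if not unique_labels:
        return 0.0
    matched = sum(
        1 for words in unique_labels if all(w in caption_words for w in words)
    )
    return 100.0 * matched / len(unique_labels)


# Made-up example data: labels a detector might return, and a dataset caption.
predicted = ["dog", "grass", "red ball", "tree"]
caption = "A dog plays with a red ball on the grass."
score = label_similarity_percentage(predicted, caption)
print(f"{score:.1f}%")  # prints "75.0%": three of the four labels are found
```

Exact word matching is the simplest choice and will miss synonyms and plural forms, for example "puppy" versus "dog" or "balls" versus "ball". A stricter or more lenient rule, such as lemmatisation or embedding similarity, would give different percentages.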