Artistic attributes are predicted with noagarcia/context-art-classification.
Visual objects are detected with jwyang/faster-rcnn.
Knowledge retrieval, based on the predicted and detected visual concepts, is handled by facebookresearch/DrQA.
Prepare the requirements, pre-trained models, and database for each of these by following the instructions in its source repo.
Then run:
python prepare_visual_concept.py
bash run_retrieve.sh
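
The two commands depend on each other, so it can help to run them from one small wrapper that stops as soon as a step fails. The following is only a sketch, not part of this repo: the function name `run_steps` and the `STEPS` list are hypothetical, and it assumes both scripts are launched from the repo root with `python` and `bash` on the PATH.

```python
import subprocess

# The two commands from this README, in the order they must run.
# (Hypothetical wrapper; the repo itself only ships the two scripts.)
STEPS = [
    ["python", "prepare_visual_concept.py"],
    ["bash", "run_retrieve.sh"],
]


def run_steps(steps):
    """Run each command in order and stop at the first failure.

    Returns the number of commands that completed successfully.
    """
    completed = 0
    for cmd in steps:
        print("Running:", " ".join(cmd))
        # check=True raises subprocess.CalledProcessError on a non-zero exit
        # code, so the retrieval step never runs on missing visual concepts.
        subprocess.run(cmd, check=True)
        completed += 1
    return completed
```

Calling `run_steps(STEPS)` from the repo root runs both steps. If `prepare_visual_concept.py` exits with an error, the exception is raised before `run_retrieve.sh` is started.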