Official Code Repository for the paper Incorporating Relative Object Positioning for Image Captioning
This repository will contain code for creating Relative Positioned Features (RPF) for images and for creating and evaluating the AoANet+VC+RPF model.
All instructions assume you are running commands from the RPF directory.
We recommend installing conda to set up a separate environment, as this project depends on specific versions of certain libraries.
For Linux and Mac
Open a terminal window in the RPF directory.
sh src/setup/install.sh "rpf"
This will create an environment called rpf (replace rpf with another name if desired) and install the libraries the project requires.
For Windows
Open a command prompt window in the RPF directory.
conda create --name rpf
conda activate rpf
This will create and activate an environment called rpf (replace rpf with another name if desired) for this project.
python src/setup/install.py
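Because the libraries are meant to be installed into a dedicated environment, it can help to confirm that the environment is active before running the install step. The snippet below is a minimal sketch, not part of this repository: it reads the `CONDA_DEFAULT_ENV` variable that conda sets on activation. The helper name `active_conda_env` is hypothetical, and `rpf` is assumed to be the environment name chosen above.

```python
import os
import sys


def active_conda_env(environ=os.environ):
    """Return the name of the active conda environment, or None if none is active."""
    # conda sets CONDA_DEFAULT_ENV when an environment is activated.
    return environ.get("CONDA_DEFAULT_ENV")


if __name__ == "__main__":
    expected = "rpf"  # assumption: the default environment name used in this README
    name = active_conda_env()
    if name == expected:
        print(f"Conda environment '{name}' is active; interpreter: {sys.executable}")
    else:
        print(f"Expected conda environment '{expected}', found: {name!r}")
```

If the reported name differs, run `conda activate rpf` (or your chosen name) and try again.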
We used the Karpathy split of the MS-COCO 2014 dataset, which can be downloaded from here or by running download-coco.py with the command below. If you are using Colab, we recommend running the corresponding cell in the IPython notebook (RPF_train_colab.ipynb) instead. This downloads the full dataset, but the code uses only the Karpathy split.
python3 src/setup/download-coco.py
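To illustrate what "using only the Karpathy split" means in practice, here is a minimal sketch that is not taken from this repository. It assumes the commonly distributed Karpathy split file format (often named `dataset_coco.json`), in which each entry of `images` carries a `split` field with one of `train`, `restval`, `val`, or `test`. The function names and the tiny in-memory example are hypothetical, and the loader in this repository may differ.

```python
from collections import Counter


def count_splits(karpathy):
    """Count images per split in a Karpathy-style annotation dict."""
    return Counter(img["split"] for img in karpathy["images"])


def select_split(karpathy, split, merge_restval=True):
    """Return the image entries belonging to one split.

    When split == "train" and merge_restval is True, "restval" images are
    included as well, which is the usual convention for this split.
    """
    wanted = {split}
    if split == "train" and merge_restval:
        wanted.add("restval")
    return [img for img in karpathy["images"] if img["split"] in wanted]


if __name__ == "__main__":
    # Tiny stand-in for the real annotation file (assumed structure).
    example = {
        "images": [
            {"filename": "a.jpg", "split": "train"},
            {"filename": "b.jpg", "split": "restval"},
            {"filename": "c.jpg", "split": "val"},
            {"filename": "d.jpg", "split": "test"},
        ]
    }
    print(dict(count_splits(example)))
    print([img["filename"] for img in select_split(example, "train")])
```

With the real file, load it using `json.load` and pass the resulting dict to the same functions.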