This repository contains C++ code for our CVPR 2018 paper "Active Fixation Control to Predict Saccade Sequences". A Python version of the code is available here. Note that the Python version uses a slightly different foveation algorithm. If you want to reproduce the results in the paper, please use the C++ version.
Given a single input image, STAR-FC will produce a sequence of fixation locations (a list of x,y coordinates).
Qualitative results on images used in Yarbus' free-viewing experiments showing (from left to right) human fixations, fixations produced by STAR-FC and SALICON. See our paper "A Focus on Selection for Fixation" for more examples.
If you find our work useful in your research, please consider citing:
@InProceedings{Wloka_CVPR18,
author = {Wloka, Calden and Kotseruba, Iuliia and Tsotsos, J. K.},
title = {Active Fixation Control to Predict Saccade Sequences},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2018}
}
STAR-FC requires an NVIDIA GPU with at least 5GB of memory available (tested on Titan X and GeForce GTX 1080 Ti).
Install nvidia-docker following the instructions in the official repository.
Build the Docker container using the scripts provided in the docker_scripts folder (building OpenCV 3 with CUDA support and Caffe with its dependencies may take up to 1 hour):
sh docker_scripts/build
In the following instructions, $STAR_FC_ROOT refers to the location of the STAR_FC root directory.
Clone STAR_FC repository.
Install dependencies via apt-get:
apt-get update && apt-get install \
build-essential \
autoconf \
automake \
libtool \
cmake \
curl \
libgtk2.0-dev \
pkg-config \
libavcodec-dev \
libavformat-dev \
libswscale-dev \
python3-pip \
python3-dev \
python3-setuptools \
python3-numpy \
libleveldb-dev \
libsnappy-dev \
libhdf5-serial-dev \
libgtest-dev \
libfftw3-dev \
libboost-all-dev \
libgoogle-glog-dev \
libmatio-dev \
libatlas-base-dev \
liblmdb-dev \
libxext-dev \
libx11-dev \
x11proto-gl-dev \
dbus \
wget \
unzip \
dh-autoreconf
pip3 install scikit-image
Download OpenCV 3.3 and build it with CUDA support enabled (necessary for running the AIM saliency algorithm in reasonable time).
wget -O opencv.zip https://github.com/opencv/opencv/archive/3.3.0.zip
unzip opencv.zip
mkdir opencv-3.3.0/build
cd opencv-3.3.0/build
cmake -DCMAKE_BUILD_TYPE=RELEASE \
-DWITH_GTK=OFF -DWITH_GTK_2_X=OFF -DWITH_QT=ON \
-DBUILD_JPEG=ON \
-DCMAKE_INSTALL_PREFIX=/usr/local \
-DINSTALL_PYTHON_EXAMPLES=OFF \
-DINSTALL_C_EXAMPLES=OFF \
-DWITH_CUDA=ON \
-DBUILD_EXAMPLES=OFF .. && make -j8 && make install
Install protobuf version 3.1 (other versions do not seem to work with SALICON):
wget -O protobuf.zip https://github.com/google/protobuf/archive/v3.1.0.zip
unzip protobuf.zip
cd protobuf-3.1.0
./autogen.sh
./configure
make
make install
ldconfig
pip3 install --user --upgrade protobuf==3.1.0
Install CUDA 8.0, CuDNN 5.0 for CUDA 8.0 (installation instructions) and Caffe.
Download and build Caffe rc4 inside $STAR_FC_ROOT/contrib/caffe. Note: Caffe MUST be built with Python 3 support, OpenCV 3 and cuDNN (see the provided Makefile.config):
wget -O caffe.zip https://github.com/BVLC/caffe/archive/rc4.zip
unzip caffe.zip
mkdir -p $STAR_FC_ROOT/contrib/caffe
mv caffe-rc4/* $STAR_FC_ROOT/contrib/caffe/
cd $STAR_FC_ROOT/contrib/caffe/
make -j8 && make pycaffe
Download the inih library for parsing INI files and place it in $STAR_FC_ROOT/contrib/inih:
cd $STAR_FC_ROOT/contrib
git clone https://github.com/benhoyt/inih.git inih
Download OpenSALICON and the associated caffemodel files and place them in the $STAR_FC_ROOT/contrib/OpenSALICON folder:
cd $STAR_FC_ROOT/contrib
git clone https://github.com/CLT29/OpenSALICON.git OpenSALICON
wget http://www.cs.pitt.edu/%7Echris/files/2016/model_files.tgz
tar -zxvf model_files.tgz
cp model_files/salicon_osie.caffemodel OpenSALICON/
rm -rf model_files && rm model_files.tgz
To use BMS and VOCUS2 for peripheral saliency, download code from their respective repositories and place them in contrib.
cd $STAR_FC_ROOT/contrib
wget http://cs-people.bu.edu/jmzhang/BMS/BMS_v2-mex.zip && unzip BMS_v2-mex.zip -d BMS_v2
git clone https://github.com/GeeeG/VOCUS2.git VOCUS2
Compile STAR-FC:
cd $STAR_FC_ROOT
cmake -DWITH_SALICON=ON .
make
Add caffe and SALICON to PYTHONPATH:
export PYTHONPATH=$STAR_FC_ROOT/contrib/caffe/python/:$STAR_FC_ROOT/contrib/OpenSALICON/:$PYTHONPATH
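A quick way to confirm that the PYTHONPATH above is picked up is to ask Python whether it can locate the two packages. The snippet below is only a sanity-check sketch: `caffe` is the pycaffe package name, while `Salicon` is assumed here to be the module name exposed by OpenSALICON, so adjust it if your checkout differs.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that Python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "Salicon" is an assumed module name for OpenSALICON (not stated in this README).
    missing = missing_modules(["caffe", "Salicon"])
    print("missing:", missing if missing else "none")
```

If either name is reported as missing, re-check the two paths in the export line before running STAR-FC with SALICON.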
If STAR-FC was built using the recommended Dockerfile, use the following command:
sh docker_scripts/run -d -c <config_file> --input_dir <input_dir> --output_dir <output_dir>
-d is an optional flag that displays the input image with overlaid fixations after each step.
If you built STAR-FC without Docker, use the following command:
./STAR_FC --display --configFile <config_file> --inputDir <input_dir> --outputDir <output_dir>
--display is an optional flag that displays the input image with overlaid fixations after each step.
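When STAR-FC is driven from a script (for example, to loop over several config files), it can help to assemble the non-Docker command line programmatically. The helper below is a hypothetical convenience wrapper, not part of this repository; it only uses the flags documented above, and the `output` directory in the usage line is a placeholder.

```python
import shlex


def star_fc_command(config_file, input_dir, output_dir,
                    display=False, binary="./STAR_FC"):
    """Build the argument list for a non-Docker STAR_FC run."""
    argv = [binary]
    if display:
        # Optional: show the input image with overlaid fixations after each step.
        argv.append("--display")
    argv += ["--configFile", config_file,
             "--inputDir", input_dir,
             "--outputDir", output_dir]
    return argv


if __name__ == "__main__":
    cmd = star_fc_command("config_files/test.ini", "images", "output")
    # Print a shell-quoted version; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Returning a list rather than a single string keeps paths with spaces intact when the command is passed to `subprocess.run`.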
Sample config files can be found in the config_files directory. For example, test.ini runs STAR-FC on several images from the images directory, and cat2k.ini contains the parameters that we used for the experiments reported in the CVPR paper.
To run STAR-FC on the CAT2000 dataset, first download the train images with fixations of 18 observers (http://saliency.mit.edu/trainSet.zip) and update CAT2K_ROOT in run_cat2k.sh with the location of the dataset. Also set STAR_FC_ROOT and OUTPUT_DIR in the script, then run it with the cat2k.ini config file as the argument:
sh run_cat2k.sh config_files/cat2k.ini
We use the standard .ini config file format (https://en.wikipedia.org/wiki/INI_file).
The following parameters for STAR_FC can be set in the .ini file.
Define which saliency algorithm to run in the peripheral attentional field, the sizes of the central and peripheral fields, and the size of the inhibition of return and its decay rate:
[attention_map_params]
BUSalAlgorithm = <algorithm> (one of AIM, BMS, VOCUS)
pgain = <float> (an optional parameter to enhance the response of the saliency algorithm in the periphery, default 1.0)
blendingStrategy = <int> (different strategies for blending the central and peripheral fields: 1 (SAR), 2 (MCA) or 3 (WCA),
see details in the paper, default 1)
pSizeDeg = <float> (size of the peripheral field in degrees, default 9.5)
cSizeDeg = <float> (size of the central field in degrees, default 9.6)
iorSizeDeg = <float> (size of the inhibition of return in degrees, default 1.5)
iorDecayRate = 100 (decay rate of the inhibition of return)
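To make the options above concrete, the sketch below reads an [attention_map_params] section with Python's standard configparser and applies the defaults listed in this README. It is an illustration only: STAR-FC itself is C++ and parses its config with inih, and the function name and the validation rules here are assumptions rather than the project's own checks.

```python
import configparser

ALGORITHMS = {"AIM", "BMS", "VOCUS"}
BLENDING = {1: "SAR", 2: "MCA", 3: "WCA"}

EXAMPLE = """
[attention_map_params]
BUSalAlgorithm = AIM
blendingStrategy = 2
iorDecayRate = 100
"""


def read_attention_params(ini_text):
    """Parse [attention_map_params], filling in the defaults listed in the README."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep camelCase option names as written
    parser.read_string(ini_text)
    section = parser["attention_map_params"]

    algorithm = section.get("BUSalAlgorithm")
    if algorithm not in ALGORITHMS:
        raise ValueError(f"BUSalAlgorithm must be one of {sorted(ALGORITHMS)}")
    blending = section.getint("blendingStrategy", fallback=1)
    if blending not in BLENDING:
        raise ValueError("blendingStrategy must be 1 (SAR), 2 (MCA) or 3 (WCA)")

    return {
        "BUSalAlgorithm": algorithm,
        "pgain": section.getfloat("pgain", fallback=1.0),
        "blendingStrategy": BLENDING[blending],
        "pSizeDeg": section.getfloat("pSizeDeg", fallback=9.5),
        "cSizeDeg": section.getfloat("cSizeDeg", fallback=9.6),
        "iorSizeDeg": section.getfloat("iorSizeDeg", fallback=1.5),
    }


if __name__ == "__main__":
    print(read_attention_params(EXAMPLE))
```

Note that the parenthesized descriptions in the parameter list above are documentation, not part of the file; an actual config contains only the `name = value` pairs.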
Define the size of the stimuli, the distance from the stimuli and the maximum number of fixations to generate.
Note: provide either the resolution (pix2deg) or the size of the stimuli in degrees (inputSizeDeg), but not both.
[viewing_params]
pix2deg = <int> (number of pixels per degree)
inputSizeDeg = <int> (size of stimuli in degrees)
viewDist = <float> (distance from the stimuli in meters)
maxNumFixations = <int> (number of fixations per stimulus)
paddingR = <int> between 0 and 255 (padding color RGB values, if not set, the average of the image will be used)
paddingG = <int> between 0 and 255 (for the CAT2000 dataset use (125, 125, 125))
paddingB = <int> between 0 and 255
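The two alternative settings, pix2deg and inputSizeDeg, describe the same quantity from different directions. The sketch below shows the standard visual-angle geometry and a simple linear pixels-per-degree estimate. How STAR-FC combines these values internally, and which image dimension inputSizeDeg refers to, is not specified here, so the use of image width and all function names are assumptions made for illustration.

```python
import math


def visual_angle_deg(size_m, view_dist_m):
    """Visual angle (degrees) subtended by a stimulus of size_m seen from view_dist_m."""
    return math.degrees(2.0 * math.atan(size_m / (2.0 * view_dist_m)))


def pixels_per_degree(size_px, size_deg):
    """Average number of pixels per degree across the stimulus (linear approximation)."""
    return size_px / size_deg


def resolve_pix2deg(image_width_px, pix2deg=None, input_size_deg=None):
    """Return pixels per degree, requiring exactly one of the two settings."""
    if (pix2deg is None) == (input_size_deg is None):
        raise ValueError("set exactly one of pix2deg or inputSizeDeg")
    if pix2deg is not None:
        return float(pix2deg)
    # Assumption: inputSizeDeg is taken to describe the image width.
    return pixels_per_degree(image_width_px, input_size_deg)


if __name__ == "__main__":
    # A 1920-pixel-wide stimulus spanning 40 degrees gives 48 pixels per degree.
    print(resolve_pix2deg(1920, input_size_deg=40))
```

Requiring exactly one of the two values mirrors the note above and avoids silently preferring one setting when both are present.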
Define what output from STAR_FC should be saved and where:
[log_params]
saveFix = <option> (on or off, save fixations as .mat file)
saveScreen = <option> (on or off, save screenshots of the STAR_FC app after each fixation as .png file)
fixLogName = <filename_prefix> (prefix of the fixation .mat files, default fixationList)
partNumFixations = <int> (number of fixations before saving partial results, e.g. if it's 3, fixationList3.mat, fixationList6.mat and so on will be saved)
overwrite = <option> (1 or 0, whether to overwrite the existing results for a given stimulus or skip it)
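The interaction between fixLogName, partNumFixations and maxNumFixations can be sketched as follows. This is a hypothetical helper that only reproduces the naming pattern described above (fixationList3.mat, fixationList6.mat and so on); whether STAR-FC also writes a final file when maxNumFixations is not a multiple of partNumFixations is not stated here, so the sketch lists only the exact multiples.

```python
def partial_log_names(max_num_fixations, part_num_fixations, prefix="fixationList"):
    """List the partial-result file names for the given logging settings."""
    if part_num_fixations <= 0:
        raise ValueError("partNumFixations must be a positive integer")
    return [
        f"{prefix}{n}.mat"
        for n in range(part_num_fixations, max_num_fixations + 1, part_num_fixations)
    ]


if __name__ == "__main__":
    # maxNumFixations = 10, partNumFixations = 3
    print(partial_log_names(10, 3))
    # ['fixationList3.mat', 'fixationList6.mat', 'fixationList9.mat']
```

A listing like this is handy when checking an output directory for stimuli that were only partially processed.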
- Calden Wloka - theory, original C++ implementation for TarzaNN
- Yulia Kotseruba - current C++ version of the code
with a special thanks to Toni Kunic for technical support.
Contact either Calden (calden at cse.yorku.ca) or Yulia (yulia_k at cse.yorku.ca) should you have any questions about using or installing STAR-FC.