This tool allows to make free-hand image annotations, useful in tasks such as gathering ground-truth segmentation masks or learning human salient areas. It was initially designed and implemented by Lucas Parzianello for his work on fractional iris recognition. It has been extended later to research on iris presentation attack detection and human salience-guided training.
- Dockerized application for simple deployment
- PostgreSQL DB <=> Django + Gunicorn + Nginx web server <= REST API => Vue-based SPA + Vuex
- Django Whitenoise to serve static files, CDN Ready
- Annotations stored in relational database
- Access control / user management
- Vuex handles state management and persistance to never lose annotations on the front-end
Before getting started you should have the following installed and running:
- Docker >= v19
- Docker Compose >= v1.25
Data upload via web interface if not possible yet, so the data needs to be mounted inside the container.
If you have the images in the same machine, just put them in the expected location data/dataset/
by creating a symbolic link (below) or just moving your data.
ln -s $MY_DATASET_LOCATION $(pwd)/data/dataset
Your data/dataset/
directory should contain:
- A directory named
images
with all the image files. - A file named
metadata.csv
with the corresponding metadata.
If your dataset is remote (cloud or another computer), you might want to start using dvc
. Check the Integrating DVC session below.
# copy all example dotenv files
sudo apt install mmv
mmv -c 'env/*.env.example' 'env/#1.env'
# edit all env/*.env files setting the following:
# DJANGO_STATIC_HOST
# SECRET_KEY
# DB_PASS
# POSTGRES_PASSWORD (same as DB_PASS)
find env -name "*.env" -exec nano {} \;
# create the public network
docker network create net-nginx-proxy
# build docker images and run containers
docker-compose up
# from another terminal, run the database migrations
docker-compose exec web pipenv run /app/manage.py makemigrations
docker-compose exec web pipenv run /app/manage.py migrate
# create django superuser
docker-compose exec web pipenv run /app/manage.py createsuperuser
# access localhost:80 in your browser
docker-compose exec web /bin/bash
docker-compose exec nginx /bin/sh
docker-compose exec db psql --username eyetagger_admin --dbname eyetagger
More PostgreSQL commands:
\h # help
\q # quit
\l # list databases
\d # list tables / relations
\d api_annotation # describe a table / relation
# run a query - don't forget the semicolon:
SELECT id, annotator_id, image_id FROM api_annotation;
Feature | Default location | Comment |
---|---|---|
Django REST Framework | http://localhost/api | Only available in development mode (i.e. DEBUG=True in env/django_app.env ) |
Django Administration Panel | http://localhost/api/admin | Credentials created with pipenv run ./manage.py createsuperuser |
Location from project root | Contents |
---|---|
backend/ |
Django Project & Backend Config |
backend/api/ |
Django App for REST api |
data/ |
Git-ignored: DB + backups |
deploy/ |
Scripts and configuration files |
dist/ |
Git-ignored: back+front generated files |
env/ |
Environment Files |
public/ |
Static Assets |
src/ |
Vue App |
To run it once:
# docker-compose up db # if db container is not running
docker-compose exec db pg_dump -U eyetagger_admin eyetagger | \
gzip > eyetagger_bkp_$(date +"%Y_%m_%d_%I_%M_%p").sql.gz
Check backups.sh for a simple automated version.
Tip: you can add the existing
backups.sh
to yourcrontab -e
for periodic backups:
To run it every 6 hours:
0 */6 * * * /eyetagger/backups.sh >> /eyetagger/data/logs/backups.log 2>&1
Or every business day (Mon-Fri) at 6pm:
0 18 * * 1-5 /eyetagger/backups.sh >> /eyetagger/data/logs/backups.log 2>&1
# replace $YOUR_DUMP_GZ by your .gz location:
# let's copy the backup before moving/modifying it
cp $YOUR_DUMP_GZ /tmp/dump.sql.gz
# extract the dump
gunzip -k /tmp/dump.sql.gz
# copy to the running DB container
# docker-compose up db # if db container is not running
docker cp /tmp/dump.sql eyetagger_db_1:/dump.sql
# create a new empty database
docker-compose exec db createdb -U eyetagger_admin -T template0 eyetagger_new
# populate the empty database with the dump
docker-compose exec db psql -U eyetagger_admin -d eyetagger_new -f /dump.sql
# swap database names
docker-compose exec db psql --username eyetagger_admin --dbname postgres
\l
ALTER DATABASE eyetagger RENAME TO eyetagger_old;
ALTER DATABASE eyetagger_new RENAME TO eyetagger;
\l
\q
# get the other services up and try it out!
docker-compose down && docker-compose up
# if successful, clean the temporary backup copies
rm /tmp/dump.sql.gz /tmp/dump.sql
-
There are 2 entries
command
underdocker-compose.yaml
> Serviceweb
. Select the "development" one by commenting out the alternative. -
Run
docker-compose up
(rundown
first if already up) and openlocalhost:9000
. Hot reload should be enabled i.e. live changes to the front-end code will update the browser.
- Adapt the environment files for the backend in
env/
. - Adapt the environment file for the frontend in
vue.config.js
. - Follow the Django deployment checklist for further configuration.
- Deploy the dockerized application in a remote server by running it in daemon form:
docker-compose up -d && docker-compose logs -f
.
-
Install
dvc
on hostpip install dvc
-
Setup access (using a GCP below)
# get provider-specific api pip install 'dvc[gs]' # create google bucket credentials mkdir -p $HOME/.gcp/ GOOGLE_APPLICATION_CREDENTIALS=$HOME/.gcp/iris-admin.json # paste the contents of the GCP JSON in this file # see https://cloud.google.com/docs/authentication/getting-started" nano $GOOGLE_APPLICATION_CREDENTIALS chmod 400 $GOOGLE_APPLICATION_CREDENTIALS export GOOGLE_APPLICATION_CREDENTIALS echo -e ' >> Add this to your ~/.bashrc:\n\n\ export GOOGLE_APPLICATION_CREDENTIALS='$GOOGLE_APPLICATION_CREDENTIALS'\n\n
-
Then get your data from the remote.
dvc pull
Or add new data to the bucket
dvc add data/dataset && dvc push
Eyetagger handles two types of data: the images - referred to as the dataset, and the metadata - stored in a relational database / db using PostgreSQL.
As described previously, the data/dataset
location expects a directory images
that contains all images in your dataset and a file metadata.csv
that stores information about the images in your dataset. Below is a directory tree illustrating the expected dataset setup:
data/dataset
├── metadata.csv
└── images
├── image1.png
├── image2.png
└── ...
Metadata is necessary to keep track of the annotations, who did them, when, and any other data attribute that might be useful for the annotation workload. The dataset is usually a set of images to be displayed during the annotation process. Below is an example of what your metadata.csv
file may look like:
filename,ground_truth
image1.png,real
image2.png,spoof
...
In order to serve a custom dataset, you will need to first A. run the app creating a database (steps 1.1-1.4 above) and then B. create the metadata entries for your dataset in PostgreSQL.
Below we describe how to do this part B by using a database migration:
-
Create a migration.
The metadata entries are created by running one or more database migrations. Let's create an empty one with:
# this assumes your containers are up, make sure to run docker-compose up first # below and onwards, "api" is the internal name of the Django app that we are working with docker-compose exec web pipenv run /app/manage.py makemigrations api --name dataset_import --empty
After this command will have a new Python file in the migrations' directory (e.g.
backend/api/migrations/####_dataset_import.py
). -
Call a new and customized migration script to ingest your dataset's metadata into the relational DB.
Change that created file to import your custom script as follows:
from backend.api.manual_migration import import_dataset # down in the Migration class, paste the following: class Migration(migrations.Migration): # ... initial = True # import_dataset is the function that will be called when you run the migration # reverse_code is the function that will be called when you rollback the migration, using a "no-op" function below operations = [ migrations.RunPython(import_dataset, reverse_code=migrations.RunPython.noop) ] # ...
-
Customize this migration script and ORM models to match your dataset.
- An example of a migration script can be found in
backend/api/manual_migration.py
- you can use this as a template for your own script. - All SQL code and database transactions are handled by Django's ORM, so you don't need to know SQL to populate the database.
- The existing migration script loads a CSV file that contains metadata for each image. Because each dataset is unique, yours might have different attributes.
- The
import_dataset
function in that script loads this CSV, creates all ORM objects (e.g. theimg
variable), and saves them to the databaseimg.save()
. The other functions help with this process. - Change the Image model:
- Modify
backend/api/models.py
to fit your needs. - Run
/app/manage.py makemigrations
- this compares model.py to the database, if their schemas differ it'll generate code that describes a new migration. - Run
/app/manage.py migrate
to "run" the necessary migrations, effectively updating the database. Django keeps track of the migrations that were run.
- Modify
⚠️ The attributes of yourImage
model should be close to the columns in your CSV file. If you try to store an ORM object that deviates from the table schema, the database transaction will fail.
- An example of a migration script can be found in
-
Run the migrations.
Only the necessary (new) migrations will be run with the following command:
docker-compose exec web pipenv run ./manage.py migrate
💡 After you create (and save) some entries like
Image
objects, you will be able to see them in the Django admin panel (see dashboards above). -
Troubleshooting: when a migration goes wrong.
Errors might happen if the migration script is not correct. If so, you can reverse it with:
# change 0001 below docker-compose exec web pipenv run ./manage.py migrate api 0001
Where
0001
is the number of the previous migration (i.e. the number####
inbackend/api/migrations/####_migration_name.py
).Another way is to reset them all: see scenario 2 in this guide, our "app name" is
api
.A note about migrations that change schemas: if a migration modifies the database schema, make sure your rollback function also undoes those changes. For example, if migration
N
adds a new column to theImage
model, and you roll back toN-1
, this roll back function should also remove that column from theImage
model. Otherwise, when you runN
again, Django will try to create a column that already exists, which will fail. Because of this rollback complication, I chose to separate migrations that change the database schema (e.g. creating tables, modifying attributes) from migrations that populate the database with data (e.g. the one inmanual_migration.py
).Above are the best ways to fix migration issues and avoid corruption or data loss. But if losing data is not an issue, you can also delete the database and start over, for example:
# ⚠️ this will cause data loss docker-compose exec db dropdb -U eyetagger_admin eyetagger docker-compose exec db createdb -U eyetagger_admin eyetagger docker-compose exec web pipenv run /app/manage.py migrate
- Initial version: EyeTagger by Lucas Parzianello
- Extensions and adaptations for the Trusted AI Framework: Anna VanAvermaete and Priscila Saboia
- Documentation from Anna VanAvermaete
- The adaptations and extensions of the tool made by Priscila Saboia were supported by the U.S. Department of Defense (Contract No. W52P1J2093009). The adaptations and extensions of the tool made by Anna VanAvermaete were supported partially by the U.S. Department of Defense (Contract No. W52P1J2093009) and partially by the National Science Foundation (award No. 2237880). The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the U.S. Department of Defense, the National Science Foundation or the U.S. Government.
- Collection of SVG data and individual annotation times: Samuel Webster