Hamlet is a website that connects Zooniverse data from Panoptes with external data services, e.g. sending a camera trap project's animal-filled photos to an animal-identifying machine-learning system.
Hamlet has an export feature that ties into the Zooniverse Machine Learning Subject Assistant (app) (source), which lets project owners/researchers submit their camera trap photos to an external Machine Learning (ML) service, which in turn finds animals in those images.
The user story is as follows:
- Users start at the Subject Assistant app.
- Users are directed to Hamlet, where they choose a Subject Set to export to the external ML Service.
- Hamlet performs the export, and provides users with a link back to the Subject Assistant with an "ML Task ID", e.g. `https://subject-assistant.zooniverse.org/#/tasks/6378`
- Users click that link, and process the ML-tagged photos on the Subject Assistant app.
The Subject Assistant requires the following external systems:
- Machine Learning Service - in this case, powered by Microsoft.
- an Azure Storage Container - works in conjunction with the ML Service, which requires "subject manifest" files to be stored on Azure.
As of late 2022, these services are maintained by the Zooniverse team.
The Subject Assistant feature requires the following ENV variables to be defined:
- `SUBJECT_ASSISTANT_AZURE_ACCOUNT_NAME`
- `SUBJECT_ASSISTANT_AZURE_ACCOUNT_KEY`
- `SUBJECT_ASSISTANT_AZURE_CONTAINER_NAME`
- `SUBJECT_ASSISTANT_ML_SERVICE_CALLER_ID` - provided by our friends at Microsoft who run the ML Service.
- `SUBJECT_ASSISTANT_ML_SERVICE_URL` - ditto
Optionally, the following ENV variables can be defined:
- `SUBJECT_ASSISTANT_EXTERNAL_URL` - defaults to `http://subject-assistant.zooniverse.org/#/tasks/`
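As a point of reference, here is a minimal sketch of how these variables might be consumed in Hamlet's Django settings, assuming plain `os.environ` access (the actual settings code may differ):

```python
# settings.py (sketch): reading the Subject Assistant ENV vars.
# Variable names match the docs above; the code itself is illustrative.
import os

SUBJECT_ASSISTANT_AZURE_ACCOUNT_NAME = os.environ["SUBJECT_ASSISTANT_AZURE_ACCOUNT_NAME"]
SUBJECT_ASSISTANT_AZURE_ACCOUNT_KEY = os.environ["SUBJECT_ASSISTANT_AZURE_ACCOUNT_KEY"]
SUBJECT_ASSISTANT_AZURE_CONTAINER_NAME = os.environ["SUBJECT_ASSISTANT_AZURE_CONTAINER_NAME"]
SUBJECT_ASSISTANT_ML_SERVICE_CALLER_ID = os.environ["SUBJECT_ASSISTANT_ML_SERVICE_CALLER_ID"]
SUBJECT_ASSISTANT_ML_SERVICE_URL = os.environ["SUBJECT_ASSISTANT_ML_SERVICE_URL"]

# Optional, with the documented default:
SUBJECT_ASSISTANT_EXTERNAL_URL = os.environ.get(
    "SUBJECT_ASSISTANT_EXTERNAL_URL",
    "http://subject-assistant.zooniverse.org/#/tasks/",
)
```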
The ML Subject Assistant feature in Hamlet has two views:
- `GET /subject-assistant/<int:project_id>/` - lists all the Subject Sets for a Project, along with their "ML export" status and (if the export is successful) a link back to the Subject Assistant app.
- `POST /subject-assistant/<int:project_id>/subject-sets/<int:subject_set_id>/` - performs the ML Export action for a given Subject Set, then redirects users back to the listing page.
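For illustration, the two routes could be declared in a Django `urls.py` along these lines; the view function names are hypothetical placeholders, not Hamlet's actual ones:

```python
# urls.py (sketch): the two Subject Assistant routes described above.
# View names are hypothetical placeholders.
from django.urls import path

from . import views

urlpatterns = [
    # GET: list Subject Sets, their export statuses, and links back to the app
    path("subject-assistant/<int:project_id>/", views.subject_assistant_list),
    # POST: run the ML Export for one Subject Set, then redirect to the listing
    path(
        "subject-assistant/<int:project_id>/subject-sets/<int:subject_set_id>/",
        views.subject_assistant_export,
    ),
]
```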
The `MLSubjectAssistantExport` table has the following fields:
- `subject_set_id` - the ID of the Zooniverse Subject Set that was exported to the external ML Service
- `json` - the "subject manifest" file, in JSON format, created from all the Subjects of the Subject Set. The format is specific to the ML Service.
- `azure_url` - the URL of the "subject manifest" file that was uploaded to an external Azure storage container. (See Mechanics: ML Export Action for why)
- `ml_task_uuid` - the task request ID or "job ID" for the ML Export action. This is generated by the external ML Service.
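A plausible Django model for this table is sketched below; the field types are assumptions inferred from the descriptions above, not Hamlet's actual schema:

```python
# models.py (sketch): a plausible shape for the MLSubjectAssistantExport table.
# Field types are guesses from the field descriptions above.
from django.db import models


class MLSubjectAssistantExport(models.Model):
    subject_set_id = models.IntegerField()      # Zooniverse Subject Set that was exported
    json = models.TextField()                   # the "subject manifest", ML-Service-specific JSON
    azure_url = models.URLField(blank=True)     # shareable URL of the manifest on Azure
    ml_task_uuid = models.UUIDField(null=True)  # "job ID" returned by the ML Service
```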
Mechanically, the ML Subject Assistant's "export to Microsoft" action performs the following:
- get all the Subjects for a given Subject Set (pulling from Panoptes)
- create a JSON file - the "subject manifest" - that describes the Subjects to be exported, in a format specified by the external ML Service.
- upload the JSON file to an external Azure storage container (reason: the current external ML Service only reads subject manifest files from Azure), then create a "shareable URL" to that JSON file. (Clarification: Azure uses SAS, or Shared Access Signature, tokens to create shareable URLs with limited lifespans.)
- submit the shareable URL to the ML Service, and get the "job ID" it returns.
The Job ID plus the known Subject Assistant app URL is all that's required to construct a "return URL" for the user.
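Putting those four steps together, the export might look roughly like the sketch below, using the `azure-storage-blob` and `requests` libraries. The manifest format, blob naming, SAS lifespan, and the ML Service's request/response shapes are all assumptions for illustration:

```python
# Export-to-Microsoft flow (sketch). The manifest format, blob name, and the
# ML Service's request/response shapes are illustrative assumptions.
from datetime import datetime, timedelta
import json

import requests
from azure.storage.blob import BlobSasPermissions, BlobServiceClient, generate_blob_sas


def export_subject_set(subject_set_id, subjects, settings):
    # 1. Build the "subject manifest" describing the Subjects (format is ML-Service-specific).
    manifest = json.dumps([{"id": s["id"], "url": s["url"]} for s in subjects])

    # 2. Upload the manifest to the external Azure storage container.
    account = settings["SUBJECT_ASSISTANT_AZURE_ACCOUNT_NAME"]
    key = settings["SUBJECT_ASSISTANT_AZURE_ACCOUNT_KEY"]
    container = settings["SUBJECT_ASSISTANT_AZURE_CONTAINER_NAME"]
    blob_name = f"subject-set-{subject_set_id}-manifest.json"  # illustrative naming
    service = BlobServiceClient(
        account_url=f"https://{account}.blob.core.windows.net", credential=key
    )
    service.get_blob_client(container, blob_name).upload_blob(manifest, overwrite=True)

    # 3. Create a time-limited "shareable URL" via a SAS (Shared Access Signature) token.
    sas_token = generate_blob_sas(
        account_name=account,
        container_name=container,
        blob_name=blob_name,
        account_key=key,
        permission=BlobSasPermissions(read=True),
        expiry=datetime.utcnow() + timedelta(hours=24),  # lifespan is a guess
    )
    azure_url = f"https://{account}.blob.core.windows.net/{container}/{blob_name}?{sas_token}"

    # 4. Submit the shareable URL to the ML Service and collect the "job ID" it returns.
    response = requests.post(
        settings["SUBJECT_ASSISTANT_ML_SERVICE_URL"],
        params={"caller": settings["SUBJECT_ASSISTANT_ML_SERVICE_CALLER_ID"]},
        json={"manifest_url": azure_url},  # request body shape is a guess
    )
    response.raise_for_status()
    return response.json()  # assumed to contain the job ID (ml_task_uuid)
```

The returned job ID is what gets stored as `ml_task_uuid`; appending it to `SUBJECT_ASSISTANT_EXTERNAL_URL` yields the user's return link.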
Use docker & docker-compose to set up a development environment.
- Run `docker-compose build` to build the app container.
- Run the tests: `docker-compose run -T --rm app pytest --cov=hamlet`

Alternatively, you can use docker & docker-compose to run an interactive bash shell for development and testing:
- Run `docker-compose run --service-ports --rm app bash` to start the containers.
- Run `pytest --cov=hamlet` to run the test suite in that shell (sadly this system has no tests :sadpanda:)
- Or run `./start_server.sh` to run the server (see Pipfile)
I can't log in on local development
Problem:
- You're able to run `docker-compose build ; docker-compose up`, and you can view Hamlet on local development at `http://localhost:8080`
- However, when you click on the "Login with Zooniverse" button and provide your details on the Panoptes login page, the login fails.
Analysis:
- It's likely that your instance of Hamlet is missing the `PANOPTES_APPLICATION_ID` and `PANOPTES_SECRET` environment variables.
- These env vars are required to tell Panoptes which OAuth application you're logging into.
Solution:
- Go to Panoptes's OAuth applications list, find the Hamlet app, and copy the Application ID and Secret.
- Add these to your local development Docker's environment variables, as `PANOPTES_APPLICATION_ID` and `PANOPTES_SECRET`.
- This can be done easily by creating a `.env` file in the root folder of your `hamlet` repo, as shown below.
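For example, a `.env` file could look like the following; the values are placeholders to be replaced with the real ones copied from Panoptes:

```
PANOPTES_APPLICATION_ID=<application-id-copied-from-panoptes>
PANOPTES_SECRET=<secret-copied-from-panoptes>
```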
Related issue: 479
The database won't start on local development
Problem:
- When you run `docker-compose build ; docker-compose up`, you notice that the PostgreSQL database isn't running.
- There are probably a few error messages in the console: `app_1` will continuously complain that it's trying (and failing) to find the PostgreSQL database, while `postgres_1` might say something about "can't initialise due to incompatible database".
Analysis:
- It's possible that your existing local PostgreSQL database (i.e. the `/postgres_data` folder) was built on an older version of PostgreSQL, and recent updates to Hamlet have upgraded the PostgreSQL version that Hamlet uses, causing an incompatibility.
Solution:
- Check if you have an existing `/postgres_data` folder in your local `hamlet` repo.
- If yes, delete it (e.g. `rm -rf postgres_data`). The next time you start Hamlet, the database will be rebuilt with the latest version.
Handy commands for local development:
- console: `python manage.py shell`
- create_local_db: `createdb -U hamlet -O hamlet hamlet`
- drop_local_db: `dropdb -U hamlet hamlet`
- makemigrations: `python manage.py makemigrations`
- migrate: `python manage.py migrate`
- server: `bash -e ./start_server.sh`
- tests: `pytest --cov=hamlet`
- tree: `bash -c 'find . | grep -v git | grep -v cache'`
- worker: `bash -c ./start_worker.sh`
To update a dependency (e.g. Django), run `poetry update django`. See the Poetry docs for more details.