Hamlet is a website that connects Zooniverse data from Panoptes with external data services, e.g. sending a camera trap project's animal-filled photos to an animal-identifying machine-learning system.
Hamlet has an export feature that ties into the Zooniverse Machine Learning Subject Assistant (app) (source), which lets project owners/researchers submit their camera trap photos to an external Machine Learning (ML) service, which in turn finds animals in those images.
The user story is as follows:
- Users start at the Subject Assistant app.
- Users are directed to Hamlet, where they choose a Subject Set to export to the external ML Service.
- Hamlet performs the export, and provides users with a link back to the Subject Assistant with an "ML Task ID", e.g. `https://subject-assistant.zooniverse.org/#/tasks/6378`
- Users click that link, and process the ML-tagged photos on the Subject Assistant app.
The Subject Assistant requires the following external systems:
- Machine Learning Service - in this case, powered by Microsoft.
- an Azure Storage Container - works in conjunction with the ML Service, which requires "subject manifest" files to be stored on Azure.
As of late 2022, these services are maintained by the Zooniverse team.
The Subject Assistant feature requires the following ENV variables to be defined:
- `SUBJECT_ASSISTANT_AZURE_ACCOUNT_NAME`
- `SUBJECT_ASSISTANT_AZURE_ACCOUNT_KEY`
- `SUBJECT_ASSISTANT_AZURE_CONTAINER_NAME`
- `SUBJECT_ASSISTANT_ML_SERVICE_CALLER_ID` - provided by our friends at Microsoft who run the ML Service.
- `SUBJECT_ASSISTANT_ML_SERVICE_URL` - ditto
Optionally, the following ENV variables can be defined:
- `SUBJECT_ASSISTANT_EXTERNAL_URL` - defaults to `http://subject-assistant.zooniverse.org/#/tasks/`
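As a point of reference, here is a minimal sketch of how these variables might be consumed in Hamlet's Django settings, assuming plain `os.environ` access (the actual settings code may differ):

```python
# settings.py (sketch): reading the Subject Assistant ENV vars.
# Variable names match the docs above; the code itself is illustrative.
import os

SUBJECT_ASSISTANT_AZURE_ACCOUNT_NAME = os.environ["SUBJECT_ASSISTANT_AZURE_ACCOUNT_NAME"]
SUBJECT_ASSISTANT_AZURE_ACCOUNT_KEY = os.environ["SUBJECT_ASSISTANT_AZURE_ACCOUNT_KEY"]
SUBJECT_ASSISTANT_AZURE_CONTAINER_NAME = os.environ["SUBJECT_ASSISTANT_AZURE_CONTAINER_NAME"]
SUBJECT_ASSISTANT_ML_SERVICE_CALLER_ID = os.environ["SUBJECT_ASSISTANT_ML_SERVICE_CALLER_ID"]
SUBJECT_ASSISTANT_ML_SERVICE_URL = os.environ["SUBJECT_ASSISTANT_ML_SERVICE_URL"]

# Optional, with the documented default:
SUBJECT_ASSISTANT_EXTERNAL_URL = os.environ.get(
    "SUBJECT_ASSISTANT_EXTERNAL_URL",
    "http://subject-assistant.zooniverse.org/#/tasks/",
)
```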
The ML Subject Assistant feature in Hamlet has two views:
- `GET /subject-assistant/<int:project_id>/` - lists all the Subject Sets for a Project, along with their "ML export" status and (if the export is successful) a link back to the Subject Assistant app.
- `POST /subject-assistant/<int:project_id>/subject-sets/<int:subject_set_id>/` - performs the ML Export action for a given Subject Set, then redirects users back to the listing page.
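For illustration, the two routes could be declared in a Django `urls.py` along these lines; the view function names are hypothetical placeholders, not Hamlet's actual ones:

```python
# urls.py (sketch): the two Subject Assistant routes described above.
# View names are hypothetical placeholders.
from django.urls import path

from . import views

urlpatterns = [
    # GET: list Subject Sets, their export statuses, and links back to the app
    path("subject-assistant/<int:project_id>/", views.subject_assistant_list),
    # POST: run the ML Export for one Subject Set, then redirect to the listing
    path(
        "subject-assistant/<int:project_id>/subject-sets/<int:subject_set_id>/",
        views.subject_assistant_export,
    ),
]
```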
The `MLSubjectAssistantExport` table has the following fields:
- `subject_set_id` - the ID of the Zooniverse Subject Set that was exported to the external ML Service
- `json` - the "subject manifest" file, in JSON format, created from all the Subjects of the Subject Set. The format is specific to the ML Service.
- `azure_url` - the URL of the "subject manifest" file that was uploaded to an external Azure storage container. (See Mechanics: ML Export Action for why)
- `ml_task_uuid` - the task request ID or "job ID" for the ML Export action. This is generated by the external ML Service.
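A plausible Django model for this table is sketched below; the field types are assumptions inferred from the descriptions above, not Hamlet's actual schema:

```python
# models.py (sketch): a plausible shape for the MLSubjectAssistantExport table.
# Field types are guesses from the field descriptions above.
from django.db import models


class MLSubjectAssistantExport(models.Model):
    subject_set_id = models.IntegerField()      # Zooniverse Subject Set that was exported
    json = models.TextField()                   # the "subject manifest", ML-Service-specific JSON
    azure_url = models.URLField(blank=True)     # shareable URL of the manifest on Azure
    ml_task_uuid = models.UUIDField(null=True)  # "job ID" returned by the ML Service
```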
Mechanically, the ML Subject Assistant's "export to Microsoft" action performs the following:
- get all the Subjects for a given Subject Set (pulling from Panoptes)
- create a JSON file - the "subject manifest" - that describes the Subjects to be exported, in a format specified by the external ML Service.
- upload the JSON file to an external Azure storage container (reason: the current external ML Service only reads subject manifest files from Azure), then create a "shareable URL" to that JSON file. (Clarification: Azure uses SAS, or Shared Access Signature, tokens to create shareable URLs with limited lifespans.)
- submit the shareable URL to the ML Service, and get the "job ID" it returns.
The Job ID plus the known Subject Assistant app URL is all that's required to construct a "return URL" for the user.
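Putting those four steps together, the export might look roughly like the sketch below, using the `azure-storage-blob` and `requests` libraries. The manifest format, blob naming, SAS lifespan, and the ML Service's request/response shapes are all assumptions for illustration:

```python
# Export-to-Microsoft flow (sketch). The manifest format, blob name, and the
# ML Service's request/response shapes are illustrative assumptions.
from datetime import datetime, timedelta
import json

import requests
from azure.storage.blob import BlobSasPermissions, BlobServiceClient, generate_blob_sas


def export_subject_set(subject_set_id, subjects, settings):
    # 1. Build the "subject manifest" describing the Subjects (format is ML-Service-specific).
    manifest = json.dumps([{"id": s["id"], "url": s["url"]} for s in subjects])

    # 2. Upload the manifest to the external Azure storage container.
    account = settings["SUBJECT_ASSISTANT_AZURE_ACCOUNT_NAME"]
    key = settings["SUBJECT_ASSISTANT_AZURE_ACCOUNT_KEY"]
    container = settings["SUBJECT_ASSISTANT_AZURE_CONTAINER_NAME"]
    blob_name = f"subject-set-{subject_set_id}-manifest.json"  # illustrative naming
    service = BlobServiceClient(
        account_url=f"https://{account}.blob.core.windows.net", credential=key
    )
    service.get_blob_client(container, blob_name).upload_blob(manifest, overwrite=True)

    # 3. Create a time-limited "shareable URL" via a SAS (Shared Access Signature) token.
    sas_token = generate_blob_sas(
        account_name=account,
        container_name=container,
        blob_name=blob_name,
        account_key=key,
        permission=BlobSasPermissions(read=True),
        expiry=datetime.utcnow() + timedelta(hours=24),  # lifespan is a guess
    )
    azure_url = f"https://{account}.blob.core.windows.net/{container}/{blob_name}?{sas_token}"

    # 4. Submit the shareable URL to the ML Service and collect the "job ID" it returns.
    response = requests.post(
        settings["SUBJECT_ASSISTANT_ML_SERVICE_URL"],
        params={"caller": settings["SUBJECT_ASSISTANT_ML_SERVICE_CALLER_ID"]},
        json={"manifest_url": azure_url},  # request body shape is a guess
    )
    response.raise_for_status()
    return response.json()  # assumed to contain the job ID (ml_task_uuid)
```

The returned job ID is what gets stored as `ml_task_uuid`; appending it to `SUBJECT_ASSISTANT_EXTERNAL_URL` yields the user's return link.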
Use docker & docker-compose to set up a development environment.
- Run `docker-compose build` to build the app container.
- Run the tests: `docker-compose run -T --rm app pytest --cov=hamlet`

Alternatively, you can use docker & docker-compose to run an interactive bash shell for development and testing:
- Run `docker-compose run --service-ports --rm app bash` to start the containers.
- Run `pytest --cov=hamlet` to run the test suite in that shell (sadly this system has no tests :sadpanda:)
- Or run `./start_server.sh` to run the server (see Pipfile)
I can't log in on local development
Problem:
- You're able to run `docker-compose build ; docker-compose up`, and you can view Hamlet on local development at `http://localhost:8080`
- However, when you click on the "Login with Zooniverse" button and provide your details on the Panoptes login page, the login fails.
Analysis:
- It's likely that your instance of Hamlet is missing the `PANOPTES_APPLICATION_ID` and `PANOPTES_SECRET` environment variables.
- These env vars are required to tell Panoptes which OAuth application you're logging into.
Solution:
- Go to Panoptes's OAuth applications list, find the Hamlet app, and copy the Application ID and Secret.
- Add these to your local development Docker's environment variables, as `PANOPTES_APPLICATION_ID` and `PANOPTES_SECRET`.
- This can be done easily by creating a `.env` file in the root folder of your `hamlet` repo, as shown below.
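For example, a `.env` file could look like the following; the values are placeholders to be replaced with the real ones copied from Panoptes:

```
PANOPTES_APPLICATION_ID=<application-id-copied-from-panoptes>
PANOPTES_SECRET=<secret-copied-from-panoptes>
```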
Related issue: 479
The database won't start on local development
Problem:
- When you run `docker-compose build ; docker-compose up`, you notice that the PostgreSQL database isn't running.
- There are probably a few error messages in the console: `app_1` will continuously complain that it's trying (and failing) to find the PostgreSQL database, while `postgres_1` might say something about "can't initialise due to incompatible database".
Analysis:
- It's possible that your existing local PostgreSQL database (i.e. the `/postgres_data` folder) was built on an older version of PostgreSQL, and recent updates to Hamlet have upgraded the PostgreSQL version that Hamlet uses, causing an incompatibility.
Solution:
- Check if you have an existing `/postgres_data` folder in your local `hamlet` repo.
- If yes, delete it (e.g. `rm -rf postgres_data`). The next time you start Hamlet, the database will be rebuilt with the latest version.
Handy commands for local development:
- console: `python manage.py shell`
- create_local_db: `createdb -U hamlet -O hamlet hamlet`
- drop_local_db: `dropdb -U hamlet hamlet`
- makemigrations: `python manage.py makemigrations`
- migrate: `python manage.py migrate`
- server: `bash -e ./start_server.sh`
- tests: `pytest --cov=hamlet`
- tree: `bash -c 'find . | grep -v git | grep -v cache'`
- worker: `bash -c ./start_worker.sh`
To update a dependency (e.g. Django), run `poetry update django`. See the Poetry docs for more details.