The Global Alliance for Genomics and Health is an international coalition, formed to enable the sharing of genomic and clinical data.
The Cloud Work Stream helps the genomics and health communities take full advantage of modern cloud environments. Our initial focus is on “bringing the algorithms to the data”, by creating standards for defining, sharing, and executing portable workflows.
We work with platform development partners and industry leaders to develop standards that will facilitate interoperability.
The Task Execution Service (TES) API is an effort to define a standardized schema and API for describing batch execution tasks. A task defines a set of input files, a set of (Docker) containers and commands to run, a set of output files, and some other logging and metadata.
See the human-readable Reference Documentation and the OpenAPI YAML description. You can also explore the specification in the Swagger Editor.
All documentation and pages hosted at 'ga4gh.github.io/task-execution-schemas' reflect the latest API release from the master branch. To monitor the latest development work, add 'preview/<branch>' to the URLs above (e.g., 'ga4gh.github.io/task-execution-schemas/preview/<branch>/docs').
A stand-alone security review has been performed on the API. Nevertheless, any implementation that is linked to from the documentation accompanying the API is done so without any security guarantees. If you integrate this code into your application it is AT YOUR OWN RISK AND RESPONSIBILITY to arrange for an audit to ensure compliance with any applicable regulatory and security requirements, especially where personal data may be at issue.
The schema and API are defined here in OpenAPI Specification 3.0.1. Clients may use JSON and REST to communicate with a service implementing the TES API.
Here's an example of a complete task message, defining a task which calculates an MD5 checksum on an input file and uploads the output:
{
  "name": "MD5 example",
  "description": "Task which runs md5sum on the input file.",
  "tags": {
    "custom-tag": "tag-value"
  },
  "inputs": [
    {
      "name": "infile",
      "description": "md5sum input file",
      "url": "/path/to/input_file",
      "path": "/container/input",
      "type": "FILE"
    }
  ],
  "outputs": [
    {
      "url": "/path/to/output_file",
      "path": "/container/output"
    }
  ],
  "resources": {
    "cpuCores": 1,
    "ramGb": 1.0,
    "diskGb": 100.0,
    "preemptible": false
  },
  "executors": [
    {
      "image": "ubuntu",
      "command": ["md5sum", "/container/input"],
      "stdout": "/container/output",
      "stderr": "/container/stderr",
      "workdir": "/tmp"
    }
  ]
}
A minimal version of the same task, including only the required fields, looks like:
{
  "inputs": [
    {
      "url": "/path/to/input_file",
      "path": "/container/input"
    }
  ],
  "outputs": [
    {
      "url": "/path/to/output_file",
      "path": "/container/output"
    }
  ],
  "executors": [
    {
      "image": "ubuntu",
      "command": ["md5sum", "/container/input"],
      "stdout": "/container/output"
    }
  ]
}
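The minimal task above can also be assembled programmatically. The following is a small sketch, not part of the specification: the helper name `minimal_task` and its parameters are hypothetical, and the field names are copied from the example message.

```python
import json


def minimal_task(input_url, output_url, image="ubuntu"):
    """Build the minimal MD5 task from the example as a plain dict.

    Hypothetical helper for illustration only.
    """
    container_input = "/container/input"
    container_output = "/container/output"
    return {
        "inputs": [{"url": input_url, "path": container_input}],
        "outputs": [{"url": output_url, "path": container_output}],
        "executors": [
            {
                "image": image,
                "command": ["md5sum", container_input],
                # stdout is written to the same container path that the
                # output entry uploads.
                "stdout": container_output,
            }
        ],
    }


task = minimal_task("/path/to/input_file", "/path/to/output_file")
print(json.dumps(task, indent=2))
```

Keeping the container paths in one place makes it harder for the executor's `stdout` and the output `path` to drift apart.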
To create the task, send an HTTP POST request with the task message as the request body:
POST /v1/tasks
{ "id": "task-1234" }
The return value is a task ID.
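As a rough illustration of the create call, the sketch below builds the POST request with Python's standard library. The server address `http://localhost:8000` and the function names are assumptions made for the example; only the `/v1/tasks` path and the `id` response field come from the text above. The request is constructed but not sent, so the snippet runs without a TES server.

```python
import json
import urllib.request


def build_create_request(base_url, task):
    """Return a POST request for /v1/tasks carrying the task as JSON."""
    body = json.dumps(task).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/tasks",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_task(base_url, task):
    """Send the request and return the task ID from the JSON response."""
    with urllib.request.urlopen(build_create_request(base_url, task)) as resp:
        return json.load(resp)["id"]


request = build_create_request(
    "http://localhost:8000",  # hypothetical server address
    {"executors": [{"image": "ubuntu", "command": ["md5sum", "/container/input"]}]},
)
print(request.get_method(), request.full_url)
```

Calling `create_task` against a running service would be expected to return an ID such as `task-1234`. Authentication, if the service requires it, is not covered here.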
To get a task by ID:
GET /v1/tasks/task-1234
{ "id": "task-1234", "state": "RUNNING" }
The return value will be a minimal description of the task state.
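Because the default response carries the task state, a client can poll it until the task finishes. The sketch below is one possible approach, under assumptions: the set of terminal state names is not listed in this document and should be checked against the specification version in use, and `wait_for_task` and `fetch_task` are hypothetical names. The fetch function is passed in so the loop can be exercised without a server.

```python
import time

# Assumed terminal state names; verify against the TES specification in use.
TERMINAL_STATES = {"COMPLETE", "EXECUTOR_ERROR", "SYSTEM_ERROR", "CANCELED"}


def wait_for_task(fetch_task, task_id, interval=5.0, max_polls=100, sleep=time.sleep):
    """Poll fetch_task(task_id) until the task reaches a terminal state.

    fetch_task is any callable returning the decoded JSON of
    GET /v1/tasks/<task_id> as a dict.
    """
    for _ in range(max_polls):
        task = fetch_task(task_id)
        if task.get("state") in TERMINAL_STATES:
            return task
        sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")


# Stand-in for a real HTTP call: two canned responses.
_responses = iter(
    [
        {"id": "task-1234", "state": "RUNNING"},
        {"id": "task-1234", "state": "COMPLETE"},
    ]
)
final = wait_for_task(lambda task_id: next(_responses), "task-1234", sleep=lambda s: None)
print(final["state"])
```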
To get more information, you can change the task view using the 'view' URL query parameter.
The 'basic' view will include all task fields except a few which might be large strings (stdout/err/system logging, input parameter contents).
GET /v1/tasks/task-1234?view=BASIC
{ "id": "task-1234", "state": "RUNNING", "name": "MD5 example", etc... }
The 'full' view includes stdout/err/system logs and full input parameters:
GET /v1/tasks/task-1234?view=FULL
{ "id": "task-1234", "state": "RUNNING", "name": "MD5 example",
"logs": [{ "stdout": "stdout content..." }], etc... }
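The view selection can be wrapped in a small URL helper. This is an illustrative sketch: `task_url` and the base address are hypothetical, and `MINIMAL` is assumed to be the name of the default view (the text above only names `BASIC` and `FULL` explicitly).

```python
from urllib.parse import quote, urlencode

# BASIC and FULL appear in the examples; MINIMAL is an assumed name
# for the default view.
VIEWS = ("MINIMAL", "BASIC", "FULL")


def task_url(base_url, task_id, view=None):
    """Build the GET URL for one task, optionally selecting a view."""
    url = f"{base_url.rstrip('/')}/v1/tasks/{quote(task_id, safe='')}"
    if view is not None:
        if view not in VIEWS:
            raise ValueError(f"unknown view: {view!r}")
        url += "?" + urlencode({"view": view})
    return url


print(task_url("http://localhost:8000", "task-1234"))
print(task_url("http://localhost:8000", "task-1234", view="FULL"))
```

Omitting `view` leaves the query string off entirely, so the server applies its default.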
To list tasks:
GET /v1/tasks
{ "tasks": [{ "id": "task-1234", "state": "RUNNING"}, etc...] }
Similar to getting a task by ID, you may change the task view:
GET /v1/tasks?view=BASIC
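A client will typically want to turn the list response into something easier to scan. The sketch below assumes the response shape shown in the list example (a `tasks` array of objects with `id` and `state`). The function names and base address are hypothetical, and a sample body stands in for a real response.

```python
import json
from urllib.parse import urlencode


def list_url(base_url, view=None):
    """Build the GET URL for listing tasks, optionally selecting a view."""
    url = base_url.rstrip("/") + "/v1/tasks"
    if view is not None:
        url += "?" + urlencode({"view": view})
    return url


def summarize(response_body):
    """Map each task ID in a list response to its state."""
    payload = json.loads(response_body)
    return {task["id"]: task.get("state") for task in payload.get("tasks", [])}


# Sample body modeled on the list example; not real server output.
sample = '{"tasks": [{"id": "task-1234", "state": "RUNNING"}]}'
print(list_url("http://localhost:8000", view="BASIC"))
print(summarize(sample))
```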
To cancel a task, send an HTTP POST to the cancel endpoint:
POST /v1/tasks/task-1234:cancel
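For completeness, a cancel request could be built along the same lines. This is a sketch with a hypothetical function name and server address; only the `:cancel` path suffix is taken from the example above. The request is constructed but not sent.

```python
import urllib.request
from urllib.parse import quote


def build_cancel_request(base_url, task_id):
    """Return a POST request for /v1/tasks/<task_id>:cancel."""
    url = f"{base_url.rstrip('/')}/v1/tasks/{quote(task_id, safe='')}:cancel"
    # No body is attached here; the example above shows only the path.
    return urllib.request.Request(url=url, method="POST")


cancel = build_cancel_request("http://localhost:8000", "task-1234")
print(cancel.get_method(), cancel.full_url)
```

Passing the result to `urllib.request.urlopen` would send it to a running service.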
- Integrate with GA4GH DRS to resolve input data source (possibly support for DRS URIs as permissible values of input URLs).
- Integrate with GA4GH TRS to resolve container images (possibly support for TRS URIs as permissible values of executor image names).
See CONTRIBUTING.md.
If a security issue is identified with the specification, please send an email to security-notification@ga4gh.org detailing your concerns.