diff --git a/.github/workflows/build.yml b/.github/workflows/build.yml
new file mode 100644
index 0000000..46e6de3
--- /dev/null
+++ b/.github/workflows/build.yml
@@ -0,0 +1,28 @@
+name: Documentation Build Check
+
+on:
+ pull_request:
+ branches:
+ - main
+
+jobs:
+ build-docs:
+ runs-on: ubuntu-latest
+
+ steps:
+ - name: Check out the repo
+ uses: actions/checkout@v3
+
+ - name: Set up Python
+ uses: actions/setup-python@v5
+ with:
+ python-version: '3.x'
+
+ - name: Install dependencies
+ run: |
+ python -m pip install --upgrade pip
+ pip install -r ./requirements.txt
+
+ - name: Build the Documentation
+ run: |
+ make html
diff --git a/README.md b/README.md
index cf6dd70..3dc8b16 100644
--- a/README.md
+++ b/README.md
@@ -4,9 +4,20 @@
The repository is used to provide detailed documentation on how to use the [APE library](https://github.com/sanctuuary/APE). A running instance is available [here](https://ape-framework.readthedocs.io/en/latest/?badge=latest).
-## Building
+## How to build the documentation locally
-Building these docs requires [Sphinx](https://www.sphinx-doc.org/en/master/index.html).
+1. Clone the repository and navigate to the root directory
-To build the docs yourself, run `make` to see a list of possible output targets.
-For example, run `make html` to make standalone HTML files.
+2. Install the required packages
+
+ ```bash
+ pip install -r ./requirements.txt
+ ```
+
+3. Build the documentation
+
+ ```bash
+ make html
+ ```
+
+4. The documentation will be available in the `./_build/html` directory. Open the `index.html` file in your browser to view the documentation.
diff --git a/docs/ape-web/introduction.rst b/docs/ape-web/introduction.rst
index 187a7d9..7d43ccd 100644
--- a/docs/ape-web/introduction.rst
+++ b/docs/ape-web/introduction.rst
@@ -1,17 +1,14 @@
Introduction to APE Web
=======================
-About APE Web
-^^^^^^^^^^^^^
-APE Web is a website build around APE to provide a user-friendly interface for using APE.
+`APE Web `_ is a web interface built around APE that provides a user-friendly way to use the library.
It allows users to set up and share domains, and run APE to explore workflows online.
The inputs, outputs and constraints for workflows can easily be configured via the interface.
Additional tools are also included, such as a visual constraint sketcher and workflow comparer.
+APE Web is available under the Apache 2.0 license.
-License
-^^^^^^^
-APE Web is licensed under the Apache 2.0 license.
+.. note:: This project is not actively maintained. An alternative is the `Workflomics platform `_, which uses APE to generate candidate workflows and goes a step further by providing a platform for benchmarking the generated workflows.
Dependencies
^^^^^^^^^^^^
diff --git a/docs/basics/ape-introduction.rst b/docs/basics/ape-introduction.rst
index 2d19e2d..6722fc7 100644
--- a/docs/basics/ape-introduction.rst
+++ b/docs/basics/ape-introduction.rst
@@ -2,14 +2,14 @@ Introduction to APE
===================
About APE
-^^^^^^^^^
+---------
.. image:: ../../img/logo.png
:width: 200px
:alt: APE logo
:align: left
-APE (Automated Pipeline Explorer) (see `GitHub`_) is a library (available as CLI, Java API and a RESTful API) for the automated exploration of possible computational
+`APE (Automated Pipeline Explorer) `_ is a library (available as a CLI, a Java API and a RESTful API) for the automated exploration of possible computational
pipelines (scientific workflows) from large collections of computational tools.
APE relies on a semantic domain model that includes tool and type taxonomies as controlled
@@ -31,13 +31,16 @@ For our paper at ICCS 2020 [2]_ we created a video that explains APE in 5 minute
|
-.. APE in practice::
- Our `use cases <../demo/imagemagick.html>`_ are motivated by practical
- problems in various domains (e.g. bioinformatics, GIS [3]_).
- In bioinformatics, the `Workflomics`_ platform for creating and benchmarking workflows uses APE (specifically APE's RESTfull API) to generate candidate workflows.
+APE in practice
+----------------
+
+Our `use cases <../demo/demo-overview.html>`_ are motivated by practical
+problems in various domains (e.g. bioinformatics [3]_, GIS [4]_).
+In bioinformatics, the `Workflomics `_ platform for creating and benchmarking workflows uses APE (specifically APE's RESTful API) to generate candidate workflows.
+
Credits
-^^^^^^^
+-------
APE has been inspired by the `Loose Programming framework PROPHETS `_.
It uses similar mechanisms for semantic domain modeling, workflow specification and synthesis, but strives to provide the automated
exploration and composition functionality independent from a concrete workflow system.
@@ -45,11 +48,11 @@ exploration and composition functionality independent from a concrete workflow s
We thank our brave first-generation users for their patience and constructive feedback that helped us to get APE into shape.
License
-^^^^^^^
+-------
APE is licensed under the `Apache 2.0 `_ license.
Maven dependencies
-^^^^^^^^^^^^^^^^^^
+------------------
1. `OWL API `_ - LGPL or Apache 2.0
2. `SAT4J `_ - EPL or GNU LGPL
3. `apache-common-lang `_ - Apache 2.0
@@ -62,7 +65,7 @@ Maven dependencies
10. `SnakeYAML `_ - Apache 2.0
Contributors
-^^^^^^^^^^^^
+------------
* Vedran Kasalica (v.kasalica[at]esciencecenter.nl), lead research software developer
* Maurin Voshol, student developer
* Koen Haverkort, student developer
@@ -70,7 +73,7 @@ Contributors
* Anna-Lena Lamprecht (anna-lena.lamprecht[at]uni-potsdam.de), project initiator and principal investigator
References
-^^^^^^^^^^
+----------
.. [1] Kasalica, V., & Lamprecht, A.-L. (2020).
Workflow Discovery with Semantic Constraints:
The SAT-Based Implementation of APE. Electronic Communications of the EASST, 78(0).
@@ -81,7 +84,12 @@ References
ICCS 2020. ICCS 2020. Lecture Notes in Computer Science, vol 12143. Springer,
https://doi.org/10.1007/978-3-030-50436-6_34
-.. [3] Kasalica, V., & Lamprecht, A.-L. (2019).
+.. [3] Kasalica, V., Schwämmle, V., Palmblad, M., Ison, J., & Lamprecht, A.-L. (2021).
+   APE in the Wild: Automated Exploration of Proteomics Workflows in the bio.tools Registry.
+   Journal of Proteome Research, 20(4), 2157-2165.
+
+
+.. [4] Kasalica, V., & Lamprecht, A.-L. (2019).
Workflow discovery through semantic constraints: A geovisualization case study.
In Computational science and its applications – ICCSA 2019
(pp. 473–488), Springer International Publishing,
diff --git a/docs/basics/gettingstarted.rst b/docs/basics/gettingstarted.rst
index 1fd8d18..a8b6e52 100644
--- a/docs/basics/gettingstarted.rst
+++ b/docs/basics/gettingstarted.rst
@@ -1,18 +1,22 @@
-Getting Started with APE
-========================
+My first APE run
+================
Automated workflow composition with APE can be performed through its
-command line interface (`CLI <../specifications/cli.html>`_) or its application programming interface
-(`API <../specifications/java.html>`_). While the CLI provides a simple means to interact and experiment
-with the framework, the API provides more flexibility and control over
+command line interface (`CLI <../specifications/cli.html>`_), its application programming interface
+(`Java API <../specifications/java.html>`_) or its RESTful API (`RESTful APE <../restful-ape/introduction.html>`_).
+The CLI provides a simple means to interact and experiment
+with the framework, while the Java API provides more flexibility and control over
the synthesis process. It can be used to integrate APE’s functionality
-into other systems.
+into other systems. Finally, the RESTful API allows users to
+engage with APE's automated pipeline exploration capabilities through HTTP requests. The RESTful API is also provided as a Docker container for easy deployment and management.
My first APE run
^^^^^^^^^^^^^^^^
+The first APE run is demonstrated using the CLI. The following steps guide you through the process of setting up and running APE on a simple example.
+
Get the latest version of `APE_UseCases `_
-by either `downloading the master `_
+by either `downloading the main `_
(zip) or cloning the repository:
.. .. code-block:: shell
diff --git a/docs/demo/geo_gmt/geo_gmt.rst b/docs/demo/geo_gmt/geo_gmt.rst
index d9fc35b..e24badb 100644
--- a/docs/demo/geo_gmt/geo_gmt.rst
+++ b/docs/demo/geo_gmt/geo_gmt.rst
@@ -52,15 +52,20 @@ you could run this demo by executing the following command:
.. code-block:: shell
cd ~/git/APE_UseCases
- java -jar APE-.jar GeoGMT/E0/ape.configuration
+ java -jar APE-.jar GeoGMT/E0/config.json
-The results of the synthesis would be:
+.. note::
+   In case the execution fails due to insufficient heap space, increase the maximum Java heap size via the ``-Xmx`` flag (e.g., ``java -Xmx4g -jar ...``).
+
+The results of the synthesis are stored under the directory
+specified in the configuration file (``solutions_dir_path`` parameter):
.. code-block:: shell
- GeoGMT/E0/solutions.txt - First 100 candidate solutions in textual format
- GeoGMT/E0/Figures/ - Data-flow figures corresponding to the first 10 solutions
- GeoGMT/E0/Executables/ - Executable shell scripts corresponding to the first 6 solutions
+ solutions_dir_path/solutions.txt - First X candidate solutions in textual format, where X is the number of solutions specified in the config file (``solutions`` parameter)
+    solutions_dir_path/Figures/ - Workflow figures corresponding to the first Y solutions, where Y is the number of figures specified in the config file (``number_of_generated_graphs`` parameter, 0 if not specified)
+    solutions_dir_path/Executables/ - Executable shell scripts corresponding to the first Z solutions, where Z is the number of scripts specified in the config file (``number_of_execution_scripts`` parameter, 0 if not specified)
+    solutions_dir_path/CWL/ - CWL files corresponding to the first Q solutions, where Q is the number of CWL files specified in the config file (``number_of_cwl_files`` parameter, 0 if not specified)
E1 - Additional Constraints
@@ -68,10 +73,11 @@ E1 - Additional Constraints
By adding more constraints (``constraints_e1.json``), we avoid obtaining workflows that are ambiguous, redundant, or not relevant to
the domain [1]_.
-.. code-block:: shell
+.. code-block:: shell
+
cd ~/git/APE_UseCases
- java -jar APE-.jar GeoGMT/E1/ape.configuration
+ java -jar APE-.jar GeoGMT/E1/config.json
Domain Model
^^^^^^^^^^^^
diff --git a/docs/demo/imagemagick/imagemagick.rst b/docs/demo/imagemagick/imagemagick.rst
index c795ee0..af5b9d3 100644
--- a/docs/demo/imagemagick/imagemagick.rst
+++ b/docs/demo/imagemagick/imagemagick.rst
@@ -36,13 +36,15 @@ you could run this demo by executing the following command:
cd ~/git/APE_UseCases
java -jar APE-2.3.0-executable.jar ImageMagick/Example1/config.json
-The results of the synthesis would be:
+The results of the synthesis are stored under the directory
+specified in the configuration file (``solutions_dir_path`` parameter):
.. code-block:: shell
- ImageMagick/Example1/solutions.txt - First 100 candidate solutions in textual format
- ImageMagick/Example1/Figures/ - Data-flow figures corresponding to the first solution (config.json specifies that only 1 solution should be found)
- ImageMagick/Example1/Executables/ - Executable shell scripts corresponding to the first solution
+ solutions_dir_path/solutions.txt - First X candidate solutions in textual format, where X is the number of solutions specified in the config file (``solutions`` parameter)
+    solutions_dir_path/Figures/ - Workflow figures corresponding to the first Y solutions, where Y is the number of figures specified in the config file (``number_of_generated_graphs`` parameter, 0 if not specified)
+    solutions_dir_path/Executables/ - Executable shell scripts corresponding to the first Z solutions, where Z is the number of scripts specified in the config file (``number_of_execution_scripts`` parameter, 0 if not specified)
+    solutions_dir_path/CWL/ - CWL files corresponding to the first Q solutions, where Q is the number of CWL files specified in the config file (``number_of_cwl_files`` parameter, 0 if not specified)
Domain Model
^^^^^^^^^^^^
diff --git a/docs/demo/massspectrometry/massspectrometry.rst b/docs/demo/massspectrometry/massspectrometry.rst
index e150253..54f7c44 100644
--- a/docs/demo/massspectrometry/massspectrometry.rst
+++ b/docs/demo/massspectrometry/massspectrometry.rst
@@ -34,15 +34,17 @@ you could run this demo by executing the following command:
.. code-block:: shell
cd ~/git/APE_UseCases
- java -jar APE-2.3.0-executable.jar MassSpectometry/No1/ape.configuration
+ java -jar APE-2.3.0-executable.jar MassSpectometry/No1/config.json
-The results of the synthesis would be:
+The results of the synthesis are stored under the directory
+specified in the configuration file (``solutions_dir_path`` parameter):
.. code-block:: shell
- MassSpectometry/No1/solutions.txt - First 100 candidate solutions in textual format
- MassSpectometry/No1/Figures/ - Data-flow figures corresponding to the first 10 solutions
- MassSpectometry/No1/Executables/ - Executable shell scripts corresponding to the first 6 solutions
+ solutions_dir_path/solutions.txt - First X candidate solutions in textual format, where X is the number of solutions specified in the config file (``solutions`` parameter)
+    solutions_dir_path/Figures/ - Workflow figures corresponding to the first Y solutions, where Y is the number of figures specified in the config file (``number_of_generated_graphs`` parameter, 0 if not specified)
+    solutions_dir_path/Executables/ - Executable shell scripts corresponding to the first Z solutions, where Z is the number of scripts specified in the config file (``number_of_execution_scripts`` parameter, 0 if not specified)
+    solutions_dir_path/CWL/ - CWL files corresponding to the first Q solutions, where Q is the number of CWL files specified in the config file (``number_of_cwl_files`` parameter, 0 if not specified)
.. [1] Magnus Palmblad, Anna-Lena Lamprecht, Jon Ison, Veit Schwämmle,
Automated workflow composition in mass spectrometry-based proteomics,
diff --git a/docs/developers/SLTLx_structure.png b/docs/developers/SLTLx_structure.png
new file mode 100644
index 0000000..d3e0960
Binary files /dev/null and b/docs/developers/SLTLx_structure.png differ
diff --git a/docs/specifications/cli.rst b/docs/developers/cli.rst
similarity index 100%
rename from docs/specifications/cli.rst
rename to docs/developers/cli.rst
diff --git a/docs/developers/developers.rst b/docs/developers/developers.rst
new file mode 100644
index 0000000..82ed482
--- /dev/null
+++ b/docs/developers/developers.rst
@@ -0,0 +1,30 @@
+SAT solving
+===========
+
+APE uses the MiniSAT solver (via the SAT4J library) to generate workflows. It creates a CNF (Conjunctive Normal Form) encoding of the problem for a given workflow length and passes it to the solver to find a first solution. Further solutions of the same length are found by adding a clause that excludes the previous solution (the negation of the previous solution is added to the specification). This process is repeated until the desired number of solutions is found. If not enough solutions exist for the given length, the length is increased and the process is repeated with a new CNF file that encodes the problem for the new length.
+
+The CNF encoding of the problem is used only internally, and comprises a set of clauses expressed as arrays of integers (each state statement is mapped to a positive integer, and negation is expressed as a negative value). However, the user can obtain the CNF file mapped back to a human-readable form by calling the ``setWriteLocalCNF(path)`` method with the path where the file should be stored. This writes the last CNF encoding used by the SAT solver to a file.
+
+.. code-block:: java
+
+ APE apeFramework = new APE(config);
+ apeFramework.getDomainSetup().setWriteLocalCNF("/home/cnf_encoding.txt");
+
+Upon execution, the file will be created in the given path, and the user can inspect the CNF encoding of the problem. The following is an example snippet of the CNF encoding of the problem:
+
+
+.. code-block::
+
+ -http://edamontology.org/Artic(Tool1) empty(Out1.2)
+ -http://edamontology.org/Comet(Tool1) &APE_label&Data&TSV&(In0.0)
+ -http://edamontology.org/halvade_somatic(Tool1) empty(Out1.1)
+ -http://edamontology.org/halvade_somatic(Tool1) empty(Out1.2)
+ ...
+
+The first constraint encodes that in case ``Artic`` (the tool name is Artic; ``http://edamontology.org/`` is the ontology prefix) is used as the first tool (Tool1), the 3rd output of the first tool (Out1.2) is empty, because the tool ``Artic`` has only 2 outputs. The second constraint specifies that in case ``Comet`` is used as the first tool, its first input (In0.0) must be of type ``Data`` and format ``TSV``, with just a default ``APE_label`` (data can be labelled to create stricter constraints). The third and fourth constraints specify that in case ``halvade_somatic`` is used as the first tool, the 2nd (Out1.1) and 3rd (Out1.2) outputs of the first tool must be empty, because the tool ``halvade_somatic`` has only one output.
+
+The indexing of the tools and tool inputs and outputs is visualized in the following figure. Note that the element out\ :sub:`X`\ :sup:`Y` in the figure is encoded as ``OutX.Y`` in CNF:
+
+.. image:: SLTLx_structure.png
+
+The initial ``Out`` data instances (``Out0.0``, ... , ``Out0.k-1``) are the workflow inputs (they can be seen as "outputs of the environment") and the last ``In`` data instances (``In0.0``, ... , ``In0.k-1``) are the workflow outputs (they can be seen as "inputs used by the subsequent environment"). Indices start from 0. The latest implementation (APE v2) is described in Chapter 4 of the `PhD thesis (V. Kasalica) `_.
\ No newline at end of file
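The iterative solve-and-block loop described in the new ``developers.rst`` above can be sketched as follows. This is an illustrative, self-contained Python sketch, not APE's implementation: a brute-force satisfiability check stands in for the MiniSAT/SAT4J solver, and the function names are hypothetical.

```python
from itertools import product

def solve(clauses, n_vars):
    """Brute-force SAT check: return a model as a set of signed integers,
    or None if unsatisfiable. (Stands in for the MiniSAT/SAT4J call.)"""
    for bits in product([False, True], repeat=n_vars):
        # Variable i+1 is positive if bits[i] is True, negative otherwise.
        model = {(i + 1) if bits[i] else -(i + 1) for i in range(n_vars)}
        if all(any(lit in model for lit in clause) for clause in clauses):
            return model
    return None

def enumerate_solutions(clauses, n_vars, limit):
    """Find up to `limit` models. After each model, add its negation as a
    blocking clause so the next call must yield a different model."""
    clauses = [list(c) for c in clauses]
    found = []
    while len(found) < limit:
        model = solve(clauses, n_vars)
        if model is None:
            break  # APE would now increase the workflow length and re-encode
        found.append(model)
        clauses.append([-lit for lit in model])  # exclude this model
    return found

# (x1 OR x2) AND (NOT x1 OR x2): exactly the models {-1, 2} and {1, 2}
models = enumerate_solutions([[1, 2], [-1, 2]], n_vars=2, limit=10)
```

In APE the literals are the integer-mapped state statements described above; once the loop exhausts the models for a given workflow length, the problem is re-encoded for a longer workflow.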
diff --git a/docs/basics/install.rst b/docs/developers/install.rst
similarity index 98%
rename from docs/basics/install.rst
rename to docs/developers/install.rst
index 02cfad0..8128559 100644
--- a/docs/basics/install.rst
+++ b/docs/developers/install.rst
@@ -1,5 +1,5 @@
-Install
-=======
+Install Java API
+================
Requirements
^^^^^^^^^^^^^^
diff --git a/docs/specifications/java.rst b/docs/developers/java.rst
similarity index 97%
rename from docs/specifications/java.rst
rename to docs/developers/java.rst
index 4d323b2..b48f168 100644
--- a/docs/specifications/java.rst
+++ b/docs/developers/java.rst
@@ -4,10 +4,10 @@ APE as a Java Library
Run APE from a Java environment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-Like the CLI, the APE API relies on a configuration file or object that references
+The APE library relies on a configuration file or object that references
the domain ontology, tool annotations, workflow specification and execution
parameters. All the parameters can either be set by a JSONObject/JSON file or
-be set programmatically.
+be set programmatically. Guidelines on how to create the domain configuration (and annotations) are provided on the `previous page `_.
APE API functions
^^^^^^^^^^^^^^^^^
@@ -103,8 +103,9 @@ Run the Synthesis
// write the solutions to the file system
APE.writeSolutionToFile(solutions); // write solutions to ./sat_solutions.txt
- APE.writeDataFlowGraphs(solutions, Rank.RankDir.TOP_TO_BOTTOM); // save images to ./Figures/
+    APE.writeDataFlowGraphs(solutions, Rank.RankDir.TOP_TO_BOTTOM); // save images to ./Figures/; alternatively, APE.writeTavernaDesignGraphs() generates figures that follow the visual design of Taverna workflows
APE.writeExecutableWorkflows(solutions); // save scripts to ./Executables/
+ APE.writeCWLWorkflows(solutions); // save CWL files to ./CWL/
The API allows to generate and edit the configuration file programmatically between runs:
diff --git a/docs/restful-ape/introduction.rst b/docs/restful-ape/introduction.rst
index c6611d3..d53da56 100644
--- a/docs/restful-ape/introduction.rst
+++ b/docs/restful-ape/introduction.rst
@@ -3,10 +3,10 @@
-Introduction to RESTful APE
-===========================
+Introduction to APE's RESTful API
+=================================
-RESTful APE (see on `GitHub `_) is a **RESTful API** for the **APE** library (see `documentation`_). It offers an interface for users to engage with APE's automated pipeline exploration capabilities through HTTP requests. APE, both a command line tool and a Java API, facilitates the automatic exploration of computational pipelines from extensive collections of computational tools.
+`RESTful APE `_ is a **RESTful API** for the **APE** library (see `documentation `_). It offers an interface for users to engage with APE's automated pipeline exploration capabilities through HTTP requests. APE, both a command line tool and a Java API, facilitates the automatic exploration of computational pipelines from extensive collections of computational tools.
With the RESTful API, users can submit requests to the APE server for pipeline exploration, receiving results in standardized formats like JSON or XML. This API can be accessed through a web browser or any HTTP client, allowing for integration into various applications for streamlined pipeline exploration.
@@ -14,8 +14,9 @@ With the RESTful API, users can submit requests to the APE server for pipeline e
**Key Features:**
- **Automated Pipeline Exploration:** Automate the exploration of computational pipelines from a broad collection of tools.
-- **Standardized Results:** Receive exploration results in JSON or XML formats.
+- **Standardized Results:** Receive exploration results in JSON format.
- **Flexible Integration:** Use APE through a web browser or any HTTP client, enabling easy integration into other applications.
+- **Docker Support:** Run RESTful APE as a Docker container for easy deployment and management (see `Docker image `_).
This makes the RESTful API for APE a potent and adaptable tool for enhancing scientific workflows with APE's capabilities.
diff --git a/docs/restful-ape/restful-api.rst b/docs/restful-ape/restful-api.rst
index 288d208..e7163fa 100644
--- a/docs/restful-ape/restful-api.rst
+++ b/docs/restful-ape/restful-api.rst
@@ -1,5 +1,5 @@
-Overview of RESTful APE API
-=======================================
+RESTful API Methods
+===================
This section offers a brief overview of the RESTful APE API, designed to facilitate automated pipeline exploration through HTTP requests. It serves as an initial sketch to familiarize users with the API's core functionalities and endpoints.
diff --git a/docs/specifications/constraints.rst b/docs/specifications/constraints.rst
new file mode 100644
index 0000000..5774611
--- /dev/null
+++ b/docs/specifications/constraints.rst
@@ -0,0 +1,258 @@
+Describe desired workflows
+==========================
+
+
+This page describes how to specify the workflows that the APE library should generate. The workflows are specified using a JSON configuration file, as described in the previous section. The initial description of the problem is given by the workflow inputs and outputs in the configuration file. In addition, the configuration file references a constraint file that lists the constraints each workflow must satisfy. The constraints guide the workflow synthesis process and ensure that the generated workflows meet the requirements of the problem.
+
+
+
+
+The constraint file is a JSON document that contains a list of constraints that must be satisfied by each generated workflow. If constraints contradict each other (e.g., "Use tool X" and "Never use tool X"), no solution will be found. The constraints can be specified in two ways: using predefined templates or using SLTL\ :sup:`x` (Semantic Linear Time Temporal Logic extended) formulas. The predefined templates cover common constraints on the relationships between operations in the workflow, while SLTL\ :sup:`x` formulas are more flexible and allow the user to define complex constraints as logical formulas. The following sections describe both options.
+
+Constraint Templates
+--------------------
+
+As an example, consider the "use operation" template (``use_m``), which is represented as follows:
+
+.. code-block:: json
+
+ {
+ "constraintid": "use_m",
+ "description": "Use the specified tool in the solution.",
+ "parameters": [
+ ["${parameter_1}"]
+ ]
+ }
+
+``"${parameter_1}"`` represents a tool or an abstract operation from the domain taxonomy. The following encoding shows such a constraint used in practice (the ``"description"`` tag is not obligatory):
+
+.. code-block:: json
+
+    {
+      "constraintid": "use_m",
+      "parameters": [
+        ["Transformation"]
+      ]
+    }
+
+The constraint is interpreted as:
+"A tool that performs Transformation must be used in the workflow."
+
+All pre-defined constraints that can be used:
+
+==================== ===========
+ID Description
+==================== ===========
+``ite_m`` If we use operation ``${parameter_1}``,
+
+ then use ``${parameter_2}`` subsequently.
+-------------------- -----------
+``itn_m`` If we use operation ``${parameter_1}``,
+
+ then do not use ``${parameter_2}`` subsequently.
+-------------------- -----------
+``depend_m`` If we use operation ``${parameter_1}``,
+
+ then we must have used ``${parameter_2}`` prior to it.
+-------------------- -----------
+``next_m`` If we use operation ``${parameter_1}``,
+
+ then use ``${parameter_2}`` as a next operation in the sequence.
+-------------------- -----------
+``prev_m`` If we use operation ``${parameter_1}``,
+
+ then we must have used ``${parameter_2}`` as a previous operation in the sequence.
+-------------------- -----------
+``use_m`` Use operation ``${parameter_1}`` in the solution.
+-------------------- -----------
+``nuse_m`` Do not use operation ``${parameter_1}`` in the solution.
+-------------------- -----------
+``last_m`` Use ``${parameter_1}`` as last operation in the solution.
+-------------------- -----------
+``use_t`` Use type ``${parameter_1}`` in the solution.
+-------------------- -----------
+``gen_t`` Generate type ``${parameter_1}`` in the solution.
+-------------------- -----------
+``nuse_t`` Do not use type ``${parameter_1}`` in the solution.
+-------------------- -----------
+``ngen_t`` Do not generate type ``${parameter_1}`` in the solution.
+-------------------- -----------
+``use_ite_t`` If we have used data type ``${parameter_1}``,
+
+ then use type ``${parameter_2}`` subsequently.
+-------------------- -----------
+``gen_ite_t`` If we have generated data type ``${parameter_1}``,
+
+ then generate type ``${parameter_2}`` subsequently.
+-------------------- -----------
+``use_itn_t`` If we have used data type ``${parameter_1}``,
+
+ then do not use type ``${parameter_2}`` subsequently.
+-------------------- -----------
+``gen_itn_t`` If we have generated data type ``${parameter_1}``,
+
+ then do not generate type ``${parameter_2}`` subsequently.
+-------------------- -----------
+``operation_input`` Use the operation with an input of the given type.
+-------------------- -----------
+``operation_output`` Use the operation to generate an output of the given type.
+-------------------- -----------
+``connected_op``     The 1st operation should generate an output used by the 2nd operation.
+-------------------- -----------
+``not_connected_op`` The 1st operation should never generate an output used by the 2nd operation.
+-------------------- -----------
+``not_repeat_op`` No operation that belongs to the subtree should be repeated within the workflow.
+==================== ===========
+
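A constraint file combines any number of such templates in a single JSON document. The sketch below is illustrative (the top-level ``constraints`` key is an assumption here, not shown in the templates above); read together, the two entries require a ``Transformation`` operation to be used, and forbid using another one after it:

```json
{
  "constraints": [
    {
      "constraintid": "use_m",
      "parameters": [["Transformation"]]
    },
    {
      "constraintid": "itn_m",
      "parameters": [["Transformation"], ["Transformation"]]
    }
  ]
}
```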
+.. _sltlx-constraints:
+SLTL\ :sup:`x` constraints
+--------------------------
+
+SLTL\ :sup:`x` (Semantic Linear Time Temporal Logic extended) allows the user to define constraints using logical formulas.
+For example, the following constraint prevents an operation within the subtree ``Transformation`` from using the same input twice:
+
+.. code-block:: json
+
+ {
+ "constraintid": "SLTLx",
+ "formula": "!F Exists (?x) (<'Transformation'(?x,?x;)> true)"
+ }
+
+The formula above can be broken down as follows:
+
+- ``!``: negation operator
+- ``F``: Finally operator - the formula holds at some point in the future (future in Time Logics can be seen as the following states in the workflow)
+- ``Exists``: Existential quantifier
+- ``?x``: variable
+- ``<'Transformation'(?x,?x;)> true``: applying an operation from the ``Transformation`` subtree with the same input ``?x`` in both input positions leads to a state where the formula holds (``true``). Note that the semicolon ``;`` separates the inputs from the outputs of the operation; no outputs are specified here.
+
+The formula above can be interpreted as: "It is not the case that in the workflow there exists an operation ``Transformation`` that uses the same input twice."
+
+The second example specifies a constraint that ensures a workflow input is used at most once.
+To tell APE which inputs must not be used twice, the workflow inputs are labeled "Input" in the run configuration file:
+
+.. code-block:: json
+
+ "inputs": [
+ {
+ "data_0006": ["data_9003"],
+ "format_1915": ["format_3989"],
+ "APE_label": ["Input"]
+ },
+ {
+ "data_0006": ["data_9003"],
+ "format_1915": ["format_3989"],
+ "APE_label": ["Input"]
+ },
+ {
+ "data_0006": ["data_9001"],
+ "format_1915": ["format_1929", "format_3331"],
+ "APE_label": ["Input"]
+ }
+ ],
+
+The labeled inputs can now be used in the SLTL\ :sup:`x` formula:
+
+.. code-block:: json
+
+ {
+ "constraintid": "SLTLx",
+ "formula": "! Exists (?x) ('Input'(?x) & (F <'Tool'(?x;)> F <'Tool'(?x;)> true))"
+ }
+
+In our example ``Tool`` is the root of the tool taxonomy, therefore it's the most general type of operation. The formula above can be broken down as follows:
+
+- ``!``: negation operator
+- ``Exists``: Existential quantifier
+- ``?x``: variable
+- ``'Input'(?x)``: the variable ``?x`` is of type ``Input``
+- ``&``: logical AND operator
+- ``F``: Finally operator - the formula holds at some point in the future (future in Time Logics can be seen as the following states in the workflow)
+- ``<'Tool'(?x;)> X``: after applying operation ``Tool`` with input ``?x`` we reach a state where the formula ``X`` holds (in our case, the formula ``X`` is ``F <'Tool'(?x;)> true``)
+- ``F <'Tool'(?x;)> true``: at some point in the future, the operation ``Tool`` with input ``?x`` is applied and the formula holds (true)
+
+The formula above can be interpreted as: "It is not the case that in the workflow there exists an input that is used twice."
+
+SLTL\ :sup:`x` syntax
+---------------------
+
+This section lists the syntax options available in the SLTL\ :sup:`x` logic.
+
+Formulas (``formula``)
+""""""""""""""""""""""
+
+A ``formula`` can be one of the following:
+
+1. ``true``
+2. ``( formula )``
+3. ``< TOOL > formula``
+4. ``CONSTANT ( VARIABLE )``
+5. ``VARIABLE = VARIABLE``
+6. ``! formula``
+7. ``Forall ( VARIABLE ) formula``
+8. ``Exists ( VARIABLE ) formula``
+9. ``UN_MODAL formula``
+10. ``formula BIN_CONNECTIVE formula``
+11. ``formula BIN_MODAL formula``
+12. ``R ( VARIABLE , VARIABLE )``
+
+Binary Connectives (``BIN_CONNECTIVE``)
+"""""""""""""""""""""""""""""""""""""""
+
+- ``&`` (AND)
+- ``|`` (OR)
+- ``->`` (IMPL)
+- ``<->`` (EQUIVALENT)
+
+Unary Modal Operators (``UN_MODAL``)
+""""""""""""""""""""""""""""""""""""
+
+- ``G`` (GLOBALLY)
+- ``F`` (FINALLY)
+- ``X`` (NEXT STEP)
+
+Binary Modal Operators (``BIN_MODAL``)
+""""""""""""""""""""""""""""""""""""""
+
+- ``U`` (``SLTL_UNTIL``)
+
+Tool (``TOOL``)
+"""""""""""""""
+
+A ``TOOL`` is defined as:
+
+``CONSTANT ( VARIABLE,...,VARIABLE ; VARIABLE,...,VARIABLE )``
+
+
+Variables (``VARIABLE``)
+""""""""""""""""""""""""
+
+- A variable is denoted by a ``?`` followed by alphanumeric characters or underscores.
+
+Tokens
+""""""
+
+- ``true``: The constant true.
+- ``VARIABLE``: A variable starting with ``?``.
+- ``CONSTANT``: A constant enclosed in single quotes ``'``.
+- ``R``: Relation.
+- ``U``: Until.
+- ``G``: Globally.
+- ``F``: Finally.
+- ``X``: Next.
+- ``|``: OR.
+- ``&``: AND.
+- ``->``: Implies.
+- ``<->``: Equivalent.
+- ``=``: Equals.
+- ``!``: Negation.
+- ``Exists``: Existential quantifier.
+- ``Forall``: Universal quantifier.
+
+Examples
+""""""""
+
+1. ``!F Exists (?x) (<'Transformation'(?x,?x;)> true)`` - No operation within the subtree ``Transformation`` uses the same input twice.
+2. ``! Exists (?x) ('Input'(?x) & (F <'Tool'(?x;)> F <'Tool'(?x;)> true))`` - No ``Input`` is used twice (where ``Input`` is a custom label added to all the inputs).
+
+See the `SLTLx Constraints <#sltlx-constraints>`_ section for a more detailed explanation of the constraints.
diff --git a/docs/specifications/setup.rst b/docs/specifications/domain.rst
similarity index 61%
rename from docs/specifications/setup.rst
rename to docs/specifications/domain.rst
index fa4d3b1..b6ff85b 100644
--- a/docs/specifications/setup.rst
+++ b/docs/specifications/domain.rst
@@ -1,12 +1,10 @@
-APE Setup
-=========
+Annotate your domain
+====================
Configuration file
^^^^^^^^^^^^^^^^^^
-In order to run APE from the command line, a (JSON) configuration file needs to be provided.
-The API requires a configuration object, which could be created programmatically
-or could also be created from a (JSON) configuration file.
+In order to run APE, a configuration needs to be provided, either as a JSON file or programmatically (when using APE as a Java library).
The file provides references to all required information, that can be classified in the following 3 groups:
1. `Domain model `_ - classification of the types and operations in the domain in form
@@ -61,8 +59,6 @@ The core configuration is structured as follows:
| | | where the inheritance has to be strictly specified, false if we |
| | | should consider all possible data traces (gives more solutions). |
+---------------------------------+----------+------------------------------------------------------------------+
-| ``cwl_annotations_path`` | No | Path to the YAML file that contains CWL tool annotations. |
-+---------------------------------+----------+------------------------------------------------------------------+
JSON example:
@@ -74,7 +70,6 @@ JSON example:
"toolsTaxonomyRoot": "ToolsTaxonomy",
"dataDimensionsTaxonomyRoots": ["TypesTaxonomy"],
"tool_annotations_path": "./GeoGMT/tool_annotations.json",
- "cwl_annotations_path": "./GeoGTM/cwl_annotations.yaml",
}
@@ -355,330 +350,31 @@ or the output of a previous tool.
CWL Annotations
^^^^^^^^^^^^^^^^^^
-The CWL annotations file specifies the the CWL code related to each tool
-to allow APE to generate executable CWL workflow files.
-
-Structure
-~~~~~~~~~
-
-The file has the following structure:
-
-.. code-block:: shell
-
- +ID:
- inputs:
- +input_definition
- ?implementation:
- code
-
-where (+) requires 1 or more, (?) requires 0 or 1, and no sign requires existence of exactly 1 such tag.
-
-+------------------+----------------------------------------------------------------------------------------------------+
-| Tag | Description |
-+==================+====================================================================================================+
-| ID | unique identifier of the tool |
-+------------------+----------------------------------------------------------------------------------------------------+
-| input_definition | CWL `WorkflowInputParameter `_ |
-+------------------+----------------------------------------------------------------------------------------------------+
-| code | CWL `WorkflowStep `_ |
-+------------------+----------------------------------------------------------------------------------------------------+
-
-Example
-~~~~~~~
-
-The following example annotates the tool ``black_white``,
-which takes any ``Image`` (Type) of any Format and outputs a grayscale image.
-As a regular shell command, it would look like this:
-
-.. code-block:: shell
-
- convert $input0 -colorspace Gray out.png
-
-This is the CWL annotation representing the command:
-
-.. code-block:: yaml
-
- black_white:
- inputs:
- - \@image\@: File
- implementation:
- black_white:
- in:
- image: \@input[0]
- out: [image_out]
- run:
- class: CommandLineTool
- baseCommand: convert
- arguments:
- - valueFrom: -colorspace Gray
- position: 1
- shellQuote: False
- - valueFrom: out.png
- position: 2
- inputs:
- image:
- type: File
- inputBinding:
- position: 0
- outputs:
- image_out:
- type: File
- outputBinding:
- glob: out.png
-
-Note that each input name should be surrounded by ``\@`` to tell APE this is the name.
-APE will generate unique names for the step inputs in the workflow and link the workflow inputs.
-
-Multiple steps in one tool
-""""""""""""""""""""""""""
-
-If you want to perform multiple steps in one tool,
-you can simply define multiple CWL steps in the implementation section of the annotation.
-For example, like the ``add_small_border`` tool:
-
-.. code-block:: shell
-
- height=$(($(identify -format '%h' $input0)/20))
- convert $input0 -bordercolor $input1 -border $height out.png
-
-This tool first calculates the height of the image in the step ``calc_height``,
-and then uses it to set the size of the border it gives to the image in step ``add_small_border``.
-``$input0`` represents the input image, and ``$input1`` represents the color of the border.
-
-.. code-block:: yaml
-
- add_small_border:
- inputs:
- - \@image\@: File
- - \@color\@: string
- implementation:
- # Step 1
- calc_height:
- in:
- image: \@input[0]
- out: [height]
- run:
- class: CommandLineTool
- baseCommand: identify
- stdout: out
- inputs:
- image:
- type: File
- inputBinding:
- position: 0
- prefix: -format '%h'
- shellQuote: False
- outputs:
- height:
- type: int
- outputBinding:
- glob: out
- loadContents: true
- outputEval: $(self[0].contents / 20)
- # Step 2
- add_small_border:
- in:
- image: \@input[0]
- color: \@input[1]
- height: calc_height/height
- out: [image_out]
- run:
- class: CommandLineTool
- baseCommand: convert
- arguments:
- - valueFrom: out.png
- position: 3
- inputs:
- image:
- type: File
- inputBinding:
- position: 0
- color:
- type: string
- inputBinding:
- position: 1
- prefix: -bordercolor
- height:
- type: int
- inputBinding:
- position: 2
- prefix: -border
- outputs:
- image_out:
- type: File
- outputBinding:
- glob: out.png
-
-Note that each input is numbered. Because the ``image`` input is listed first and ``color`` second,
-they are represented by ``\@input[0]`` and ``\input[1]`` respectively.
-It is important these inputs are placed in the same order as the inputs in the tool annotations file.
-
-Also note that you can put the ``\@input`` bindings wherever you want, and as many times as you want.
-APE will automatically fill them in later.
-
-Additional workflow input parameters
-""""""""""""""""""""""""""""""""""""
-
-Sometimes tools might only want to read some input parameter.
-To implement such a tool in the CWL annotations, add an annotation which does not have an implementation.
-For example, in ImageMagick there is a tool ``generate_color``.
-This tool only reads a color name given by the user, which can be used by other tools later.
-
-.. code-block:: yaml
-
- generate_color:
- inputs:
- - \@color\@:
- type: string
- default: Cyan
-
-Constraints File
-^^^^^^^^^^^^^^^^
-
-As an example we will present one of the constraint templates, namely "if then generate type" is represented as follows:
-
-.. code-block:: json
-
- {
- "constraintid": "gen_ite_t",
- "description": "If we have generated data type ``${parameter_1}``,
- then generate type ``${parameter_2}`` subsequently.",
- "parameters": [
- ["${parameter_1}"],
- ["${parameter_2}"]
- ]
- }
-
-where both ``"${parameter_1}"`` and ``"${parameter_2}"`` represent a sequence of one or more data terms. The following encoding represents a use of such constraint in practice (tag ``"description"`` is not obligatory):
+The CWL annotations file specifies the CWL code associated with each tool, allowing APE to generate executable CWL workflow files. Instead of providing the explicit commands within the domain annotations file, the user provides a path to the CWL file that contains the CWL code for the tool. The path can be a local path or a URL. An example of a tool annotation with a CWL reference is shown below:
.. code-block:: json
{
- "constraintid": "gen_ite_t",
- "parameters": [
- ["article","docx"],
- ["article","pdf"]
- ]
- }
-
-The constraint is interpreted as:
-"If an **article** in **docx** format was generated, then an **article** in **pdf** format has to be generated subsequently."
-
-All pre-defined constraints that can be used:
-
-==================== ===========
-ID Description
-==================== ===========
-``ite_m`` If we use module ``${parameter_1}``,
-
- then use ``${parameter_2}`` subsequently.
--------------------- -----------
-``itn_m`` If we use module ``${parameter_1}``,
-
- then do not use ``${parameter_2}`` subsequently.
--------------------- -----------
-``depend_m`` If we use module ``${parameter_1}``,
-
- then we must have used ``${parameter_2}`` prior to it.
--------------------- -----------
-``next_m`` If we use module ``${parameter_1}``,
-
- then use ``${parameter_2}`` as a next module in the sequence.
--------------------- -----------
-``prev_m`` If we use module ``${parameter_1}``,
-
- then we must have used ``${parameter_2}`` as a previous module in the sequence.
--------------------- -----------
-``use_m`` Use module ``${parameter_1}`` in the solution.
--------------------- -----------
-``nuse_m`` Do not use module ``${parameter_1}`` in the solution.
--------------------- -----------
-``last_m`` Use ``${parameter_1}`` as last module in the solution.
--------------------- -----------
-``use_t`` Use type ``${parameter_1}`` in the solution.
--------------------- -----------
-``gen_t`` Generate type ``${parameter_1}`` in the solution.
--------------------- -----------
-``nuse_t`` Do not use type ``${parameter_1}`` in the solution.
--------------------- -----------
-``ngen_t`` Do not generate type ``${parameter_1}`` in the solution.
--------------------- -----------
-``use_ite_t`` If we have used data type ``${parameter_1}``,
-
- then use type ``${parameter_2}`` subsequently.
--------------------- -----------
-``gen_ite_t`` If we have generated data type ``${parameter_1}``,
-
- then generate type ``${parameter_2}`` subsequently.
--------------------- -----------
-``use_itn_t`` If we have used data type ``${parameter_1}``,
-
- then do not use type ``${parameter_2}`` subsequently.
--------------------- -----------
-``gen_itn_t`` If we have generated data type ``${parameter_1}``,
-
- then do not generate type ``${parameter_2}`` subsequently.
--------------------- -----------
-``operation_input`` Use the operation with an input of the given type.
--------------------- -----------
-``operation_output`` Use the operation to generate an output of the given type.
--------------------- -----------
-``connected_op`` The 1st operation should generate an output used bt the 2nd operation.
--------------------- -----------
-``not_connected_op`` The 1st operation should never generate an output sued by the 2nd operation.
--------------------- -----------
-``not_repeat_op`` No operation that belongs to the subtree should be repeated over.
-==================== ===========
-
-SLTLx constraints
-~~~~~~~~~~~~~~~~~
-
-SLTLx (Semantic Linear Time Temporal Logic extended) allows the user to define constraints using logical formulas.
-For example, the following constraint prevents an operation within the subtree ``operation_0004`` from using the same input twice:
-
-.. code-block:: json
-
- {
- "constraintid": "SLTLx",
- "formula": "!F Exists (?x1) (<'operation_0004'(?x1,?x1;)> true)"
- }
-
-TODO: breakdown of formula above.
-
-This second example specifies a constraint which makes sure a workflow input is used only once.
-To tell APE which inputs are not to be used twice, the workflow inputs have been labeled as "Input" in the run configuration file:
-
-.. code-block:: json
-
- "inputs": [
- {
- "data_0006": ["data_9003"],
- "format_1915": ["format_3989"],
- "APE_label": ["Input"]
- },
- {
- "data_0006": ["data_9003"],
- "format_1915": ["format_3989"],
- "APE_label": ["Input"]
- },
- {
- "data_0006": ["data_9001"],
- "format_1915": ["format_1929", "format_3331"],
- "APE_label": ["Input"]
- }
- ],
-
-The labeled inputs can now be used in the SLTLx formula:
-
-.. code-block:: json
-
- {
- "constraintid": "SLTLx",
- "formula": "! Exists (?x) ('Input'(?x) & (F <'operation_0004'(?x;)> F <'operation_0004'(?x;)> true))"
+ "outputs": [
+ {
+ "format_1915": ["http://edamontology.org/format_2330"],
+ "data_0006": ["http://edamontology.org/data_2872"]
+ }
+ ],
+ "inputs": [
+ {
+ "format_1915": ["http://edamontology.org/format_3747"],
+ "data_0006": ["http://edamontology.org/data_0945"]
+ }
+ ],
+ "taxonomyOperations": [
+ "http://edamontology.org/operation_3434", "http://edamontology.org/operation_0335"
+ ],
+
+ "implementation": { "cwl_reference": "https://raw.githubusercontent.com/Workflomics/containers/main/cwl/tools/protXml2IdList/protXml2IdList.cwl" },
+ "label": "protXml2IdList",
+ "id": "protXml2IdList"
}
-TODO: breakdown of formula above.
-
-SLTLx syntax
-""""""""""""
-TODO: talk about SLTLx syntax.
+The example illustrates a tool called ``protXml2IdList`` that takes an input of type ``data_0945`` in format ``format_3747`` and outputs data of type ``data_2872`` in format ``format_2330``.
+Notice that each data instance in the example is defined as a pair of a data type (``data_0006``) and a format (``format_1915``). The operations, data types and formats are referenced by their EDAM ontology URIs.
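+
+For a rough impression of what the referenced CWL file contains, below is a minimal, hypothetical ``CommandLineTool`` sketch; the actual ``protXml2IdList.cwl`` file referenced above may differ in its command and bindings:
+
+.. code-block:: yaml
+
+    # Hypothetical minimal CWL tool description (for illustration only)
+    cwlVersion: v1.2
+    class: CommandLineTool
+    baseCommand: protxml2idlist
+    inputs:
+      prot_xml:
+        type: File
+        inputBinding:
+          position: 0
+    outputs:
+      id_list:
+        type: File
+        outputBinding:
+          glob: out.txt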
diff --git a/index.rst b/index.rst
index 1dd1354..d5ffc7d 100644
--- a/index.rst
+++ b/index.rst
@@ -1,7 +1,7 @@
Welcome to APE’s documentation!
===============================
-APE (Automated Pipeline Explorer) is a library (available as CLI, Java API and a RESTful API) for the automated exploration of possible computational pipelines (scientific workflows) from large collections of computational tools. It also comes with a RESTful API interface. APE was originally developed at Utrecht University and is now maintained by the Netherlands eScience Center.
+`APE (Automated Pipeline Explorer) `_ is a library (available as a CLI, a Java API and a RESTful API) for the automated exploration of possible computational pipelines (scientific workflows) from large collections of computational tools. APE was originally developed at Utrecht University and is now maintained by the Netherlands eScience Center.
@@ -36,28 +36,26 @@ Contents
.. toctree::
:maxdepth: 2
- :caption: Basics
+ :caption: Getting Started
docs/basics/ape-introduction
docs/basics/gettingstarted
- docs/basics/install
.. toctree::
:maxdepth: 2
- :caption: APE Specifications
+ :caption: Configure APE
- docs/specifications/setup
- docs/specifications/java
- docs/specifications/cli
+ docs/specifications/domain
+ docs/specifications/constraints
.. toctree::
:maxdepth: 2
- :caption: Use cases and Demos
+ :caption: Using APE
- docs/demo/demo-overview
- docs/demo/imagemagick/imagemagick
- docs/demo/geo_gmt/geo_gmt
- docs/demo/massspectrometry/massspectrometry
+ docs/developers/install
+ docs/developers/java
+ docs/developers/cli
+ docs/developers/developers
.. toctree::
:maxdepth: 2
@@ -67,6 +65,16 @@ Contents
docs/restful-ape/gettingstarted
docs/restful-ape/restful-api
+
+.. toctree::
+ :maxdepth: 2
+ :caption: Use cases and Demos
+
+ docs/demo/demo-overview
+ docs/demo/imagemagick/imagemagick
+ docs/demo/geo_gmt/geo_gmt
+ docs/demo/massspectrometry/massspectrometry
+
.. toctree::
:maxdepth: 2
:caption: APE Web