Docs #3

Open
wants to merge 7 commits into
base: develop
16 changes: 16 additions & 0 deletions docs/source/developer/index.rst
@@ -9,3 +9,19 @@ The following pages are for those wanting to develop the render service.

render
queue


Overview
=============================================================================

Both the CS Unplugged and CS Field Guide projects include resources which need to be generated on the fly.
These range from converting a page into a printer-friendly state, to generating unique classroom worksheets using a base template.
To avoid putting unnecessary load on the client, we do this using our "Render Service".

This system is split into two parts: the ``queueservice`` and the ``renderservice`` itself.

The ``queueservice`` is essentially a temporary workaround until Google Task Queue v2 is released, assuming v2 can be run locally.
During local development, it receives tasks from a client and passes them to the ``renderservice``.

The ``renderservice`` is the component that actually generates resources, based on the tasks it receives from the task queue.
The ``renderservice`` also contains the logic for the resources used in the CS Unplugged project.
46 changes: 25 additions & 21 deletions docs/source/developer/queue.rst
@@ -3,46 +3,50 @@ Queue Service

The queue service is an image that is run during local development only. It provides a rough implementation of the Google TaskQueue RESTful API for access by the render service and external task producers.

This should disappear with `Task Queue v2 <https://cloud.google.com/appengine/docs/standard/python/taskqueue/rest/migrating-from-restapi-v1>`_ assuming there is a good way to run it locally.
This is just a placeholder and should disappear with `Task Queue v2 <https://cloud.google.com/appengine/docs/standard/python/taskqueue/rest/migrating-from-restapi-v1>`_ assuming there is a suitable way to run it locally.

Infrastructure
==============================================================================

When running locally using the *docker-compose* environment the Google Task Queue component is replaced with 2 other components, a Redis instance and a Queue service.
When running locally using the docker-compose environment, the Google Task Queue component is replaced with two other components: a Redis instance and a Queue service.

Queue Service
------------------------------------------------------------------------------

The Queue Sevice mimics the `Google Task Queue REST API <https://cloud.google.com/appengine/docs/standard/python/taskqueue/rest/>`_ allowing for a local task queue to be created using the Redis instance.
The Queue Service mimics the `Google Task Queue REST API <https://cloud.google.com/appengine/docs/standard/python/taskqueue/rest/>`_, allowing a local task queue to be created using the Redis instance.

Important files:

.. code-block:: none

queueservice/
├── api_data/
| ├── __init__.py
| ├── taskqueue_v1beta2.py
| └── taskqueue_v1beta2.api
├── Dockerfile
├── gunicorn.conf.py
├── requirements.txt
├── webserver.py
└── wsgi.py


render
├── dev/
│   └── queue_client.py
└── queueservice/
    ├── api_data/
    │   ├── taskqueue_v1beta2.py
    │   └── taskqueue_v1beta2.api
    ├── Dockerfile-local
    ├── gunicorn.conf.py
    ├── requirements.txt
    ├── webserver.py
    └── wsgi.py


- ``queue_client.py``: A basic API for use in local development to interact with the Queue Service.
- ``api_data/``: Contains pairs of API specifications and Python Implementation.

+ ``taskqueue_v1beta2.py``: The python implementation of the taskqueue api for version 1beta2.
+ ``taskqueue_v1beta2.api``: Google API description of the taskqueue REST API from the Google Discovery Service. This file has been modified to remove authorization scoping.
- ``taskqueue_v1beta2.py``: The Python implementation of the taskqueue API for version 1beta2.
- ``taskqueue_v1beta2.api``: Google API description of the taskqueue REST API from the Google Discovery Service. This file has been modified to remove authorization scoping.

- ``Dockerfile``: Dockerfile for building the webservice.
- ``Dockerfile-local``: Dockerfile for building the webservice.
- ``gunicorn.conf.py``: Gunicorn configuration.
- ``requirements.txt``: Specifies the Python modules required to run the webservice.
- ``webserver.py``: Basic discovery webservice which allows for the loading of custom REST APIs.
- ``webserver.py``: Basic Flask app working as a discovery webservice which allows for the loading of custom REST APIs.
- ``wsgi.py``: Gunicorn + Docker entrypoint for the Queue service.

When using the Queue service it is important to note:

.. note::

- We do not expect this component to be changed much, and it is likely to be replaced in future by `Google Cloud Tasks <https://cloud.google.com/appengine/docs/flexible/python/migrating>`_.
- It is not a one-to-one mapping of the Google Task Queue REST API as it does not include ``GET`` on a specific Task Queue.
@@ -53,6 +57,6 @@ When using the Queue service it is important to note:
Redis Instance
------------------------------------------------------------------------------

The REDIS service is currently only used by the Queue service as a datastore for tasks and handling the queuing of tasks. For those who with no knowledge of REDIS should consider it a 'high performance, in-memory database that is a glorified dictionary' for simplicity.
The Redis service is currently used only by the Queue service, as a datastore for tasks and to handle the queuing of tasks. For those who are not familiar with Redis, consider it a 'high performance, in-memory database that is a glorified dictionary' for simplicity.

For information on working with Redis see the `Redis documentation <https://redis.io/commands>`_.
132 changes: 69 additions & 63 deletions docs/source/developer/render.rst
@@ -1,19 +1,77 @@
Render Service
##############################################################################

The render service pipeline flows as follows; It accesses an external queue to get a task, consumes the task creating a resource, saves the resource and then repeats this process. Input tasks that are consumed are tagged with :code:`task` and output is handled by tasks with the :code:`result` tag.
The render service pipeline flows as follows: it accesses an external queue to get a task, consumes the task to create a resource, saves the resource, and then repeats this process.
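
The loop described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the render service's actual daemon: the in-memory ``queue.Queue``, the ``render_resource`` function, and the ``saved`` dictionary are hypothetical stand-ins for the external queue, the resource generators, and the storage backend.

```python
import json
import queue


def render_resource(task):
    """Hypothetical stand-in for a resource generator: returns (filename, bytes)."""
    filename = "{}.pdf".format(task["resource_slug"])
    content = "rendered {}".format(task["resource_name"]).encode("utf-8")
    return filename, content


def run_pipeline(task_queue, saved):
    """Consume tasks until the queue is empty, saving each produced resource."""
    while True:
        try:
            payload = task_queue.get_nowait()  # 1. get a task from the queue
        except queue.Empty:
            break  # a real daemon would wait and poll again instead of stopping
        task = json.loads(payload)
        filename, content = render_resource(task)  # 2. consume the task
        saved[filename] = content  # 3. save the resource, then repeat


# Assumed example data: one task in an in-memory queue.
tasks = queue.Queue()
tasks.put(json.dumps({"resource_slug": "demo", "resource_name": "Demo"}))
saved = {}
run_pipeline(tasks, saved)
```

Stopping when the queue is empty keeps the sketch finite; a long-running daemon would instead keep polling the queue.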

Infrastructure
==============================================================================

The render service consists of multiple processes on a single unit: several daemons that consume tasks from an external queue and produce files, and a webserver that performs health checks to monitor and restart the render daemons.
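
The monitor-and-restart idea can be illustrated with a small sketch. It assumes threads in place of the real daemon processes, and the names ``start_daemon``, ``health_check``, ``render-1`` and ``render-2`` are hypothetical; the actual webserver logic lives in ``render/webserver/`` and may work quite differently.

```python
import threading


def start_daemon(name, work):
    """Start a worker thread running ``work`` (a stand-in for a render daemon)."""
    thread = threading.Thread(target=work, name=name, daemon=True)
    thread.start()
    return thread


def health_check(daemons, work):
    """Restart every daemon that is no longer alive; return the restarted names."""
    restarted = []
    for name, thread in list(daemons.items()):
        if not thread.is_alive():
            daemons[name] = start_daemon(name, work)
            restarted.append(name)
    return restarted


stop = threading.Event()
daemons = {
    "render-1": start_daemon("render-1", stop.wait),     # healthy: blocks until stopped
    "render-2": start_daemon("render-2", lambda: None),  # "crashed": exits immediately
}
daemons["render-2"].join()  # make sure the crashed daemon has really exited

restarted = health_check(daemons, stop.wait)
all_alive = all(thread.is_alive() for thread in daemons.values())
stop.set()  # let the remaining workers finish
```

In this sketch only the dead worker is replaced; healthy workers are left untouched by the health check.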

Important files:

.. code-block:: none

renderservice/
├── render/
│ ├── daemon/
│ ├── resources/
│ ├── tests/
│ ├── webserver/
│ └── __init__.py
├── scripts/
│ ├── docker-entrypoint.sh
│ ├── mount-bucket.sh
│ └── shutdown-script.sh
├── static/
├── templates/
├── Dockerfile
├── Dockerfile-local
└── requirements.txt


- ``render/``: The python render service package.

- ``daemon/``: Contains python classes pertaining to the daemon for consuming tasks and producing files.
- ``resources/``: Contains source files with custom logic for generating resources (pdf files).
- ``tests/``: Tests covering all the logic in the python render service package.
- ``webserver/``: Contains the webserver logic, including logic for health checks and daemon recovery.
- ``__init__.py``: Contains the version of the render service.

- ``scripts/``: Bash shell scripts used in the creation of the render service.

- ``docker-entrypoint.sh``: The entrypoint for the render service, creates multiple daemons and starts up the webservice.
- ``mount-bucket.sh``: Mounts the Google Cloud bucket using `gcsfuse <https://cloud.google.com/storage/docs/gcs-fuse>`_.
- ``shutdown-script.sh``: TODO: This is still to be used. A script which is run when the machine is pre-empted.

- ``static/``: Locally stored static files, either kept locally for speed or licence reasons (such as do not distribute).
- ``templates/``: Jinja templates for webpages and render service.
- ``Dockerfile``: Dockerfile for building the service.
- ``Dockerfile-local``: Dockerfile for building the service for local development.
- ``requirements.txt``: Specifies the Python modules required to run the webservice.

Some important things to note when working with the render service:

- In local development the render service does not have a live volume of the renderservice directory, which means any change requires a rebuild of the service to take effect.

- The render service has multiple directories for static files: a local copy and a mounted external copy. The static folder in the root directory of the repository is mounted as the external copy when run locally.



Task Definitions
==============================================================================

Tasks retrieved from the render queue must be a json dictionary. That is, using the json library python must be able to load the payload as a dictionary. This dictionary must also contain a :code:`kind` mapping which specifies what can be done with the message.
Tasks retrieved from the render queue must be a JSON dictionary.
That is, Python's ``json`` library must be able to load the payload as a dictionary.
This dictionary must also contain a ``kind`` mapping, which specifies what can be done with the message.
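
The two requirements above can be checked with a few lines of Python. The helper name ``load_task`` and the ``"example"`` value are hypothetical and used only for illustration; the render service may perform this validation differently.

```python
import json


def load_task(payload):
    """Parse a raw queue payload and check it is a JSON dictionary with a ``kind`` key."""
    task = json.loads(payload)  # raises ValueError if the payload is not valid JSON
    if not isinstance(task, dict):
        raise ValueError("task payload must be a JSON dictionary")
    if "kind" not in task:
        raise ValueError("task payload must contain a 'kind' mapping")
    return task


task = load_task('{"kind": "example"}')
```

A payload such as ``'[1, 2, 3]'`` is valid JSON but not a dictionary, so it is rejected just like a dictionary without ``kind``.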

General Tasks
------------------------------------------------------------------------------

These tasks must be tagged with the :code:`task` string when added to the queue. The render service only consumes tasks that are tagged with :code:`task`.
These tasks must be tagged with the ``task`` string when added to the queue.
The render service only consumes tasks that are tagged with ``task``.

To render a resource you must use the :code:`render` task as defined below.
To render a resource you must use the ``render`` task as defined below.

.. code-block:: none

@@ -27,13 +85,15 @@ To render a resource you must use the :code:`render` task as defined below.
copies: 1
}

Where all of the above are required for every task, and for each resource additional values may be required based on their :code:`valid_options` function.
All of the above are required for every task; each resource may require additional values based on its ``valid_options`` function.

The :code:`resource_view` determines the resource module to generate from, the :code:`resource_slug` and :code:`resource_name` are arbitrary strings, the :code:`url` is preferably the url where the resource was generated from (including query), the :code:`header_text` is a string either an empty string or arbitrary, and finally the :code:`copies` determines how many to generate.
The ``resource_view`` determines the resource module to generate from; ``resource_slug`` and ``resource_name`` are arbitrary strings; ``url`` is preferably the URL the resource was generated from (including the query string); ``header_text`` is either an empty or an arbitrary string; and ``copies`` determines how many copies to generate.
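
As an illustration, a render task payload could be assembled like this. The helper ``build_render_task``, the ``kind`` value ``"render"``, and all example values (slug, view name, URL, ``paper_size``) are assumptions made for this sketch, not taken from the render service code.

```python
import json


def build_render_task(resource_slug, resource_name, resource_view, url,
                      header_text="", copies=1, **options):
    """Return a JSON payload for a render task (hypothetical helper)."""
    task = {
        "kind": "render",  # assumed value for the ``kind`` mapping
        "resource_slug": resource_slug,
        "resource_name": resource_name,
        "resource_view": resource_view,
        "url": url,
        "header_text": header_text,
        "copies": copies,
    }
    task.update(options)  # resource-specific values, see ``valid_options``
    return json.dumps(task)


payload = build_render_task(
    "example-resource",
    "Example Resource",
    "example_view",
    "https://example.com/resources/example-resource?paper_size=a4",
    copies=2,
    paper_size="a4",
)
```

The resulting payload would then be added to the queue with the ``task`` tag.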

Result Tasks
------------------------------------------------------------------------------
These tasks must be tagged with the :code:`result` string when added to the queue. The render service produces these tasks when a generate task has been completed to instruct other services where to find the output file.

These tasks must be tagged with the ``result`` string when added to the queue.
The render service produces these tasks when a generate task has been completed to instruct other services where to find the output file.

For documents that are small enough to be placed within the queue, the following task will be defined:

@@ -46,7 +106,7 @@ For documents that are small enough to be placed within the queue, the following
document: base64 string
}

Where :code:`success` is a boolean determining if the associated task was completed correctly, :code:`filename` is the filename of document, :code:`document` is a base64 encoded string of the document bytes.
Here ``success`` is a boolean indicating whether the associated task completed correctly, ``filename`` is the filename of the document, and ``document`` is a base64-encoded string of the document bytes.
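
A consumer of such a result has to decode the ``document`` field back into bytes. The sketch below shows the round trip with Python's ``base64`` module; the helper names are hypothetical, and any fields of the result task other than the three described above are left out.

```python
import base64


def make_document_result(filename, document_bytes):
    """Build the result fields for a document small enough for the queue."""
    return {
        "success": True,
        "filename": filename,
        "document": base64.b64encode(document_bytes).decode("ascii"),
    }


def read_document_result(result):
    """Return (filename, bytes) from a result, or raise if the task failed."""
    if not result["success"]:
        raise RuntimeError("the associated render task did not complete")
    return result["filename"], base64.b64decode(result["document"])


result = make_document_result("example.pdf", b"%PDF-1.4 example bytes")
filename, document = read_document_result(result)
```

Encoding to an ASCII string keeps the result serialisable as JSON, which cannot carry raw bytes.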

Another possible result is a document that is saved externally and accessed through a URL; these tasks are defined as follows:

@@ -58,59 +118,5 @@ Another possible result is the is a document that is saved externally and a url
url: string
}

Where :code:`success` is a boolean determining if the associated task was completed correctly, and :code:`url` is the address to access the document.

Infrastructure
==============================================================================

The render service consists of multiple processes on a single unit, this includes multiple daemons that consume tasks from an external queue and produce files, and a webserver performs health checks that monitor and restart the render daemons.

Important files:

.. code-block:: none

renderservice/
├── render/
| ├── daemon/
| ├── resources/
| ├── tests/
| ├── webserver/
| └── __init__.py
├── scripts/
| ├── docker-entrypoint.sh
| ├── mount-bucket.sh
| ├── pip-install.sh
| └── shutdown-script.sh
├── static/
├── templates/
├── Dockerfile
├── Dockerfile-local
├── requirements.txt


- ``render/``: The python render service package.

+ ``daemon/``: Contains python classes pertaining to the daemon for consuming tasks and producing files.
+ ``resources/``: Contains source files with custom logic for generating resources (pdf files).
+ ``tests/``: Tests covering all the logic in the python render service package.
+ ``webserver/``: Contains the webserver logic, including logic for health checks and daemon recovery.
+ ``__init__.py``: Contains the version of the render service.

- ``scripts/``: Bash shell scripts used in the creation of the render service.

+ ``docker-entrypoint.sh``: The entrypoint for the render service, creates multiple daemons and starts up the webservice.
+ ``mount-bucket.sh``: Mounts the Google Cloud bucket using `gcsfuse <https://cloud.google.com/storage/docs/gcs-fuse>`_.
+ ``pip-install.sh``: Installs a pip requirements file in a specific order.
+ ``shutdown-script.sh``: TODO: This is still to be used. A script which is run when the machine is pre-empted.

- ``static/``: Locally stored static files, either kept locally for speed or licence reasons (such as do not distribute).
- ``templates/``: Jinja templates for webpages and render service.
- ``Dockerfile``: Dockerfile for building the service.
- ``Dockerfile-local``: Dockerfile for building the service for local development.
- ``requirements.txt``: Specifies required python modules needed to run the webservice.
Here ``success`` is a boolean indicating whether the associated task completed correctly, and ``url`` is the address at which the document can be accessed.

Some important things to note when working with the render service:

- When in local development the render service does not have a live volume of the renderservice directory, that mean any changes require a rebuild of the service to see the changes.

- The render service has multiple directories for static files, a local copy and a mounted external copy. The static folder in the root directory of the repository is mounted as the external copy when run locally.