bcgov · TimCsaky · Apr 19, 2024 · Apr 17, 2024 · Apr 18, 2024 · Apr 18, 2024
diff --git a/docs/Architecture-Hosted.md b/docs/Architecture-Hosted.md
@@ -0,0 +1,45 @@
+This page outlines the architecture and deployment features of the BC Gov Hosted COMS service. It is mainly intended for a technical audience and for readers who want a better understanding of how the service is deployed.
+
+**Note:** For more details of the COMS application itself and how it works, see the [Architecture](Architecture) overview.
+
+## Table of Contents
+
+- [Infrastructure](#infrastructure)
+- [High Availability](#high-availability)
+- [Network Connectivity](#network-connectivity)
+- [Database Connection Pooling](#database-connection-pooling)
+- [Horizontal Autoscaling](#horizontal-autoscaling)
+
+## Infrastructure
+
+The BC Gov Hosted COMS service runs on the OpenShift container ecosystem. The following diagram provides a logical overview of how the main components relate. Main network traffic flows are shown as thick arrows, while secondary network relations are shown as simple black lines.
+
+![Hosted COMS Architecture](images/coms_architecture.png)
+
+**Figure 1 - The general infrastructure and network topology of the BC Gov Hosted COMS service**
+
+### High Availability
+
+The COMS API and Database are both designed to be highly available within an OpenShift environment. The Database achieves high availability by leveraging [Patroni](https://patroni.readthedocs.io/en/latest/). COMS is designed to be a scalable and atomic microservice. On the OCP4 platform, there can be between 2 and 16 running replicas of the COMS microservice, depending on service load. This allows the service to reliably handle a wide range of request volumes and scale resources appropriately.
+
+### Network Connectivity
+
+In general, all network traffic enters through the BC Gov API Gateway. A specifically tailored Network Policy rule allows only the network traffic we expect to receive from the API Gateway. When a client connects to the COMS API, the connection passes through OpenShift's router and load balancer before landing on the API Gateway. It is then forwarded to one of the COMS API pod replicas. Figure 1 shows the general direction of network traffic with the outlined thick arrows. The direction of each arrow indicates which component initiates the TCP/IP connection.
+
+COMS uses a database connection pool to maintain persistent database connections. Pooling allows the service to avoid the overhead of repeating the TCP/IP 3-way handshake for every new connection. By reusing existing connections in the pool, we can pipeline traffic and improve network efficiency. Within our architecture, we pool connections from COMS to Patroni. The OpenShift load balancer follows general default Kubernetes scheduling behavior.
+
+### Database Connection Pooling
+
+We introduced connection pooling for Patroni connections to reduce network traffic overhead. As our traffic volume increased, it became expensive to create and destroy a network connection for each transaction. While low traffic volumes cause no notable delay for the user, we started encountering issues when scaling up the total transaction flow within COMS.
+
+By reusing connections whenever possible, we were able to avoid the TCP/IP 3-way handshake performed on every new connection. Instead, we could leverage existing connections to pipeline traffic and improve general efficiency. We observed up to almost a 3x increase in total transaction throughput after switching to pooling.
+
+### Horizontal Autoscaling
+
+To make sure our application can scale horizontally (run many copies of itself), we had to ensure that all processes in the application are self-contained and atomic. Since we have no guarantee of which pod instance will handle which task at any given moment, every unit of work must be clearly defined and atomic so that we avoid deadlocks and double executions.
+
+While implementing Horizontal Autoscaling is relatively simple by using a [Horizontal Pod Autoscaler](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/) construct in OpenShift, we can only take advantage of it if the application is able to handle the different types of lifecycles. Based on usage metrics such as CPU and memory load, the HPA can increase or decrease the number of replicas on the platform in order to meet the demand.
+
+In our testing, we were able to reliably scale up to around 17 pods before our Patroni database began to crash. While we have not been able to reliably isolate the cause, we suspect either that the underlying Postgres database can only handle up to 100 concurrent connections (and is thus ignoring Patroni's max connection limit of 500), or that the database containers are simply running out of memory before they can handle more connections. This is why we decided to cap our HPA at a maximum of 16 pods at this time.
+
+Our current limiting factor for scaling higher is the number of connections our database can support. If we need to scale past 16 pods, we will need to consider more managed solutions for pooling database connections, such as [PgBouncer](https://www.pgbouncer.org/).
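The connection-limit hypothesis above can be explored with simple arithmetic. The following is only a sketch: `maxReplicasForLimit` is a hypothetical helper, and the per-pod pool size of 6 is an assumed value chosen for illustration, not the actual COMS configuration. It shows how a per-pod pool maximum and a database connection limit together bound the number of replicas.

```javascript
// Hypothetical helper: how many replicas fit under a database connection limit
// if every replica may open up to `poolMaxPerPod` pooled connections.
function maxReplicasForLimit(connectionLimit, poolMaxPerPod, reservedConnections = 0) {
  if (poolMaxPerPod <= 0) throw new RangeError('poolMaxPerPod must be positive');
  const usable = Math.max(0, connectionLimit - reservedConnections);
  return Math.floor(usable / poolMaxPerPod);
}

// Assumed pool size of 6 per pod (illustrative, not taken from the COMS config).
const underPostgresLimit = maxReplicasForLimit(100, 6); // suspected effective limit
const underPatroniLimit = maxReplicasForLimit(500, 6);  // configured Patroni limit
console.log({ underPostgresLimit, underPatroniLimit });
```

Under these assumptions, a 100-connection limit admits 16 replicas while a 500-connection limit admits 83, so measuring the real pool size per pod would help tell the two suspected causes apart.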
diff --git a/docs/Architecture.md b/docs/Architecture.md
@@ -0,0 +1,57 @@
+This page outlines the general architecture and design principles of COMS. It is mainly intended for a technical audience, and for people who want to have a better understanding of how the system works.
+
+## Table of Contents
+
+- [Infrastructure](#infrastructure)
+- [Database Structure](#database-structure)
+- [Code Design](#code-design)
+
+## Infrastructure
+
+![COMS Architecture](images/coms_self_architecture.png)
+
+**Figure 1 - The general infrastructure and network topology of COMS**
+
+## Database Structure
+
+The PostgreSQL database is written and handled via managed, code-first migrations. We generally store tables containing users, objects, buckets, permissions, and how they relate to each other. As COMS is a back-end microservice, lines of business can leverage COMS without being tied to a specific framework or language. The following figures depict the database schema structure as of April 2023 for the v0.4.0 release.
+
+![COMS Public ERD](images/coms_erd_public.png)
+
+**Figure 3 - The public schema for a COMS database**
+
+Database design focuses on simplicity and succinctness. It effectively tracks the user, the object, the bucket, the permissions, and how they relate to each other. We enforce foreign key integrity by invoking onUpdate and onDelete cascades in Postgres. This ensures that we do not have dangling references when entries are removed from the system. Metadata and tags are represented as many-to-many relationships to maximize reverse search speed.
+
+![COMS Audit ERD](images/coms_erd_audit.png)
+
+**Figure 4 - The audit schema for a COMS database**
+
+We use a generic audit schema table to track any update and delete operations done on the database. This table is only modified by the database via table triggers, and is not normally accessible by the COMS application itself. This should meet most general security, tracking and auditing requirements.
+
+## Code Design
+
+COMS is a relatively small and compact microservice with a very focused approach to handling and managing objects. However, not all design choices are self-evident just from inspecting the codebase. The following section will cover some of the main reasons why the code was designed the way it is.
+
+### Organization
+
+The code in COMS follows a simple, layered structure based on best practice recommendations from Express, Node, and ES6 coding styles. The application has the following discrete layers:
+
+| Layer      | Purpose                                                                                      |
+| ---------- | -------------------------------------------------------------------------------------------- |
+| Controller | Contains Express controller logic for determining which services to invoke and in what order |
+| DB         | Contains the direct database table model definitions and typical modification queries        |
+| Middleware | Contains middleware functions for handling authentication, authorization and feature toggles |
+| Routes     | Contains the Express routes that define the COMS API shape and invoke controllers            |
+| Services   | Contains logic for interacting with either S3 or the Database for specific tasks             |
+| Validators | Contains logic which examines and enforces incoming request shapes and patterns              |
+
+Each layer is designed to focus on one specific aspect of business logic. Calls between layers are designed to be deliberate, scoped, and contained. This hopefully makes it easier to tell at a glance what each piece of code is doing and what it depends on. For example, the validation layer sits between the routes and controllers. It ensures that incoming network calls are properly formatted before proceeding with execution.
+
+#### Middleware
+
+COMS middleware focuses on ensuring that the appropriate business logic filters are applied as early as possible. Concerns such as feature toggles, authentication and authorization are handled here. Express executes middleware in the order of introduction. Each middleware function executes in sequence and then invokes the `next` callback as part of the call stack. Because of this, we must ensure that the order in which we introduce and execute our middleware adheres to the following pattern:
+
+1. Run the `require*` middleware functions first (these generally invoke the middleware found in `featureToggle.js`)
+2. Validation and structural checks
+3. Permission and authorization checks
+4. Any remaining middleware hooks before invoking the controller
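The ordering above can be illustrated with a small, dependency-free sketch. It does not use Express itself: `runChain` is a hypothetical stand-in that mimics how Express-style `(req, res, next)` handlers run in registration order, and every stage name and check below is a placeholder rather than actual COMS code.

```javascript
// Minimal stand-in for an Express-style chain: handlers run in the order given,
// and the chain stops as soon as a handler does not call next().
function runChain(handlers, req, res) {
  let index = 0;
  const next = () => {
    const handler = handlers[index++];
    if (handler) handler(req, res, next);
  };
  next();
}

// Runs a request through the stages, recording which stages executed.
// All stage names and acceptance checks are hypothetical placeholders.
function handle(req) {
  const trace = [];
  const res = { rejectedBy: null };
  const stage = (name, accept = () => true) => (rq, rs, next) => {
    trace.push(name);
    if (accept(rq)) next();
    else rs.rejectedBy = name; // short-circuit: later stages never run
  };

  runChain([
    stage('requireFeature'),                                    // 1. require* toggles
    stage('validate', (rq) => typeof rq.objectId === 'string'), // 2. structural checks
    stage('checkPermission', (rq) => rq.user === 'alice'),      // 3. authorization
    stage('remainingHooks'),                                    // 4. other middleware
    stage('controller'),                                        // finally, the controller
  ], req, res);

  return { trace, rejectedBy: res.rejectedBy };
}

console.log(handle({ objectId: 'abc', user: 'alice' })); // runs every stage
console.log(handle({ user: 'alice' }));                  // stops at 'validate'
```

Placing the cheap feature-toggle and validation stages first means a malformed request is rejected before any permission lookup is attempted.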
diff --git a/docs/Authentication.md b/docs/Authentication.md
@@ -0,0 +1,42 @@
+This page describes how to authenticate requests to the COMS API. The [Authentication Modes](Configuration.md#authentication-modes) must be enabled in the COMS configuration.
+
+**Note:** The BC Gov Hosted COMS service only allows OIDC Authentication using JWTs issued by the [Pathfinder SSO `standard` keycloak realm](https://github.com/bcgov/sso-keycloak/wiki#standard-service).
+
+## OIDC Authentication
+
+With [OIDC mode](Configuration.md#oidc-keycloak) enabled, requests to the COMS API can be authenticated using a **User ID token** (JWT) issued by an OIDC authentication realm. The JWT should be added in an Authorization header (type `Bearer` token).
+
+COMS will only accept JWTs issued by one OIDC realm (specified in the COMS config). JWTs are typically issued to an application and saved to a user's browser when they sign in to a website through the [Authorization Code Flow](https://openid.net/specs/openid-connect-core-1_0.html#CodeFlowAuth). Both the website (client app) and the instance of COMS must be [configured to use the same OIDC authentication realm](https://github.com/bcgov/common-object-management-service/blob/master/app/README.md#keycloak-variables) in order for the JWT to be valid.
+
+When COMS receives the request, it will validate the JWT (by calling the OIDC realm's token endpoint). The JWT is a reliable way of verifying the user's identity, on which the COMS permission model is based.
+
+The authentication when downloading an object also uses S3 pre-signed URLs:
+
+### Authentication flow for readObject
+
+See the [API Specification](https://coms.api.gov.bc.ca/api/v1/docs#tag/Object/operation/readObject) for more details.
+
+A common use case for COMS is to download a specific object from object storage.
+Depending on the `download` mode specified in the request, the COMS `readObject` endpoint will return one of the following:
+
+1. The file directly from S3, by first doing an HTTP 302 redirect to a temporary pre-signed S3 object URL
+2. The file streamed/proxied through COMS
+3. The temporary pre-signed S3 object URL itself
+
+COMS uses the redirect flow by default because it avoids unnecessary network hops. For very large object transfers, redirection has the added benefit of maximizing COMS microservice availability. Since the large transfer does not pass through COMS, the service remains available to handle other client requests.
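As a rough sketch of how a client might select one of these behaviours, the helper below builds a `readObject` request URL. The `/object/{objectId}` path and the `proxy` and `url` values for `download` are assumptions made for illustration and should be confirmed against the API Specification; `buildReadObjectUrl` itself is a hypothetical name.

```javascript
// Assumed values for the `download` query parameter (verify against the API spec).
const DOWNLOAD_MODES = new Set(['proxy', 'url']);

// Hypothetical helper: builds a readObject URL for the given object.
function buildReadObjectUrl(baseUrl, objectId, downloadMode) {
  const base = baseUrl.replace(/\/+$/, ''); // tolerate a trailing slash
  const url = new URL(`${base}/object/${encodeURIComponent(objectId)}`);
  if (downloadMode !== undefined) {
    if (!DOWNLOAD_MODES.has(downloadMode)) {
      throw new RangeError(`Unsupported download mode: ${downloadMode}`);
    }
    url.searchParams.set('download', downloadMode);
  }
  // With no download mode set, the default redirect flow applies.
  return url.toString();
}

console.log(buildReadObjectUrl('https://coms.api.gov.bc.ca/api/v1', 'abc'));
console.log(buildReadObjectUrl('https://coms.api.gov.bc.ca/api/v1', 'abc', 'url'));
```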
+
+![COMS Network Flow](images/coms_network_flow.png)
+
+**Figure 2 - The general network flow for a typical COMS object request**
+
+## Basic Auth
+
+If [Basic Auth Mode](Configuration.md#basic-auth) is enabled in your COMS instance, requests to the COMS API can be authenticated using an HTTP Authorization header (type `Basic`) containing the username and password configured in COMS.
+
+This mode offers more direct access for a 'service account' authorized in the scope of the application rather than for a specific user, and it bypasses the COMS object/bucket permission model.
+
+Basic Auth mode is not available on the BC Gov hosted COMS service.
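The two header-based modes described on this page differ only in the `Authorization` scheme. A minimal sketch, assuming a Node.js client; the helper names `bearerAuthHeader` and `basicAuthHeader` are hypothetical, and the credentials shown are placeholders.

```javascript
// OIDC mode: the JWT is sent as a Bearer token.
function bearerAuthHeader(jwt) {
  return { Authorization: `Bearer ${jwt}` };
}

// Basic Auth mode: base64 of "username:password", per the HTTP Basic scheme.
function basicAuthHeader(username, password) {
  const credentials = Buffer.from(`${username}:${password}`, 'utf8').toString('base64');
  return { Authorization: `Basic ${credentials}` };
}

console.log(bearerAuthHeader('<jwt>'));
console.log(basicAuthHeader('user', 'pass'));
```

Either object can be passed as the `headers` option of an HTTP client when calling a COMS instance that has the corresponding mode enabled.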
+
+## Unauthenticated Mode
+
+[Unauthenticated Mode](Configuration.md#unauthenticated-auth) is generally recommended when you expect to run COMS in a highly secured network environment and have no concerns about access control to objects because another application already handles it.
diff --git a/docs/Buckets.md b/docs/Buckets.md
@@ -0,0 +1,18 @@
+
+### Configuring Buckets
+
+- COMS is [configured with a 'default' bucket](Configuration.md#object-storage). Various object management endpoints will use this bucket if no `bucketId` parameter is provided. (**Note:** the default bucket fall-back behaviour is not available in the BC Gov Hosted COMS service.)
+
+- Additional buckets can be added to the COMS system using the [createBucket](https://coms.api.gov.bc.ca/api/v1/docs#tag/Bucket/operation/createBucket) endpoint.
+
+- When a bucket is created, if the createBucket API request is authenticated with a User ID token (JWT), that user will be granted all [5 permissions](Permissions.md#permission-codes). Bucket Permissions can be granted to other users ([bucketAddPermissions](https://coms.api.gov.bc.ca/api/v1/docs#tag/Permission/operation/bucketAddPermissions)), if the request is authenticated with a JWT for a user with `MANAGE` permission.
+
+If you are self-hosting COMS, you can also manage permissions for any object or bucket by using these endpoints with [basic authentication](Authentication.md#basic-auth).
+
+### Using the Bucket **Key**
+
+When you create a bucket in COMS, you are technically 'mounting' your S3 bucket (the actual bucket provisioned) at the path specified in the `key` property of the [createBucket](https://coms-dev.api.gov.bc.ca/api/v1/docs#tag/Bucket/operation/createBucket) request body.
+
+COMS will only operate with objects at that 'folder' within the actual bucket. A COMS `bucket` can more accurately be thought of as a 'mount' to a single path within a bucket.
+
+To work with objects in 'sub-folders' (with other prefixes), you can create multiple COMS 'buckets' mounted at different paths by specifying different keys.
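The 'mount' behaviour described above can be sketched as a simple prefix check. This is an illustration of the concept only, not COMS's actual implementation: `isAtMount` is a hypothetical helper, and the keys are made-up examples.

```javascript
// Hypothetical helper: true when `objectKey` sits directly in the folder that a
// COMS bucket with the given `key` is mounted at (sub-folders do not count).
function isAtMount(mountKey, objectKey) {
  const trimmed = mountKey.replace(/^\/+|\/+$/g, ''); // drop leading/trailing slashes
  const prefix = trimmed === '' ? '' : `${trimmed}/`;
  if (!objectKey.startsWith(prefix)) return false;
  const rest = objectKey.slice(prefix.length);
  return rest.length > 0 && !rest.includes('/');
}

console.log(isAtMount('photos', 'photos/cat.png'));           // directly in the mount
console.log(isAtMount('photos', 'photos/2024/cat.png'));      // in a sub-folder
console.log(isAtMount('photos/2024', 'photos/2024/cat.png')); // second mount covers it
```

In this model, reaching `photos/2024/cat.png` requires a second COMS bucket whose `key` is `photos/2024`, which matches the multiple-mount approach described above.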