GitHub - guardian/flexible-snapshotter: Lambdas that snapshot the Flexible content CMS

Flexible Content Snapshotter (v2)

The repo is composed of two lambdas: the scheduler and the snapshotter.

Snapshotter schematic

                 +-----------+    Requests     +-------------+       +----+
                 |           |                 |             |       |    |
CloudWatch +-----> Scheduler +-----> SNS +-----> Snapshotter +-------> S3 |
CRON             |           |                 |             |       |    |
                 +-----+-----+                 +-----+-------+       +----+
                       |                             |
        Content ^5m?   |                             |  Content
                       |         +----------+        |
                       +--------->          <--------+
                                 | Flexible |
                                 | API      |
                                 |          |
                                 +----------+

Scheduler

The scheduler runs every five minutes, triggered by a CloudWatch event cron. On each run it queries the Flexible Content API for any content that has changed in the last five minutes.

For each content ID that has been modified, it posts a message to an SNS topic giving the content ID and the reason (scheduled snapshot).
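The scheduler's two steps (work out the five-minute window, then build one request per modified content ID) might look roughly like the sketch below. It is illustrative only: `SchedulerSketch`, `SnapshotRequest`, `windowStart`, `requestsFor`, `toJson` and the JSON field names are all assumptions, not names taken from this repo, and the calls to the Flexible API and to SNS are left out.

```scala
import java.time.{Duration, Instant}

object SchedulerSketch {
  // Hypothetical message shape: the README only says that a message carries
  // the content ID and a reason.
  final case class SnapshotRequest(contentId: String, reason: String)

  // Lower bound of the "changed in the last five minutes" window.
  def windowStart(now: Instant, window: Duration = Duration.ofMinutes(5)): Instant =
    now.minus(window)

  // One request per modified content ID returned by the API query.
  def requestsFor(modifiedIds: Seq[String], reason: String = "Scheduled snapshot"): Seq[SnapshotRequest] =
    modifiedIds.map(id => SnapshotRequest(id, reason))

  // Hand-rolled JSON to keep the sketch dependency-free. Assumes IDs and
  // reasons contain no characters that need escaping.
  def toJson(request: SnapshotRequest): String =
    s"""{"contentId":"${request.contentId}","reason":"${request.reason}"}"""
}
```

In a real lambda each string produced by `toJson` would be the body of one SNS publish call; that part is omitted so the sketch stays free of AWS SDK dependencies.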

Snapshotter

The snapshotter lambda fires on each request sent to the SNS topic. For each incoming request, it calls the Flexible API to get a complete copy of the piece of content at that moment (both live and preview facets).

It then stores the result as-is in the configured S3 bucket under the key <contentId>/<timestamp>.json. Alongside the data file, the snapshotter stores a second file, <contentId>/<timestamp>.info.json, which contains metadata (such as the reason given in the request) and certain summary fields extracted from the main JSON. These summary fields provide useful information about the available snapshots without having to load them all into memory.
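As a rough illustration of the key layout described above, the pair of S3 keys for one snapshot could be derived as follows. The names `SnapshotKeys`, `Keys` and `keysFor` are hypothetical, and the timestamp format (an ISO-8601 instant) is an assumption; the README does not say how the timestamp is rendered.

```scala
import java.time.Instant

object SnapshotKeys {
  // The data file and its companion metadata file share a common prefix.
  final case class Keys(data: String, info: String)

  // Assumption: the timestamp is rendered as an ISO-8601 instant
  // (Instant.toString). The real format may differ.
  def keysFor(contentId: String, at: Instant): Keys = {
    val prefix = s"$contentId/${at.toString}"
    Keys(data = s"$prefix.json", info = s"$prefix.info.json")
  }

  // True for the small metadata files, which can be read on their own
  // without fetching the full snapshot.
  def isInfoKey(key: String): Boolean = key.endsWith(".info.json")
}
```

Because both files share the `<contentId>/` prefix, listing that prefix and keeping only the keys for which `isInfoKey` is true would give the metadata for every snapshot of a piece of content without fetching the full documents.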

Running in AWS

Use the CloudFormation template in the platform repo to stand this up for production.

Running locally

There are two main classes that can be used to run the lambdas locally for troubleshooting. This doesn't exercise the configuration mechanism, as the config is provided in the main classes themselves.
