18.3.5 versioned docs
etcart committed Sep 20, 2024
1 parent d5f711b commit 04ba7f9
Showing 205 changed files with 12,824 additions and 0 deletions.
36 changes: 36 additions & 0 deletions website/versioned_docs/version-v18.3.5/README.md
@@ -0,0 +1,36 @@
---
id: cumulus-docs-readme
slug: /
title: Introduction
hide_title: false
---

The Cumulus project addresses the existing need for a “native” cloud-based data ingest, archive, distribution, and management system that can be used for all future Earth Observing System Data and Information System (EOSDIS) data streams. The term “native” implies that the system leverages all components of the cloud infrastructure provided by the vendor for efficiency (in terms of both processing time and cost). Additionally, Cumulus will operate on future data streams involving satellite missions, aircraft missions, and field campaigns.

This documentation includes guidelines, examples, and source code docs. It is accessible at <https://nasa.github.io/cumulus>.

---

## Navigating the Cumulus Docs

### Get To Know Cumulus

* Getting Started - [here](getting-started.md) - If you are new to Cumulus we suggest that you begin with this section to help you understand and work in the environment.
* General Cumulus Documentation - [here](README.md) <- you're here

### Cumulus Reference Docs

* Cumulus API Documentation - [here](https://nasa.github.io/cumulus-api)
* Cumulus Developer Documentation - [here](https://github.com/nasa/cumulus) - READMEs throughout the main repository.
* Data Cookbooks - [here](data-cookbooks/about-cookbooks.md)

### Auxiliary Guides

* Integrator Guide - [here](integrator-guide/about-int-guide.md)
* Operator Docs - [here](operator-docs/about-operator-docs.md)

---

## Contributing

Please refer to <https://github.com/nasa/cumulus/blob/master/CONTRIBUTING.md> for information on contributing. We thank you in advance.
19 changes: 19 additions & 0 deletions website/versioned_docs/version-v18.3.5/adding-a-task.md
@@ -0,0 +1,19 @@
---
id: adding-a-task
title: Contributing a Task
hide_title: false
---

We're tracking reusable Cumulus tasks [in this list](tasks.md) and, if you've got one you'd like to share with others, you can add it!

Right now we're focused on tasks distributed via npm, but we are open to including others. For now, the script that pulls the data for each package only supports npm.

## The tasks.md file is generated in the build process

The tasks list in docs/tasks.md is generated from the list of task package names from the tasks folder.

:::caution

Do not edit the docs/tasks.md file directly.

:::
7 changes: 7 additions & 0 deletions website/versioned_docs/version-v18.3.5/api.md
@@ -0,0 +1,7 @@
---
id: api
title: Cumulus API
hide_title: false
---

Read the Cumulus API documentation at [https://nasa.github.io/cumulus-api](https://nasa.github.io/cumulus-api)
72 changes: 72 additions & 0 deletions website/versioned_docs/version-v18.3.5/architecture.md
@@ -0,0 +1,72 @@
---
id: architecture
title: Architecture
hide_title: false
---

## Architecture

The diagram below shows the components that make up an instance of Cumulus.

![Architecture diagram of a Cumulus deployment](assets/cumulus-arch-diagram-2023.png)

This diagram details all of the major architectural components of a Cumulus deployment.

While the diagram can feel complex, it can be broken down into several major components:

### Data Distribution

End users can access data via Cumulus's `distribution` submodule, which includes ASF's [thin egress application](https://github.com/asfadmin/thin-egress-app). This application provides authenticated data egress, temporary S3 links, and other statistics features.

#### Data search

End user exposure of Cumulus's holdings is expected to be provided by an external service.

For NASA use, this is assumed to be [CMR](<https://earthdata.nasa.gov/eosdis/science-system-description/eosdis-components/cmr>) in this diagram.

### Data ingest

#### Workflows

The core of the ingest and processing capabilities in Cumulus is built into the deployed AWS [Step Function](https://aws.amazon.com/step-functions/) workflows. Cumulus rules trigger workflows via CloudWatch rules, Kinesis streams, SNS topics, or SQS queues. The workflows then run with a configured [Cumulus message](./workflows/cumulus-task-message-flow), using built-in processes to report the status of granules, PDRs, executions, etc. to the [Data Persistence](#data-persistence) components.

Workflows can optionally report granule metadata to [CMR](<https://earthdata.nasa.gov/eosdis/science-system-description/eosdis-components/cmr>), and workflow steps can report metrics information to a shared SNS topic, which could be subscribed to for near real time granule, execution, and PDR status. This could be used for metrics reporting using an external ELK stack, for example.

#### Data persistence

Cumulus entity state data is stored in a [PostgreSQL](https://www.postgresql.org/) compatible database, and is exported to an Elasticsearch instance for non-authoritative querying/state data for the API and other applications that require more complex queries.

#### Data discovery

Discovering data for ingest is handled via workflow step components using Cumulus `provider` and `collection` configurations and various triggers. Data can be ingested from AWS S3, FTP, HTTPS and more.

#### Database

Cumulus utilizes a user-provided PostgreSQL database backend. For improved API search query efficiency Cumulus provides data replication to an Elasticsearch instance.

##### PostgreSQL Database Schema Diagram

![ERD of the Cumulus Database](assets/db_schema/relationships.real.large.png)

### Maintenance

System maintenance personnel have access to manage ingest and various portions of Cumulus via an [AWS API gateway](<https://aws.amazon.com/api-gateway/>), as well as the operator [dashboard](https://github.com/nasa/cumulus-dashboard).

## Deployment Structure

Cumulus is deployed via [Terraform](https://www.terraform.io/) and is organized internally into two separate top-level modules, as well as several external modules.

### Cumulus

The [Cumulus module](https://github.com/nasa/cumulus/tree/master/tf-modules/cumulus), which contains multiple internal submodules, deploys all of the Cumulus components that are not part of the `Data Persistence` portion of this diagram.

### Data persistence

The [data persistence](https://github.com/nasa/cumulus/tree/master/tf-modules/data-persistence) module provides the `Data Persistence` portion of the diagram.

### Other modules

Other modules are provided as artifacts on the [release](https://github.com/nasa/cumulus/releases) page for users who are configuring their own deployment. They contain extracted subcomponents of the [cumulus](#cumulus) module. For more on these components, see the [components documentation](deployment/components).

For more on the specific structure, usage examples, and how to deploy, please see the [deployment](deployment) docs as well as the [cumulus-template-deploy](https://github.com/nasa/cumulus-template-deploy) repo.
3 changes: 3 additions & 0 deletions website/versioned_docs/version-v18.3.5/assets/interfaces.svg

Large diffs are not rendered by default.

@@ -0,0 +1 @@
<mxfile host="app.diagrams.net" modified="2021-10-18T20:02:11.331Z" agent="5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.82 Safari/537.36" etag="w9ykoA1aViqPtbmP1pFI" version="14.7.6" type="device"><diagram id="9OBVV0XeZTobMLSr5Xny" name="Page-1">7V1dd6I8Hv80nrN7oQcSIOGyre289ll3nN3p7M0cKqkyRbAQqj6ffhMBgSRWZgYQTzsXowkQ8Pd/fwkdwKvl5l3krBa3oUv8AdDczQCOBwAYFsTsg89s0xmAgJbOzCPPTef0YmLq/U2yyfy0xHNJXDmRhqFPvVV1chYGAZnRypwTReG6etpD6FfvunLmRJqYzhxfnv3muXSRzmJTK+bfE2++yO+sa9mRpZOfnE3EC8cN16UpeD2AV1EY0vTbcnNFfI5ejkt63c2Bo/sHi0hA61xgfrucP6K/Pr57WHxcRNf2KridDLNVnh0/yX5w9rB0myMQhUngEr6INoCX64VHyXTlzPjRNSM6m1vQpc9GOvsqP1R+BxJRsilNZQ/5joRLQqMtOyU7OtQty0wv2uYzmpGBuC5owKmQTi7KBNBgdqqTUX6+v0MBDvuS4aPG6mZC119/jqffby//Hq+/+D8/f/1P41g9eL5/FfphxMZBGJA9fBJWCkQPwpdfkWNnIxk6M2ffMnK21gBwSiazGgbOdeLF7tzmOM4QUbN0CTXLVqCGcQOo+bd4NtzcTscucv63CaznT7fvh6grdvtT7CSOUwgrhFCBXY5649jh49iRwL3g9qFAo8JWJeAYONH2jg9Gmgbzie8c9RHKh+NNRoV0tC2PJiTy2O8iUTZ5EHbiVoyRDHpZhhXMmM9FxHeo91w1YSqQsztMQo89SYWmWpWmEAurxGESzUh2YdnqSGsBjBhuurTeCGrFP1BdnTrRnFBp9R0v7OH4fYUEjrNHmFDfC8jV3q3QyhzCB1xGPOYofHbuiT8JY496YcCO3YeUhsvSCRe+N+cHaMgl0slGM0Z8zhFlTmMuworff7mZc3dq5KxjOIqDWJDeAYBj+wIZ/BJ2ouuxpdqQbMOGgmQDWSsia6Qww8Vs49YEKohn+ZSTx3uuENF6SrhzdfkQBnQY71zLC3YCMFabHUT5cfZtzj+nf02540bimDuF2aLsGXfrpqdIfMLApAIRaRQ+EoEeCuUrMYLIL0vPdfltlIq9qvpdL2JcmvIfk0xOm92vzvxpYDTFEiY0qiyBc8VeYgndtGWOAEZL/FDDTvZHmB/ZY8SeQqBvTMyxbVmgMR4JrjUyDNnRwQpb3ZZzmAd6LcjzpwxsvkZEnOV5iXTZG2lHlg1BvVsmzNmjLM0IKaS5La+3RpDVH2mOn3pjmk3Lqmma8ysbJ53sVU3/zQ3qU0LYBM+9RHyJYJXQwszGPRW/diTO0kT9a5imQuaADjqUOeOcZG7mh4m7duhsIYseMu0r/rvbFT3dEokIIVYQEYNOxc+UxW/GyJP4jEQsJGSf7OMfz57DPq72IP7zNcmfAYWwVrdVhNN1RWzdmvTJ2bF/BWRIvSUpkc0LnhkJOCFTAl5MPrwqylm6VaUc0ORQVDcV8taep6JyXPuYoAOaXcUO2brK5kBVOh2DluCzz8no+M7y3nVOFLXpQI7a0Mm9PV3lqafxFdcAtQM3KWrLHENG9ThZEu4yfk7RL6K39Aa9Dt/KWvDFnO8vcYIh+Y+WaaksmKayYA1k5NSsoErJ9VMT4ip6ho7qakLQRHlMDV8N/7tUrZj5Thx7sypiZOPRu9L3tDphZqOiOMEH29JALE3sax1DbaRp1qBS7NAtc9BkuSOtIQwO54YyHNJqQA1T3Jv6CRTrHczkVlf5hfqJKawllc2
aq5ao0ZUji57wJ6jyp2E1W447yp/oXPnT0KQYFiM0QoKFqM+kulgwxBoaGahbPq3RZ9A5n2pVFkXHNGgtZnwpc3qGzKiLytK0GuNDhDWxEtk2G55VQep0KWxdTMZAWw7puw5q5MYRVQ6bbMgs4QR5Vdlr3UJVggGkzJ6Zilaf9rIwNfIIfYg99h2jLydhIFJpWtRA7VxdsqmRw+qRrqJk9eMhCWap6J1EbSFNLLzZcARkQnZde5OzMVfJMvETXv1eh9Hjg8+cnpLmekWKCyEgVd3YlIJoQO8ygZzLWtf+6kFMj7uYGfv2xXW0Rc/R1H/Xc8RilA31bqMXUCOJVpcbCg74XjpyIHrZRyqjfWzyvXys+UAlo/7xSKVn7Ca7rKZtiNFF/VhFw8JqliGt1jbTNZF6PBDqHs0V5qor76reqy+sw5dZVhFw/z471k7i9I0bARb8ccP67bgZAKG+atgdq7/z2O4AbFDFSbXZQV0/QK35ETV2O/TIiT+R5w6ZRw7EnKeFRhhL9Cs5h833tKtJKEexEVmFEb0uZRq0tD+d4c5U8Ktx36FuS9VvC4x0Rdt5LpvdtO2ovPdfq39D60Dj8rciYNu1L9PIm89JtOv/cQL+/y4f5Z5tSRxaTSWlTCw0t+iaLgd2Fuiyh9k6vEWlSdZIY3neQfvauQAbuqgjlGwAO20Qs1TOdfNs4DrUyRVFajQKxXAf/dFdXj1nwTrKxcj3RnbDVWdVXnJW3o+5Q8na2Z4mX8sMhJjYydsnTpetRU3viG8r2DHldjNQu2GqtYAHnUusKOyVhNbpoXuLFWvFilCKFU1ct+W2xTgRHYoT30VOkPjkFUeJkqoysLZvaDpZlIjPq8DaI3kDyvp45xKXew+SxE3cKC672blr/CZ/ueeFcQ/kT5Wl6Z+rYAp96bYBFMCpevpb8xTweTT1i8hZpqHqSeoYuxpltR5iB/PY8oTIyZHRlzHXp9dF44oWkVkYua9HudpCbQ5hueak57tIulGrchiRkinzQ9+INDSQXFjqlki27HumRJqMv7wRaMgcEeXe75ZIpHylpGwWeHfBNBsyL3MRzsPA8a+LWQGo4pzPIXf6dxT7SSjdZvA5CQ0HDfWMCa0PR/st7hcxouMf28fx3e2lfxc9xBZRtv8cft+mTM7afRN1KfPiU/bchA+h2FdsAEVDgtKIa6CJqEmN3lm9xueUu7uBIYe+hl4/9G0iR60moSpyamJ/9/Rh+hRfPzPMvobj+y87K6SMpJu42ZGt42da3mpsRzkwxayLKmJvazO5MdlcJR+eHs2x9222MT/Mtfc3dV4W3FZH4kFEy6Yqvrj55AfXk/8u7n88/QzuPhpakr9vti+tgYbYWYrErZi1OwMl+7J/YWDzjYEvQdt7OwykTQbYQMpNBq29XUCJ31m9Z+VkuxFB0Xq9Jx9Q1g/aM8NK8in8UNWOxDQn7QXzMzBeTe3rAUjuDNSwSuRa25KoJlkN5/fsdnK8pJvL5vHFYKov9hGKu8RNUf3Wto/7VOh+KVNYqjn7qIySa6Tp+6Pfuacv6XdwjeAlPqbf/+iPYui6sGvT1nSlcgcy04lvvm8s+XNWr908HeWETS6WJuc3OiVbjbfhvJFNIhvU5Ib9hsjGhsVfUUrVavHHqOD1/wE=</diagram></mxfile>
@@ -0,0 +1,106 @@
---
id: cloudwatch-retention
title: Cloudwatch Retention
hide_title: false
---

Our lambdas dump logs to [AWS CloudWatch](https://aws.amazon.com/cloudwatch/). By default, these logs exist indefinitely. However, there are ways to specify a duration for log retention.

## aws-cli

In addition to getting your aws-cli [set-up](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html), there are two values you'll need to acquire.

1. `log-group-name`: the name of the log group whose retention policy (retention time) you'd like to change. We'll use `/aws/lambda/KinesisInboundLogger` in our examples.
2. `retention-in-days`: the number of days you'd like to retain the logs in the specified log group for. There is a list of possible values available in the [aws logs documentation](https://docs.aws.amazon.com/cli/latest/reference/logs/put-retention-policy.html).

For example, if we wanted to set log retention to 30 days on our `KinesisInboundLogger` lambda, we would write:

```bash
aws logs put-retention-policy --log-group-name "/aws/lambda/KinesisInboundLogger" --retention-in-days 30
```

:::note more about the aws-cli log command

The aws-cli log command that we're using is explained in detail [here](https://docs.aws.amazon.com/cli/latest/reference/logs/put-retention-policy.html).

:::
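
When several log groups need the same retention period, the command above can be wrapped in a small loop, and the result can be checked with `aws logs describe-log-groups`. The following is a minimal sketch, not part of Cumulus: the function names `set_retention_for_groups` and `show_retention` are hypothetical, and it assumes the aws-cli is already configured with permission to modify and list the log groups.

```bash
# Hypothetical helper: apply one retention period (in days) to several log groups.
# Usage: set_retention_for_groups <days> <log-group-name>...
set_retention_for_groups() {
  local days="$1"
  shift
  local group
  for group in "$@"; do
    # Stop at the first failure so a typo in a group name is not silently skipped.
    aws logs put-retention-policy \
      --log-group-name "$group" \
      --retention-in-days "$days" || return 1
  done
}

# Hypothetical helper: print each log group matching a name prefix
# together with its current retentionInDays value.
show_retention() {
  aws logs describe-log-groups \
    --log-group-name-prefix "$1" \
    --query 'logGroups[].[logGroupName,retentionInDays]' \
    --output text
}
```

For example, running `set_retention_for_groups 30 "/aws/lambda/KinesisInboundLogger"` and then `show_retention "/aws/lambda/KinesisInboundLogger"` should list the log group alongside its new retention period.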

## AWS Management Console

Changing the log retention policy in the AWS Management Console is a fairly simple process:

1. Navigate to the CloudWatch service in the AWS Management Console.
2. Click on the `Logs` entry on the sidebar.
3. Find the log group whose retention policy you're interested in changing.
4. Click on the value in the `Expire Events After` column.
5. Enter/Select the number of days you'd like to retain logs in that log group for.

![Screenshot of AWS console showing how to configure the retention period for Cloudwatch logs](../assets/cloudwatch-retention.png)

## Terraform

Cumulus modules create CloudWatch log groups and manage log retention for a subset of lambdas and tasks. These log groups have a default log retention time, but two optional variables can be set to change the retention period for all or specific Cumulus-managed CloudWatch log groups at deployment. For CloudWatch log groups that are not managed by Cumulus modules, the retention period is indefinite (`Never Expire` in AWS). CloudWatch log configurations for all Cumulus lambdas and tasks will be added in a future release.
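
To see which log groups currently have no retention policy, and therefore never expire, the aws-cli can filter on a missing `retentionInDays` value. This is a hedged sketch rather than a Cumulus tool: the function name `list_never_expiring_log_groups` is hypothetical, and the name prefix you pass in is an assumption about how your deployment names its log groups.

```bash
# Hypothetical helper: list log groups under a name prefix that have no
# retention policy set (shown as "Never Expire" in the AWS console).
# Usage: list_never_expiring_log_groups <log-group-name-prefix>
list_never_expiring_log_groups() {
  local name_prefix="$1"
  # not_null() returns null when retentionInDays is absent, so the negated
  # filter keeps only the groups without a retention policy.
  aws logs describe-log-groups \
    --log-group-name-prefix "$name_prefix" \
    --query 'logGroups[?!not_null(retentionInDays)].logGroupName' \
    --output text
}
```

Calling it with a prefix such as `/aws/lambda/` followed by your deployment prefix narrows the output to that deployment's lambda log groups.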

These optional variables can be set during deployment of Cumulus modules to configure
the retention period (in days) of CloudWatch log groups for the lambdas and tasks that the `cumulus`, `cumulus_distribution`, and `cumulus_ecs_service` modules support (using the `cumulus` module as an example):

```tf
module "cumulus" {
  # ... other variables
  default_log_retention_days       = var.default_log_retention_days
  cloudwatch_log_retention_periods = var.cloudwatch_log_retention_periods
}
```

By setting the below variables in `terraform.tfvars` and deploying, the cloudwatch log groups will be instantiated or updated with the new retention value.

### default_log_retention_days

The variable `default_log_retention_days` sets the default log retention for all CloudWatch log groups managed by Cumulus that do not have a custom value configured. If this variable is not set either, the retention defaults to 30 days. For example, if a user would like the log groups of the Cumulus module to have a retention period of one year, deploy the respective modules with the variable set as in the example below.

#### Example

```tf
default_log_retention_days = 365
```

### cloudwatch_log_retention_periods

The retention period (in days) of CloudWatch log groups for specific lambdas and tasks can be set
during deployment using the `cloudwatch_log_retention_periods` Terraform map variable. To
configure these values for the respective CloudWatch log groups, uncomment the `cloudwatch_log_retention_periods` variable and add a retention value for each log group whose retention you want to change. The following keys are supported; each corresponds to a lambda/task name (e.g. the log group "/aws/lambda/prefix-DiscoverPdrs" is configured with the key "DiscoverPdrs"):

- ApiEndpoints
- AsyncOperationEcsLogs
- DiscoverPdrs
- DistributionApiEndpoints
- EcsLogs
- granuleFilesCacheUpdater
- HyraxMetadataUpdates
- ParsePdr
- PostToCmr
- PrivateApiLambda
- publishExecutions
- publishGranules
- publishPdrs
- QueuePdrs
- QueueWorkflow
- replaySqsMessages
- SyncGranule
- UpdateCmrAccessConstraints
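
The naming pattern in the example above can be illustrated with a few lines of shell. This is only a sketch of that pattern: the function name `retention_key_for_log_group` is hypothetical, it assumes lambda log groups are named `/aws/lambda/<prefix>-<Name>`, and it does not apply to keys such as `EcsLogs` that cover several log groups.

```bash
# Hypothetical helper: derive the cloudwatch_log_retention_periods key from a
# lambda log group name, assuming the pattern /aws/lambda/<prefix>-<Name>.
# Usage: retention_key_for_log_group <log-group-name> <deployment-prefix>
retention_key_for_log_group() {
  local log_group="$1" prefix="$2"
  local expected="/aws/lambda/${prefix}-"
  case "$log_group" in
    # Strip the leading "/aws/lambda/<prefix>-" and print what remains.
    "$expected"*) printf '%s\n' "${log_group#"$expected"}" ;;
    # Any other name does not follow the assumed pattern.
    *) return 1 ;;
  esac
}
```

With the log group from the example, `retention_key_for_log_group "/aws/lambda/prefix-DiscoverPdrs" "prefix"` prints `DiscoverPdrs`, which should still be checked against the list of supported keys above.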

:::note

`EcsLogs` is used for the CloudWatch log groups of all `cumulus_ecs_service` tasks.

:::

#### Example

```tf
cloudwatch_log_retention_periods = {
  ParsePdr = 365
}
```

The retention periods are the number of days you'd like to retain the logs in the specified log group for. There is a list of possible values available in the [aws logs documentation](https://docs.aws.amazon.com/cli/latest/reference/logs/put-retention-policy.html).
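
Because only specific values are accepted, it can help to check a value before putting it in `terraform.tfvars` or passing it to the aws-cli. The sketch below hard-codes the values listed in the AWS documentation at the time of writing; the function name `is_valid_retention_days` is hypothetical, and the linked documentation remains the authoritative list.

```bash
# Hypothetical helper: succeed only if the argument is a retention period
# (in days) accepted by CloudWatch Logs. The list is an assumption copied from
# the AWS put-retention-policy documentation and may change over time.
is_valid_retention_days() {
  case "$1" in
    1|3|5|7|14|30|60|90|120|150|180|365|400|545|731|1096|1827|2192|2557|2922|3288|3653)
      return 0 ;;
    *)
      return 1 ;;
  esac
}
```

For example, `is_valid_retention_days 365 && echo "ok"` prints `ok`, while a value such as `31` is rejected.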