Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CUMULUS-3960: Updated PostToCmr task to be able to republish granules #3906

Merged
merged 18 commits into from
Jan 31, 2025

Conversation

jennyhliu
Copy link
Contributor

@jennyhliu jennyhliu commented Jan 21, 2025

Summary: Summary of changes

Addresses CUMULUS-3960: Update CMR metadata with new collection info

Changes

  • Updated PostToCmr task to be able to republish granules

Testing

  • Verified that PostToCmr can republish granule:
    • Added a workflow with the RepublishTest:
        "RepublishTest": {
        "Parameters": {
          "cma": {
            "event.$": "$",
            "ReplaceConfig": {
              "FullMessage": true
            },
            "task_config": {
              "bucket": "{$.meta.buckets.internal.name}",
              "stack": "{$.meta.stack}",
              "cmr": "{$.meta.cmr}",
              "launchpad": "{$.meta.launchpad}",
              "etags": "{$.meta.file_etags}",
              "concurrency": 10,
              "republish": true
            }
          }
        },
        "Type": "Task",
        "Resource": "${post_to_cmr_task_arn}",
        "Next": "BackupGranulesToLzards",
        "Retry": [
          {
            "ErrorEquals": [
              "Lambda.ServiceException",
              "Lambda.AWSLambdaException",
              "Lambda.SdkClientException"
            ],
            "IntervalSeconds": 2,
            "MaxAttempts": 6,
            "BackoffRate": 2
          }
        ]
      },
    
    • Updated spec/parallel/orca/OrcaBackupAndRecoverySpec.js to ingest a granule only (xdescribe any tests except S3 Ingest Granules, comment out the cleanup code)
    • Got the output cumulus message in s3 from IngestAndPublishGranuleWithOrca workflow execution.
    • Manually updated .cmr.xml in s3 to have collection ShortName MOD09GQ-> MOD09GQ-AZ
    • Ran the workflow with the cumulus message we got from IngestAndPublishGranuleWithOrca with etags removed. We can update etags as well.
    • Verified CMR granule has updated collection ShortName: https://cmr.uat.earthdata.nasa.gov/search/granules.umm_json?provider=CUMULUS&granule_ur=MOD09GQ.A5513498.Nji_76.006.6698559176437
      Note: The cmr concept-id remains the same in my test

PR Checklist

  • Update CHANGELOG
  • Unit tests
  • Ad-hoc testing - Deploy changes and test manually
  • Integration tests

.eslintrc.js Outdated
@@ -62,6 +62,7 @@ module.exports = {
},
globals: {
JSX: true,
AggregateError: false,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to resolve eslint error error 'AggregateError' is not defined. Later versions of eslint have resolved the error.

const results = await pMap(
updatedCMRFiles,
(cmrFile) => publish2CMR(cmrFile, cmrSettings, cmrRevisionId),
{ concurrency }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stopOnError: true is default, if set to false, it will change current lambda behavior

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jonathan suggested to have the promise.allsettled behavior, which is adding stopOnError: true to pMap, like
{ concurrency, stopOnError: true }, but that changes the current behavior of post-to-cmr, unit tests fail, and out of the scope of this ticket.
The comment explains why it's not added.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ohhh ok, that makes sense, thanks for lmk 🙌

@@ -5,6 +5,6 @@
],
"statements": 94.0,
"functions": 88.0,
"branches": 97.0,
"branches": 88.0,
Copy link
Contributor Author

@jennyhliu jennyhliu Jan 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only the handler function is not covered as before, not sure how to improve this.

@jennyhliu jennyhliu marked this pull request as ready for review January 29, 2025 15:10
Copy link
Contributor

@Nnaga1 Nnaga1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some questions/comments I had, one or two small NITs, once answered + some local testing I'll approve 🙌

packages/cmrjs/src/cmr-utils.js Show resolved Hide resolved
@@ -233,6 +243,17 @@ async function publish2CMR(cmrPublishObject, creds, cmrRevisionId) {
throw new Error(`invalid cmrPublishObject passed to publis2CMR ${JSON.stringify(cmrPublishObject)}`);
}

/**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason why this wasn't here before? did we never have the case of having to remove from CMR until now or was there just another way to do it (via API or something)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mainly for my curiousity I guess

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

api package has _removeGranuleFromCmr which does more stuff

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this is like a mini-function of that kinda where it only deletes, sounds good and pretty useful maybe 👍

tasks/post-to-cmr/README.md Show resolved Hide resolved
* @throws {Error} - Error from CMR request
*/
async function removeGranuleFromCmr({ granules, cmrSettings, concurrency }) {
const granulesToUnpublish = granules.filter((granule) => granule.published || !!granule.cmrLink);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its considered published if it doesnt have a granule.cmrLink ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory published granule should have cmrLink, but we should be able to delete the granule from cmr if we don't have the link

tasks/post-to-cmr/index.js Show resolved Hide resolved
cmrClient.CMR.prototype.ingestGranule.restore();
});

await s3PutObject({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get the other calls/functions but what does this s3PutObject do in the case of the test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

postToCMR function reads the cmr metadata file and ingest to CMR.

Copy link
Contributor

@Nnaga1 Nnaga1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing my questions and comments! LGTM 🚀 🎈

@Nnaga1 Nnaga1 removed the in review label Jan 31, 2025
@jennyhliu jennyhliu merged commit 74a0648 into master Jan 31, 2025
3 checks passed
@jennyhliu jennyhliu deleted the jl/CUMULUS-3960 branch January 31, 2025 17:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants