-
Notifications
You must be signed in to change notification settings - Fork 1
ENG-893 AWS Backup Restore Testing Validation & Integrity Design #77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| { | ||
| "recoveryPointArn": "arn:aws:backup:...:recovery-point:...", // optional override | ||
| "expectedKeys": ["path/example1.txt", "path/example2.txt"], // validator-specific | ||
| "expectedMinObjects": 10 // optional fallback | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| { | |
| "recoveryPointArn": "arn:aws:backup:...:recovery-point:...", // optional override | |
| "expectedKeys": ["path/example1.txt", "path/example2.txt"], // validator-specific | |
| "expectedMinObjects": 10 // optional fallback | |
| } | |
| { | |
| "recoveryPointArn": "arn:aws:backup:...:recovery-point:...", // optional override | |
| "validationArgs": { | |
| "expectedKeys": ["path/example1.txt", "path/example2.txt"], // validator-specific | |
| "expectedMinObjects": 10 // optional fallback | |
| } | |
| } |
We should be as blind as possible to what we're passing the validation function, and not let the parameters we hand on to the validator pollute the namespace of the arguments to the orchestrator itself.
| "resourceType": "S3", | ||
| "createdResourceArn": "arn:aws:s3:::restored-bucket", | ||
| "targetBucket": "restored-bucket", | ||
| "s3": { "bucket": "restored-bucket" }, | ||
| "expectedKeys": ["..."], | ||
| "expectedMinObjects": 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following on from here, we want to make the function signature at this level agnostic to the resources that have been restored. So we could do something like this:
| "resourceType": "S3", | |
| "createdResourceArn": "arn:aws:s3:::restored-bucket", | |
| "targetBucket": "restored-bucket", | |
| "s3": { "bucket": "restored-bucket" }, | |
| "expectedKeys": ["..."], | |
| "expectedMinObjects": 10 | |
| "args": { // validationArgs from above, verbatim | |
| "expectedKeys": ["..."], | |
| "expectedMinObjects": 10 | |
| }, | |
| "resources": [ | |
| { | |
| "resourceType": "S3", | |
| "createdResourceArn": "arn:aws:s3:::restored-bucket", | |
| "targetBucket": "restored-bucket", | |
| "s3": { "bucket": "restored-bucket" }, | |
| } | |
| ] |
Although that might be more than we need - can we get away with this?
| "resourceType": "S3", | |
| "createdResourceArn": "arn:aws:s3:::restored-bucket", | |
| "targetBucket": "restored-bucket", | |
| "s3": { "bucket": "restored-bucket" }, | |
| "expectedKeys": ["..."], | |
| "expectedMinObjects": 10 | |
| "args": { // validationArgs from above, verbatim | |
| "expectedKeys": ["..."], | |
| "expectedMinObjects": 10 | |
| }, | |
| "resources": [ | |
| "arn:aws:s3:::restored-bucket" | |
| ] |
That is, is it feasible for us to only pass in the ARN of the retored resource?
| Input delivered to customer Lambda (superset of invocation + restore context): | ||
|
|
||
| ```json | ||
| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| { | |
| { | |
| "version": "1.0.0", |
Just to give ourselves some wiggle room here, let's include a version number so we can evolve the contract without too much hair-pulling.
|
|
||
| - Orchestrator policy limited to listing recovery points, starting & describing restore jobs, publishing validation, invoking a single validator ARN. | ||
| - Validator policy scoped to specific target bucket ARNs (S3 example). | ||
| - Sensitive data avoidance: orchestrator does not log object contents, only metadata. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might, possibly, include sensitive data. People could be putting anything in filenames, especially on S3 buckets where we're receiving files from third parties. I think we do need to call that out.
|
|
||
| ## 8. Extensibility | ||
|
|
||
| - Add multi-resource batch mode via Step Functions if needed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, see #76 (comment). We need to be able to support validation across more than one resource type within the same validator. We do still want batching to support splitting the validation up on semantic or performance lines, but that's not for multi-resource support.
| "resourceType": "S3", | ||
| "createdResourceArn": "arn:aws:s3:::restored-bucket", | ||
| "targetBucket": "restored-bucket", | ||
| "s3": { "bucket": "restored-bucket" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above
| resource_type = "S3" | ||
| validation_lambda_arn = aws_lambda_function.customer_validator.arn | ||
| target_bucket_name = var.target_restore_bucket |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above
|
|
||
| ## Operational Notes | ||
|
|
||
| - Timeouts: Orchestrator Lambda default timeout is 15 minutes; long restores will exceed this—use small test datasets or adapt to Step Functions if needed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That adaptation isn't something users need to worry about. We'll pick that up as and when it's needed in the blueprint itself.
|
|
||
| - Timeouts: Orchestrator Lambda default timeout is 15 minutes; long restores will exceed this—use small test datasets or adapt to Step Functions if needed. | ||
| - Costs: Avoid listing millions of S3 keys in the validator; prefer sampling. | ||
| - IAM Hardening: Current policy uses broad `backup:*` subset and `s3:Get*`; tighten to specific ARNs in production. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again I think we need to do some tightening ourselves here. The user is just importing the module, not modifying its contents.
| variable "target_bucket_name" { | ||
| type = string | ||
| description = "For S3 restores: name of the destination S3 bucket that the restore will produce or populate. Used only in the example orchestrator logic." | ||
| default = null | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above. Plus we don't want specifics for the example in the module interface.
Description
AWS Backup Restore Testing Validation & Integrity Design
Type of changes
Checklist
Sensitive Information Declaration
To ensure the utmost confidentiality and protect your and others privacy, we kindly ask you to NOT including PII (Personal Identifiable Information) / PID (Personal Identifiable Data) or any other sensitive data in this PR (Pull Request) and the codebase changes. We will remove any PR that do contain any sensitive information. We really appreciate your cooperation in this matter.