Sample code to discover financial data in S3 using Amazon Macie and automatically apply lifecycle policies

aws-samples/aws-financial-data-discovery-samples

Automate the archival and deletion of sensitive financial data using Amazon Macie

Table of contents

  1. Introduction
  2. Architecture
  3. Prerequisites
  4. Tools and services
  5. Usage
  6. Cost Estimate
  7. Clean up
  8. Reference
  9. Security
  10. License

Introduction

This project provides an example of using Amazon Macie to discover sensitive financial data stored in an Amazon S3 bucket. S3 objects with findings are automatically tagged, and an S3 bucket lifecycle policy is applied to transition those objects to Amazon Glacier.

Architecture

[Figure: architecture diagram]

  1. An Amazon Macie job scans an Amazon S3 bucket for objects containing sensitive financial information (credit card numbers, account numbers, etc.).
  2. An Amazon EventBridge rule captures the Amazon Macie findings.
  3. Amazon EventBridge sends the findings to an Amazon Kinesis Data Firehose delivery stream.
  4. Amazon Kinesis Data Firehose batches the findings and writes them to an Amazon S3 results bucket.
  5. An Amazon S3 event notification triggers an AWS Lambda function when new results arrive in the bucket.
  6. The AWS Lambda function adds the Macie finding severity to the S3 object as a new tag. The function also updates the bucket lifecycle policy to automatically transition the object to Amazon Glacier after a configurable number of days.
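
Steps 5 and 6 can be sketched roughly as follows. This is an illustrative sketch, not the function shipped in this repository. It assumes the results file is uncompressed text holding one or more JSON events written back to back, that each event carries the Macie finding under `detail` with `severity.description`, `resourcesAffected.s3Bucket.name` and `resourcesAffected.s3Object.key`, and that the threshold means "this severity or higher". The helper names (`iter_json_objects`, `tags_for_findings`, `apply_tags`) are hypothetical.

```python
import json

# Assumed ordering of Macie severity labels, lowest to highest.
SEVERITY_ORDER = {"Low": 1, "Medium": 2, "High": 3}


def iter_json_objects(text):
    """Yield each JSON value from a string of back-to-back or
    whitespace-separated JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def tags_for_findings(batch_text, tag_key="Severity", threshold="High"):
    """Return (bucket, key, tag) tuples for findings at or above the threshold."""
    results = []
    for event in iter_json_objects(batch_text):
        detail = event.get("detail", {})
        severity = detail.get("severity", {}).get("description")
        if SEVERITY_ORDER.get(severity, 0) < SEVERITY_ORDER[threshold]:
            continue
        affected = detail.get("resourcesAffected", {})
        bucket = affected.get("s3Bucket", {}).get("name")
        key = affected.get("s3Object", {}).get("key")
        if bucket and key:
            results.append((bucket, key, {"Key": tag_key, "Value": severity}))
    return results


def apply_tags(s3_client, tagged):
    """Tag each object. s3_client is expected to be a boto3 S3 client.

    put_object_tagging replaces the object's whole tag set, so any
    existing tags would need to be merged in first if they must survive.
    """
    for bucket, key, tag in tagged:
        s3_client.put_object_tagging(
            Bucket=bucket, Key=key, Tagging={"TagSet": [tag]}
        )
```

Passing the client in as an argument keeps the parsing logic testable without AWS credentials; in a Lambda handler the client would typically come from `boto3.client("s3")`.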

Prerequisites

Tools and services

  • AWS SAM - The AWS Serverless Application Model (SAM) is an open-source framework for building serverless applications. It provides shorthand syntax to express functions, APIs, databases, and event source mappings.
  • AWS Lambda - AWS Lambda is a serverless compute service that lets you run code without provisioning or managing servers, creating workload-aware cluster scaling logic, maintaining event integrations, or managing runtimes.
  • Amazon Macie - Amazon Macie is a fully managed data security and data privacy service that uses machine learning and pattern matching to discover and protect your sensitive data in AWS.
  • Amazon Kinesis Data Firehose - Amazon Kinesis Data Firehose is a fully managed service for reliably loading streaming data into data lakes, data stores, and analytics services.
  • Amazon EventBridge - Amazon EventBridge is a serverless event bus service that you can use to connect your applications with data from a variety of sources.

Usage

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| TagKey | String | Severity | Tag key to use when tagging S3 objects with the finding severity |
| SeverityThreshold | String | High | Scoring threshold to tag S3 objects |
| SourceBucketName | String | None | Optional S3 bucket containing potentially sensitive content (if not provided, a bucket will be created) |
| SourceBucketRetention | Number | 0 | Default object retention (in days) when a source bucket is created. Set to zero to disable. |
| GlacierTransitionInDays | Number | 365 | Number of days until objects are transitioned to Glacier |
| ExpireObjectsInDays | Number | 1825 | Number of days until objects permanently expire |
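
To show how these parameters could map onto an S3 lifecycle rule, here is a minimal sketch. It is not the rule this project actually generates: the rule ID format, the tag filter on the `TagKey`/severity pair, and the helper names (`build_lifecycle_rule`, `merge_rule`, `put_rules`) are assumptions made for illustration. The dictionary shape follows the boto3 `put_bucket_lifecycle_configuration` request.

```python
def build_lifecycle_rule(tag_key="Severity", tag_value="High",
                         glacier_days=365, expire_days=1825):
    """Build one lifecycle rule that applies only to objects carrying the tag."""
    return {
        "ID": f"archive-{tag_key}-{tag_value}",  # hypothetical ID scheme
        "Filter": {"Tag": {"Key": tag_key, "Value": tag_value}},
        "Status": "Enabled",
        # GlacierTransitionInDays
        "Transitions": [{"Days": glacier_days, "StorageClass": "GLACIER"}],
        # ExpireObjectsInDays
        "Expiration": {"Days": expire_days},
    }


def merge_rule(existing_rules, new_rule):
    """Return existing rules with any rule of the same ID replaced by new_rule."""
    kept = [r for r in existing_rules if r.get("ID") != new_rule["ID"]]
    return kept + [new_rule]


def put_rules(s3_client, bucket, rules):
    """Write the full rule list. s3_client is expected to be a boto3 S3 client.

    The call replaces the bucket's entire lifecycle configuration, which is
    why rules already on the bucket are merged in before calling it.
    """
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration={"Rules": rules}
    )
```

With the defaults, a tagged object would move to Glacier after 365 days and expire after 1825 days (about five years), while untagged objects in the same bucket are left alone.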

Installation

git clone https://github.com/aws-samples/aws-financial-data-discovery-samples
cd aws-financial-data-discovery-samples
sam build
sam deploy --guided

Cost Estimate

Please refer to the Amazon Macie Pricing page for details.

Clean up

Deleting the CloudFormation Stack will remove the Lambda functions, Kinesis Data Firehose and EventBridge rule. Ensure the S3 buckets are empty before attempting to remove them.
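
One possible way to empty a bucket before removing it is sketched below. It assumes boto3 is installed and credentials are configured; the function names are hypothetical, and the bucket name is whatever your deployment created.

```python
def empty_bucket(bucket):
    """Delete every object version and delete marker in the bucket.

    `bucket` is expected to be a boto3 S3 Bucket resource.
    This permanently removes the data.
    """
    bucket.object_versions.delete()


def empty_bucket_by_name(bucket_name):
    """Convenience wrapper that builds the Bucket resource from a name."""
    import boto3  # imported lazily so empty_bucket can be exercised with a stub

    empty_bucket(boto3.resource("s3").Bucket(bucket_name))
```

For example, `empty_bucket_by_name("my-results-bucket")` would clear a bucket with that (placeholder) name.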

Reference

This solution is inspired by a post on the AWS Big Data Blog.

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.
