Automate resource auditing for account #2148

Merged
merged 5 commits into from
Mar 28, 2024

@berryd berryd commented Mar 25, 2024

Description

The purpose of this PR is to prototype a scheduled job that produces an automated report. The report is scheduled to run once a week, on Monday morning, and creates a downloadable zip file containing four distinct CSV files. The job also posts JSON output for each of the four reports in the build log, so the results can be viewed without downloading and opening the spreadsheets.
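As a rough illustration of the output described above (one zip holding a CSV per report, plus a JSON copy of each report in the log), here is a minimal Python sketch. It is not the script in this PR: `build_report_zip`, the `arn` and `stage` columns, and the file names are hypothetical, chosen only for the example.

```python
import csv
import io
import json
import zipfile
from typing import Dict, List


def build_report_zip(reports: Dict[str, List[Dict[str, str]]]) -> bytes:
    """Write each report as <name>.csv into one in-memory zip and log it as JSON.

    The column names below are assumptions for illustration only.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, rows in reports.items():
            text = io.StringIO()
            writer = csv.DictWriter(text, fieldnames=["arn", "stage"])
            writer.writeheader()
            writer.writerows(rows)
            archive.writestr(f"{name}.csv", text.getvalue())
            # Mirror the report in the build log so nobody has to open the CSV.
            print(json.dumps({name: rows}))
    return buffer.getvalue()
```

A caller would pass one entry per report, for example `build_report_zip({"ci_active": [...], "ci_inactive": [...], "cf_other": [...], "untagged": [...]})`, and upload the returned bytes as the build artifact.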

Under the hood, the report scans the output of the tagging API, which returns all active resources. Two other approaches were investigated; the results of that work can be reviewed in the comments on this ticket. The tagging API is a relatively new feature that allows querying and applying tags to resources.
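For readers unfamiliar with the scan, the sketch below shows one way it could look in Python. It assumes that "the tagging API" refers to the AWS Resource Groups Tagging API, reached through boto3's `resourcegroupstaggingapi` client and its `get_resources` paginator. That is an assumption, and the PR's own script is bash. The helper names `iter_tag_mappings` and `fetch_pages` are hypothetical.

```python
from typing import Any, Dict, Iterable, Iterator


def iter_tag_mappings(pages: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Flatten paginated get_resources responses into one record per resource."""
    for page in pages:
        for mapping in page.get("ResourceTagMappingList", []):
            yield {
                "arn": mapping["ResourceARN"],
                # Tags arrive as [{"Key": ..., "Value": ...}]; a dict is easier to query.
                "tags": {t["Key"]: t["Value"] for t in mapping.get("Tags", [])},
            }


def fetch_pages(region_name: str = None):
    """Return an iterable of response pages (needs boto3 and AWS credentials)."""
    import boto3  # third-party; imported lazily so the helper above stays stdlib-only

    client = boto3.client("resourcegroupstaggingapi", region_name=region_name)
    return client.get_paginator("get_resources").paginate()
```

With credentials configured, `list(iter_tag_mappings(fetch_pages()))` would give the flat list of resources that the later comparison steps work from.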

In our application stack, our CI pipeline applies a tag, {"Key":"STAGE","Value":"[stage_name]"}, to every resource directly created by the generated CloudFormation template. Some child resources, such as rules, do not inherit this tag and will therefore appear in the untagged report. What this audit effectively does is compare tags against branch names, using the derived branch name for Snyk- and Dependabot-generated stages. There are four categories of reports for this resource audit:

CI Active

These resources were created by our CI pipeline, and have an active branch in the repository.

CI Inactive

These resources were created by our CI pipeline, but there is no matching branch. This is likely infrastructure that was not cleaned up during a destroy action, but in any case, this is orphaned infrastructure.

CF Other

These resources have tags and are assumed to have been created by a CloudFormation template, but they do not have the STAGE tag and therefore were not created by our CI pipeline. These were likely created by CMS Cloud Engineering teams. It's possible to manually tag a resource, and those resources would also appear in this report.

Untagged

These resources have no tags. They could have been manually created, or they could be child resources that simply don't inherit tags.
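The four categories above can be read as a small decision procedure. The Python sketch below is only an illustration, not the bash script in this PR. `classify_resource` and `active_stages` are hypothetical names, and it assumes branch names have already been mapped to stage names (including whatever derivation is used for Snyk and Dependabot branches).

```python
from typing import Dict, Iterable, Set

STAGE_KEY = "STAGE"  # tag key the CI pipeline applies, per the description


def classify_resource(tags: Iterable[Dict[str, str]], active_stages: Set[str]) -> str:
    """Return the report a resource belongs to (hypothetical helper).

    tags: list of {"Key": ..., "Value": ...} pairs for one resource.
    active_stages: stage names derived from branches that still exist.
    """
    tag_map = {t["Key"]: t["Value"] for t in tags}
    if not tag_map:
        return "Untagged"      # no tags at all
    if STAGE_KEY not in tag_map:
        return "CF Other"      # tagged, but not by our CI pipeline
    if tag_map[STAGE_KEY] in active_stages:
        return "CI Active"     # created by CI, branch still exists
    return "CI Inactive"       # created by CI, branch is gone: orphaned
```

For example, with `active_stages = {"main"}`, a resource tagged `STAGE=main` lands in CI Active, while one tagged `STAGE=old-branch` lands in CI Inactive.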

Related ticket(s)

CMDCT-3410


How to test

Workflow (note that this will vanish after this PR is merged)

Download the artifacts from the above link, and verify the contents.

Important updates

At some point in the future we may want to automate reporting of certain items produced by these reports, such as active stages (did you push a branch and forget about it?), and we could also amend this script to check for open PRs against branches. We could possibly introduce a culture change where we expect engineers to create draft PRs for any active branch and use this to report on forgotten work, but that's not something I'm necessarily advocating for at this time. At a minimum, this should improve observability of orphaned and neglected resources in any account this automated workflow operates against.


Author checklist

  • I have performed a self-review of my code
  • I have added thorough tests, if necessary
  • I have updated relevant documentation, if necessary


@berryd berryd added the "ready for review" label Mar 26, 2024
@berryd berryd marked this pull request as ready for review March 26, 2024 16:48
gmrabian previously approved these changes Mar 27, 2024
@gmrabian gmrabian left a comment

I'm not strong enough in bash to know if this works as intended but I do like the idea and see no harm in trying it

berryd commented Mar 28, 2024

> I'm not strong enough in bash to know if this works as intended but I do like the idea and see no harm in trying it

It really doesn't affect an app in any way other than providing some observability. We still have to go hunt for the report, but I'm hoping that once this is proven, we can use it to start automating notifications. One thing that is apparent is that our destroy scripts are not very reliable and we're leaking a lot of infrastructure.

@berryd berryd closed this Mar 28, 2024
@berryd berryd reopened this Mar 28, 2024

codeclimate bot commented Mar 28, 2024

Code Climate has analyzed commit 8b66f06 and detected 0 issues on this pull request.

The test coverage on the diff in this pull request is 100.0% (90% is the threshold).

This pull request will bring the total coverage in the repository to 73.4%.

View more on Code Climate.
