Set up a logging service #1019
Comments
Comment in Slack by @mogul
https://logs.fr.cloud.gov/ > Left Blade > Kibana > Discover
Note:
@mogul mind if I get your assistance on this item?
Let's review whether the New Relic agent is in fact picking up logs on its own or not. If not, then we should
For the latter point about shipping logs to an S3 bucket for consumption by the GSA SOCaaS: There's a bullet up here about that which hasn't been broken out yet. Let's consider that out of scope for this particular issue. (When we have time to take it on, that would likely work from this example, though I'd like to use a Terraform module to implement that.)
@asteel-gsa, @akf has just gotten
@mogul totally. So far our New Relic implementation isn't there, pending resolution with NR support, but we can set it up whenever. Pending any backup/restore testing with JMM, or New Relic suddenly working and digging into that task, I should be free whenever to work on this with you.
@asteel-gsa, let me know when you want to consider this issue closed. The ticket has almost nothing up top. Is acceptance "set up NR," or is it "set up NR and ship through
AC would be
Let's leave this in the backlog for now. We do want to do this, but only after we have confirmed all environments are reporting to NR.
@mogul now that we have New Relic configured properly, do you want to find some time to get this implemented?
Sure, how about late this week or Monday the week after next?
Works for me. We can aim for Friday; I'll put some time on the calendar.
Alex and I spent a good chunk of time today sketching out the details, and we've groomed the initial post accordingly. If anyone has questions or concerns about this approach, now's a good time to bring them up, before we break ground!
At a glance
In order to ensure logs appear in places where people can mine and alert on them
as a FAC devops-oriented person
I want cloud.gov app logs and metrics to be shipped to New Relic (for our own alerting purposes) and an S3 bucket (for the GSA SOC to ingest for GSA IT's alerting purposes).
Acceptance Criteria
We use DRY behavior-driven development wherever possible.
Scenario: Logs are flowing to New Relic
Given I am authenticated with New Relic
when I review logs for the gsa-fac app
...
then...
Scenario: Logs are flowing to the S3 bucket
Given I have a service-key for the FAC logs S3 instance
when I look at the content of the S3 bucket
...
then...
Shepherd
Background
cloud.gov doesn't offer alerting capabilities out of the box, so that's why we're going to ship logs off to New Relic, where we can set up alerts.
In addition, OMB directive M-21-31 says that agencies should stovepipe logs into a central agency-wide SOC. So that's why we're going to ship logs to an S3 bucket (that the GSA SOCaaS can pull from).
Security Considerations
Required per CM-4.
We are ensuring that the cg-logshipper app uses the egress proxy to communicate with New Relic, and the egress proxy requires client credentials. We're also ensuring that the cg-logshipper app itself requires client credentials. Connections to brokered S3 buckets are already routed over a cloud.gov internal endpoint. In all hops (app to logshipper, logshipper to egress proxy, logshipper to New Relic, logshipper to S3) the traffic is secured with TLS.
For our initial implementation the cg-logshipper app and S3 bucket will be in the same space as the apps whose logs it is draining. A team member acting as an insider threat could possibly tamper with the logshipper app or the bucket content using their SpaceDeveloper access. However, that's a remote concern. For our initial implementation we're considering that concern out of scope and we're noting mitigation of that concern as a "potential future enhancement" below. (Also note that the logs that go to logs.fr.cloud.gov and New Relic are tamper-resistant and serve as a comparison point for the S3 content in case an insider threat is identified.)
Sketch
We're thinking we'll write a Terraform module that deploys the cg-logshipper app, similar to the existing https-proxy module.
Since we're not all that familiar with the raw output from Cloud Foundry, it may be helpful to look at the cloud.gov ELK configuration to see how they're processing raw output from CF on its way into logs.fr.cloud.gov (where a bunch of fields are parsed out). Here are the ELK (old) and OpenSearch (new) versions of the logs.fr.cloud.gov stack.
Tasks
Potential future enhancements (other stories)
Tasks
For machine identification: We want to have a concrete test that will sieve out lines specifically delivered by the logshipper to verify that everything is working, rather than having to check it's working as a human looking at the UI. In logs.fr.cloud.gov there's a cf_origin:firehose field; we are hoping we can implement something like that for the logshipper in New Relic.
For moving the logshipper app and bucket to another space: This addresses a potential insider threat consideration, so they can't create service bindings and mess with the content of the S3 bucket; only admins (who have direct access to that other space) can do that.
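To make the machine-identification idea a bit more concrete, here is a minimal Python sketch of the kind of check we have in mind. It assumes log lines arrive as JSON objects and that the logshipper stamps each record with a marker attribute. The attribute name `shipped_by` and the value `cg-logshipper` are hypothetical placeholders for illustration; nothing in this issue settles what the real field would be or how the lines would be fetched from New Relic.

```python
import json
from typing import Iterable, Iterator

# Hypothetical marker: the real attribute name and value are not decided yet.
MARKER_FIELD = "shipped_by"
MARKER_VALUE = "cg-logshipper"


def sieve_logshipper_lines(lines: Iterable[str]) -> Iterator[dict]:
    """Yield only the JSON records that carry the logshipper marker."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore lines that are not valid JSON
        # Only JSON objects can carry the marker attribute.
        if isinstance(record, dict) and record.get(MARKER_FIELD) == MARKER_VALUE:
            yield record


def logshipper_is_delivering(lines: Iterable[str]) -> bool:
    """Return True if at least one line was delivered by the logshipper."""
    return any(True for _ in sieve_logshipper_lines(lines))
```

A test could feed this a batch of recently exported log lines and assert that `logshipper_is_delivering` returns True, instead of someone eyeballing the UI.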
Process checklist
Sketch
Definition of Done
Triage
If not likely to be important in the next quarter...
Otherwise...
Design Backlog
Design In Progress
Design Review Needed
Design Done
If no engineering is necessary
Engineering Backlog
Engineering Available
In Progress column
Engineering In Progress
If there's UI...
Engineering Blocked
Engineering Review Needed
Engineering Done