feat(cu): introduce an EventVacuum that parses well-formatted event logs for transport to other services #1017

arielmelendez · 2024-09-18T19:12:08Z

Motivation:

Processes are essentially applications, and applications need various forms of observability tools - where "Observability" can be defined as "the ability to answer novel, open-ended questions about a system". The AO team is continuing to develop solutions for monitoring generic performance metrics for Processes, but a gap currently exists in the ability to measure richer contextual information from the internals of a Process.

Message handling in Processes is similar in many ways to handling HTTP requests on a server. A great way to get observability over a system like that is to use wide "Structured Events" that are rich with relevant information about the inner workings of the Process, information that you wouldn't be able to get from Process inputs (e.g. what is searchable and aggregatable from GQL data) or from generic performance metrics. For more background reading on this approach and its benefits, see:
https://charity.wtf/2022/08/15/live-your-best-life-with-structured-events/
and
https://docs.honeycomb.io/get-started/basics/observability/concepts/events-metrics-logs/

The challenge with extracting this type of information from Processes is that they run in a sandbox environment without access to a network or file system that can be connected to the outside world. Existing intra-AO-Process solutions such as AO subscribables therefore don't quite fit this model, and they would also require gas for the messaging needed to facilitate it. However, AO CUs have direct access to Process memory and outputs, including Process log streams. As such, log streams can be used as a transport mechanism to shuttle observability data out of AO and to the outside world.

Technical Contributions

This pull request introduces:

An EventVacuum class that:
- is opt-in via environment variable settings
- parses newline-delimited JSON (ndjson) events out of Process log streams, picks out the events that contain a _e: 1 key/value flag, and sends them off to a transport layer
A set of event transport implementations:
- CompositeTransport: takes a list of transports and fans the events out to each of them
- ConsoleTransport: prints events out to the CU's logger
- HoneycombTransport:
  - sends structured events to Honeycomb for analysis
  - provides a sqlite database integration to avoid sending duplicate events to Honeycomb in subsequent runs of the CU
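
The pieces listed above can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the code in this pull request: every function and class name other than CompositeTransport, the send_events method name, the process_id and nonce event fields, and the layout of the nonces table are hypothetical. It shows three ideas: filtering flagged ndjson lines out of a log stream, fanning a batch out to several transports, and using a sqlite table to skip events whose nonce has already been forwarded.

```python
import json
import sqlite3


def vacuum_events(log_stream: str) -> list[dict]:
    """Return the JSON objects in a log stream that carry the `_e: 1` flag."""
    events = []
    for line in log_stream.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # ordinary log output, not an event candidate
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # looks like JSON but is malformed, so skip it
        if isinstance(record, dict) and record.get("_e") == 1:
            events.append(record)
    return events


class ListTransport:
    """Hypothetical sink that only collects events (stands in for a real service)."""

    def __init__(self):
        self.sent = []

    def send_events(self, events):
        self.sent.extend(events)


class CompositeTransport:
    """Fans each batch of events out to every wrapped transport."""

    def __init__(self, transports):
        self.transports = list(transports)

    def send_events(self, events):
        for transport in self.transports:
            transport.send_events(events)


class DedupingTransport:
    """Forwards only events whose nonce exceeds the largest nonce on record.

    Assumption: each event carries `process_id` and `nonce` fields.
    """

    def __init__(self, inner, db: sqlite3.Connection):
        self.inner = inner
        self.db = db
        db.execute(
            "CREATE TABLE IF NOT EXISTS nonces "
            "(process_id TEXT PRIMARY KEY, nonce INTEGER)"
        )

    def send_events(self, events):
        fresh, largest = [], {}
        for event in events:
            pid, nonce = event["process_id"], event["nonce"]
            row = self.db.execute(
                "SELECT nonce FROM nonces WHERE process_id = ?", (pid,)
            ).fetchone()
            if row is None or nonce > row[0]:
                fresh.append(event)
                largest[pid] = max(nonce, largest.get(pid, nonce))
        if fresh:
            self.inner.send_events(fresh)
        # Record the new high-water marks only after the batch was forwarded.
        for pid, nonce in largest.items():
            self.db.execute(
                "INSERT OR REPLACE INTO nonces (process_id, nonce) VALUES (?, ?)",
                (pid, nonce),
            )
        self.db.commit()
```

In this sketch the stored nonce is compared against the whole batch before it is updated, so several events emitted under the same nonce in one run are all forwarded, while a later run that replays the same log stream forwards nothing.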

Results From Preliminary Testing

I created a utility module to produce and print compliant ndjson events and instrumented a new AO token via the token.lua blueprint with it. You can find the code for those here: permaweb/aos#350
Preliminary test results using the Honeycomb Transport have been great. Here are some examples of what you can do with the integration:

List the errors that have been raised during processing, grouping by nonce, sender, Action, and error reason:

Aggregate the total value of the token that has been transferred in the last 48 hours:

Surface internal analytics for how many times each specific handler has been successfully triggered on the Process:

... and that's just the start of what's possible.

I strongly believe that when other builders see that this kind of open-ended introspection is possible with these kinds of tools, they will want to give it a try! I'm also open to discussing other means of achieving this form of event transportation in AO.

Ariel Melendez and others added 15 commits September 13, 2024 15:08

- 9750984 feat(cu): add event vacuum to eval to siphon well-formatted event logs to an event transport PE-6705
- c6ece26 feat(cu): add a Honeycomb event log transport that dedups events by nonce in local sqlite db PE-6706
- a816fab refactor(cu): move tarnsports into their own file PE-6706
- d53ef4b feat(cu): make event vacuuming opt in via env var PE-6706
- 9bdd533 feat(cu): memoize largest nonces PE-6706
- 45fbc81 feat(cu): use winston logger in transport classes PE-6706
- 0f2f6ac feat(cu): memoize frequently used prepared sqlite statement PE-6706
- 346095d chore(cu): logs cleanup PE-6706
- a837df5 chore(cu): use destructured object constructor for HoneycombTransport class PE-6706
- 97eec20 feat(cu): normalize timestamps to ISO strings PE-6706
- f2d950c Merge pull request #6 from ar-io/PE-6706_honeycomb_transport
  - PE-6706: honeycomb event transport with sqlite-managed nonce deduping
- 78afb00 Merge pull request #5 from ar-io/event_vacuum
  - feat(cu): add event vacuum to eval to siphon well-formatted event logs to an event transport PE-6705
- 66e62a6 feat(cu): add a kinesis transport PE-6818
- cd5e262 feat(cu): set largest nonces correctly PE-6818
- cec7723 Merge pull request #7 from ar-io/PE-6818_kinesis_transport
  - feat(cu): add a kinesis transport PE-6816
