[pkg/ottl] enablement for an unroll function/array expansion #36507
I looked into this a bit and have some questions about the implementation. The processor essentially executes with the logic "for each log record, execute each statement", as opposed to "for each statement, transform each log record". This is completely natural given the way statement sequences are parsed and contexts are built, but it introduces a complication, which I'll explain using an example: Say you have a simple
In my opinion, the most intuitive result for end users would retain the order of items:
However, you could also argue that either of the following would be acceptable:
In any case, once you consider how iteration is currently managed, our hand is forced in some sense. Specifically, we determine the length of the
This means that we effectively MUST use the solution where the original is preserved and new records are appended to the end (see the "AMXBCYZ" solution above). IMO this is pretty ugly in terms of disrupting the intuitive order of records, but there is a tougher problem: Suppose you want to execute a sequence of statements that includes
Because we determine the slice to contain 3 records before any transformations are applied, we will actually get the following result:
What's happened here is that statements before
I think a similar problem may occur when any function changes the length OR order of a slice. I'm curious whether this has been discussed, @evan-bradley, @TylerHelmuth.
One possible solution would be to introduce some notion of "this function modifies the slice", which could be used to isolate such functions into dedicated statement sequences. Having exactly one such statement per sequence ensures that changes to the slice are not interleaved with unrelated transformations, and subsequent statement sequences will execute on the updated number of items. This would work arbitrarily but also allow for grouping of statements which do not modify the slice:
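As a rough illustration of the interleaving problem and of the proposed isolation, the following Python sketch models a statement sequence as a list of functions applied to a shared list of records. It is not taken from the collector's code base: the names `tag`, `unroll`, `run_interleaved`, and `run_isolated`, and the record layout, are all hypothetical. It assumes only what is described above, namely that the slice length is read once before any statement runs and that unrolled records are appended to the end.

```python
from copy import deepcopy

def rec(body):
    # Hypothetical record shape: a body plus an attribute map.
    return {"body": body, "attrs": {}}

def tag(key):
    # Stand-in for an ordinary statement that does not change the slice.
    def stmt(records, i):
        records[i]["attrs"][key] = True
    return stmt

def unroll(records, i):
    # Stand-in for a slice-modifying statement: keep the first element in
    # place and append one copy of the record per remaining element.
    body = records[i]["body"]
    if not isinstance(body, list) or not body:
        return
    records[i]["body"] = body[0]
    for item in body[1:]:
        clone = deepcopy(records[i])
        clone["body"] = item
        records.append(clone)

def run_interleaved(records, statements):
    # "For each record, execute each statement", with the length fixed
    # up front, so appended records are never visited in this pass.
    n = len(records)
    for i in range(n):
        for stmt in statements:
            stmt(records, i)
    return records

def run_isolated(records, sequences):
    # Each sequence is a separate pass, so the length is re-read after
    # a slice-modifying sequence has finished.
    for seq in sequences:
        run_interleaved(records, seq)
    return records

def make_records():
    return [rec("A"), rec(["B", "C"]), rec("D")]

interleaved = run_interleaved(make_records(), [tag("before"), unroll, tag("after")])
isolated = run_isolated(make_records(), [[tag("before")], [unroll], [tag("after")]])

for r in interleaved:
    print("interleaved", r["body"], sorted(r["attrs"]))
for r in isolated:
    print("isolated   ", r["body"], sorted(r["attrs"]))
```

In this model, the interleaved run leaves the appended record with only the attribute set before `unroll`, whereas the isolated run applies both attributes to every record. Record order is the same in both runs, since the sketch always appends; isolation addresses only which statements reach the new records.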
Component(s)
pkg/ottl, processor/transform
Is your feature request related to a problem? Please describe.
The general problem I have is that I have log data that I'd like to transform based on a separator (in my case, \n) within a string as the data is being sent to me. The transformprocessor enables me to split my log on newlines; however, the result is still a single entry whose body is a slice.
What I'd like to be able to do, once I've split, is unroll the resulting array into new log entries.
Example Log Line
<20>Oct 24 15:16:15 schmeler2853 inventore[8729]: We need to reboot the 1080p IB firewall!\n<162>Oct 24 15:16:16 ruecker1023 optio[97]: Navigating the microchip won't do anything, we need to program the multi-byte XML card!
After Split
What I'd like to do next is implement some kind of function that creates new events based on that array, i.e.
- unroll(body)
Result
Describe the solution you'd like
What I'm looking for is some kind of editor function that takes an event and expands the log slice based on each individual value of an array specified within the LogsContext; however, I imagine this could be useful in any of the telemetry contexts. This is sort of the inverse of what the aggregate_on_attributes function does in the metrics context, but for log slices.
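To make the requested behaviour concrete, here is a minimal Python sketch of the split-then-unroll semantics, using the example log line from above. It is an illustration only, not an OTTL implementation: the `split_body` and `unroll` helpers and the dictionary record layout are hypothetical, and it assumes the separator arrives as the two literal characters `\n` rather than a real newline. It also assumes the order-preserving variant, in which each expanded entry takes the place of the original.

```python
from copy import deepcopy

def split_body(record, sep):
    # Mirrors the "split" step: the body becomes a list of strings,
    # but there is still only one log entry.
    out = deepcopy(record)
    out["body"] = out["body"].split(sep)
    return out

def unroll(records, field="body"):
    # Expand every record whose `field` is a list into one record per
    # element, keeping the original position. Records whose field is not
    # a list pass through unchanged. An empty list yields no records in
    # this sketch; a real implementation would need to pick a policy.
    out = []
    for record in records:
        value = record.get(field)
        if isinstance(value, list):
            for item in value:
                clone = deepcopy(record)  # each new entry keeps the attributes
                clone[field] = item
                out.append(clone)
        else:
            out.append(record)
    return out

line = (
    r"<20>Oct 24 15:16:15 schmeler2853 inventore[8729]: "
    r"We need to reboot the 1080p IB firewall!"
    r"\n"
    r"<162>Oct 24 15:16:16 ruecker1023 optio[97]: "
    r"Navigating the microchip won't do anything, "
    r"we need to program the multi-byte XML card!"
)

original = {"body": line, "attrs": {"source": "example"}}
after_split = split_body(original, r"\n")
entries = unroll([after_split])

for entry in entries:
    print(entry["body"])
```

Each resulting entry carries its own copy of the original attributes, which is one possible design choice; whether timestamps, trace context, and other fields should be copied the same way is left open here.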
Describe alternatives you've considered
I've glanced briefly at the transformprocessor directly and think we could maybe just solve it there; however, some reprocessing of log entries still makes me hesitant about whether that's the correct place (#36506). I'm not entirely sure where the best place to implement such a feature is (I've looked briefly at implementing it generically in OTTL and could not think of a good way to avoid re-iterating over the expanded logs with our current OTTL implementation). Ideally I'm looking for some guidance on whether this is something we can/should do with OTTL, or what alternative solution we could use to handle this potential processor problem!
Additional context
Important
I'm not saying there's anything inherently wrong with the implementation of OTTL; I'm just creating this issue to seek some guidance on the recommended way of solving this processing scenario! If we want to solve it generically using the OTTL framework, I was hoping to start identifying any next steps we could take to get an OTTL solution, if that's the correct place to add the desired functionality.