-
Notifications
You must be signed in to change notification settings - Fork 412
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Turn evt def instantiable #3293
Turn evt def instantiable #3293
Conversation
Still needs some changes (like GetParams() being called multiple times for the same slice, forgot to change) but I think it can go through a first review round. Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stopping the review at this point since I believe all my next comments would be similar.
Main point I have is to consider if we actually need a lock on a lot of the fields we added them for, instead of just making them immutable by design.
@@ -24,6 +24,14 @@ import ( | |||
) | |||
|
|||
func GetTraceeRunner(c *cobra.Command, version string) (cmd.Runner, error) { | |||
// Initialize event definitions | |||
|
|||
// events.Definitions = events.NewEventGroup() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
leftover?
|
||
type Dependencies struct { | ||
events map[ID]struct{} // map[eventID]struct{} | ||
probes map[probes.Handle]*Probe // map[handle]*ProbeDependency |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it actually necessary for the map values to be pointers? Would map[probes.Handle]Probe
not work as well?
nit: ProbeSettings as a name, just since the Probe
is already a type of the probes
package, this struct is more about loading settings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About being a pointer: the "probe dependency" has a "required/not_required" bool (with set/unset/read methods) that needs to be atomic (and singular) among all execution threads.
About the need for an instance: is "required/not_required" something we would change runtime? I believe it would when adding/removing policies).
About the naming, unfortunately there is a similar problem to all "dep_XXX" types:
Event: is actually EventDefinition and not "an event" (confuses with Tracee Event). Probe is ProbeDependency. KSymbol is a KSymbolDependency. Capabilities are CapabilitiesNeeded.
So, at least for now, I kept:
- event_group
- event
- type_XXX (for types used in dependencies)
- dep_YYY (for methods of types used in dependencies)
But I'm opened to suggestions (so we create a pattern for all of them together, not a single one).
Update: check #3293 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About the need for an instance: is "required/not_required" something we would change runtime? I believe it would when adding/removing policies).
I don't think there is such a need. When writing the event, the author decides if a probe is required or not for the event to function correctly (or at least partially work). This information about the event should be something that can be changed in runtime
eventsLock *sync.RWMutex | ||
probesLock *sync.RWMutex | ||
kSymbolsLock *sync.RWMutex | ||
tailCallsLock *sync.RWMutex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC, tails calls are the only dependency we modify in runtime (which I used as a bit of a hack when we did syscall optimization a while back). I think it's not very ideal in the first place that we can modify dependencies.
IMO instead of adding locks, the best solution is to just leave what shouldn't be mutable as immutable. I think for all the above (and we should later do the same for tails calls), there's no need to lock, the solution should be not allowing any mutation after initial setting. So the setters
should be removed and only Get
methods should be left.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree BUT I don't know what will be mutable or not (yet) until the API server is designed and the extensions feature is finished. My syn here is to allow setters/getters for most of the fields and I'm not against disallowing that after the API server (and other features) are done. I needed something to start with and chose to allow everything first and reduce as appropriate (instead of the opposite).
update: check last comment on this thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About tailcalls being mutable or not: I can think of multiple situations where adding/removing indexes would be done during runtime (during a reconfiguration of policies and traced events).
And things such as (which I think you are referring):
become natural IF/WHEN we follow the encapsulation design pattern (which I think we should have done from the beginning as it allows us to protect data from inside the interface). Keeping local (from caller) slices/maps of info for definition updates brough us significant parallelism problems.
Like I said, until our reconfiguration capabilities is finished we don't know what should be instantiable and what shouldn't.
update: check last comment on this thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is also something else I just thought.. by having event instances we can now move things out of Tracee singleton, such as eventConfig. The "submit/emit" fields are global and should be atomic and concurrently updated if/when needed.
Your "hack" for the tailcalls was also because we didn't have the "dynamic nature" of a "event definition" well defined. In your case, we started having instances in the definition, in the eventConfig case we started keeping track of things externally to the interface.
If you note, we had a "mix" of mutable vs immutable fields because we thought the definition of event would be "always static". By making it not "static" we can encapsulate the interface, better control concurrency and have a single point of view for atomic settings.
The event definition would become (and is, already, partially is) "event definition and status". WDYT ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the same like Nadav. In my view, event definition should only reflect "pure definition" of the event, and be immutable. Only the fields that we consider "status" (not sure that we have any of these today) and perhaps fields that can potentially be user configuable in the future (e.g. metadata fields of the event) should have protection, and then we should consider moving them into another structure (which will be protected, of course).
I think that doing such a seperation between event definition, which is immutable, and fields that can be dynamically changed in a different struct will be a better abstraction, cause less potential concurrency and performance issues, and also let us postpone the decision about which fields should be dynamic.
With that said, I also think it is a good thing to make all the event definition fields private, and let the user access them only with getters like you did.
// EventGroup is a struct describing a collection of events. | ||
type EventGroup struct { | ||
events map[ID]*Event | ||
mutex *sync.RWMutex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a thought: Is there any reason that events should be added post tracee loading? We could potentially run lockless after starting tracee by not allowing any event definition loading after a certain point.
Suggesting this because we have some Getters used in the pipeline itself so they are hotpaths running RLock().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do share your concern on locking contention, but, at the same time, we first need to make sure we protect what needs protection.
I'm not against removing the locks but a RLock is literally translated into a single atomic read (if there wasn't a write the status is even cached).
About adding events during runtime: You started tracee with core events, now you will enable network events, and later you will enable plugin01 events. Or, you just loaded new signature (which are events now) runtime, etc, etc.
I cannot map everything we want dynamic (or not) without first implementing it and the only way I found to be achievable was by making sure we follow concurrency patterns in an adequate way first.
Update: also related: #3293 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just spoke to @geyslan about this. I'll provide a full study on lock contention for this change so there is no doubt about being safe to have or not (and that will allow me to continue with the rest without regrets if we can perform with "no significant, or none, impact".
id32Bit *atomic.Uint32 | ||
name string | ||
docPath string | ||
strMutex *sync.RWMutex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
strMutex
as in for name and docpath? Not sure these should be mutable at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto (about now knowing what should or shouldn't be changed runtime). There are things that CLEARLY shouldn't: ksymbol name, it is literally the "handle" that describes a ksymbol dependency and won't change at all. But many of other fields are "debatable".
dependencies *atomic.Pointer[Dependencies] | ||
sets map[string]struct{} | ||
setsMutex *sync.RWMutex | ||
params []trace.ArgMeta |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's take the opportunity to rename these to args
(or whatever was decided in the new event format). As this is also a definition, not sure this should be dynamic (should new argument definitions be appendable post definition?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. I didn't think event arguments would be added/removed runtime for this work, it goes together with many other previous comment of first guaranteeing a stable interface for concurrency and later using it and refining.
Yep, I think you are right and I do have that in my mind. Unfortunately we don't know what will be mutable or not (yet ?) so I thought about making most of them capable of being set/got (and we can always make them immutable later). This is not a big change (to allow/disallow them to mutate). As long as they're part of the "instance" definition (so what is mutable is always on the same state during the program execution). |
Make sure all types are thread-safe (for dynamic changes from API server).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice change! I like the encapsulation done for the different structs.
Some comments follow
expectedEvents: []events.Event{ | ||
events.NewEventDefinition("fake_event_0", []string{"signatures", "default"}, []events.ID{events.HookedSyscalls}), | ||
expectedEvents: []*events.Event{ | ||
events.NewEvent( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using NewEvent
instead of NewEventDefinition
may be misleading.
We don't create a new event here ("sourcing" the event) but defining a new event.
If you think definition is not the correct word to use then we can change (e.g. to NewEventConfiguration
or similar) but I think NewEvent
by itself is not enough
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree and was expecting the definitions to be discussed and done. EventDefinition is good enough for me, I was just worried because we use event := NewEvtDefinition() in many places, its misleading there as well (to understand if its a definition or a Tracee.Event).
I'll try to change this to make it more clear.
events.ID(6001), // id, | ||
events.Sys32Undefined, // id32 | ||
"fake_event_0", // eventName | ||
"", // docPath | ||
false, // internal | ||
false, // syscall | ||
[]string{"signatures", "default"}, // sets | ||
events.NewDependencies( | ||
[]events.ID{events.HookedSyscalls}, // ids | ||
nil, // probes | ||
nil, // ksyms | ||
nil, // tailcalls | ||
nil, // capabilities | ||
), | ||
[]trace.ArgMeta{}, | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be better to pass this as a struct and not with arguments to a function.
This way, you can also used named parameters, e.g.:
eventName: fake_event_0
events.NewEventDefinition("fake_event_1", []string{"signatures", "default"}, []events.ID{events.Ptrace}), | ||
events.NewEventDefinition("fake_event_2", []string{"signatures", "default"}, []events.ID{events.SecurityFileOpen, events.SecurityInodeRename}), | ||
expectedEvents: []*events.Event{ | ||
events.NewEvent( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't find the definition of NewEvent in this commit. Is it a new function? If so, it needs to be defined in this commit and not in the next one (or maybe this commit should come after the other one?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commits were inverted due to rebasing needs (likely). I can revert the commits and make tests come after.
var err error | ||
var output flags.PrepareOutputResult | ||
|
||
output, err = flags.TraceeEbpfPrepareOutput(c.StringSlice("output"), false) | ||
output, err := flags.TraceeEbpfPrepareOutput(c.StringSlice("output"), false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can remove the declaration of the output variable as well
eDependencies := evtDef.GetDependencies() | ||
if eDependencies == nil { | ||
return errfmt.Errorf("event id %d has nil dependencies", eventId) | ||
} | ||
evtDepEvtsIDs := eDependencies.GetEvents() | ||
if evtDepEvtsIDs == nil { | ||
return errfmt.Errorf("event id %d has nil event dependencies", eventId) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it valid that event's dependencies will be nil?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that all dep_xxx.go files can go into dep.go.
This split creates too many files IMO which are not really necessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had same doubt. It got big, but I guess its ok since its all dependencies.
|
||
func makeDeriveBase(eventID events.ID) deriveBase { | ||
def := getEventDefinition(eventID) | ||
if events.Core == nil { | ||
fmt.Printf("event definitions not found\n") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use the logger instead of printf.
This comment applies to all places where printf is used in this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, forgot about this (to convert them). Will do for sure.
// | ||
|
||
// Event is a struct describing an event configuration. | ||
type Event struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider renaming to EventDefinition
id *atomic.Uint32 | ||
id32Bit *atomic.Uint32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see a reason for the atomicity of these fields or any of the others (see my previous comment about event defintion immutability
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered everything to be thread-safe. I don't know when events would be added, by how many threads or how they would create them (today we create Signature Events, etc...). But I got what you are saying.
e.mutex.RLock() | ||
defer e.mutex.RUnlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mutex has any meaning here?
Unless used with another operation, I don't think it has
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a single mutex for any map operation. Reading the length of a map isn't thread-safe (nor atomic) in Golang and, in order for this function to be thread-safe you need to read-lock the entire map, the way I see.
Okay, Event Definition in a declarative way, and immutable.
Ok, I'll wrap a event definition into a "events used" "state pattern design" (that is what you two are requesting basically), where I can save states of immutable objects and control thread-concurrency only for the mutable states.
Thanks, no need for setters on immutable fields though. I'll close this PR now and come up with something else then. |
commit f46fa02 (HEAD -> turn-evt-def-instantiable, rafaeldtinoco/turn-evt-def-instantiable)
Author: Rafael David Tinoco rafaeldtinoco@gmail.com
Date: Thu Jun 29 22:52:53 2023
commit 6b1a9be
Author: Rafael David Tinoco rafaeldtinoco@gmail.com
Date: Thu Jun 29 22:53:05 2023