improve flag parsing performance continuation #4198

Merged on Aug 23, 2024 (2 commits)

Member

@geyslan geyslan commented Jul 23, 2024

1. Explain what the PR does

3adc191 chore(events): parseSyscall with no lock

Bypass the lock contention accessing the read-only map directly.

115a85c chore(events): reduce ParseArgs complexity

This reduces the complexity of the ParseArgs function by extracting
different parts of it into separate helpers. This makes the code
easier to read and understand and also makes it smaller.

Before:
1a080c0      14863 T github.com/aquasecurity/tracee/pkg/events.ParseArgs

After:
1a0cee0      11632 T github.com/aquasecurity/tracee/pkg/events.ParseArgs

It's a reduction of 21.73% in the size of the function, which should also
improve cache locality and performance.

The benchmark tests also show a significant improvement in performance,
with a reduction of 6.4% in the time it takes to parse the arguments.

Running tool: /home/gg/.goenv/versions/1.22.4/bin/go test -benchmem
-run=^$ -tags ebpf -bench ^(BenchmarkParseArgsPrev|BenchmarkParseArgs)$
github.com/aquasecurity/tracee/pkg/events -benchtime=20000000x -race

goos: linux
goarch: amd64
pkg: github.com/aquasecurity/tracee/pkg/events
cpu: AMD Ryzen 9 7950X 16-Core Processor
=== RUN   BenchmarkParseArgsPrev
BenchmarkParseArgsPrev
BenchmarkParseArgsPrev-32               20000000              2002 ns/op               0 B/op          0 allocs/op
=== RUN   BenchmarkParseArgs
BenchmarkParseArgs
BenchmarkParseArgs-32                   20000000              1873 ns/op               0 B/op          0 allocs/op
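
The two percentages quoted above can be recomputed from the raw figures. The following is a minimal sketch; `percentReduction` is a hypothetical helper introduced only for illustration, and the inputs are the symbol sizes and ns/op values reported in this description.

```go
package main

import "fmt"

// percentReduction reports how much smaller `after` is than `before`,
// expressed as a percentage of `before`.
// Hypothetical helper, not part of the Tracee code base.
func percentReduction(before, after float64) float64 {
	return (before - after) / before * 100
}

func main() {
	// Symbol sizes of events.ParseArgs in bytes, before and after the refactor.
	fmt.Printf("function size: %.2f%%\n", percentReduction(14863, 11632))

	// ns/op of BenchmarkParseArgsPrev and BenchmarkParseArgs.
	fmt.Printf("time per op:   %.2f%%\n", percentReduction(2002, 1873))
}
```

Because single benchmark runs are noisy, a difference of a few percent is best confirmed over repeated runs before drawing conclusions.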

2. Explain how to test it

3. Other comments

Note that the benchmark measurements are non-deterministic.

@geyslan geyslan marked this pull request as ready for review August 6, 2024 20:20
@geyslan geyslan requested a review from rscampos August 6, 2024 20:22
arg.Type = "string"
}
// bypass the lock contention accessing the read-only map directly
def, ok := CoreEvents[ID(id)]
Contributor

Is there any issue with accessing the map directly instead of going through the Core API?

Member Author

Let me double check this.

Member Author

Well, good catch @rscampos. Currently this change isn't a problem, but in the future it will cause data races due to the upcoming runtime changes to CoreEvents. That made me wonder: why keep the non-changing definitions (syscalls) under lock contention? Would it be feasible to segregate them from the sigs? @yanivagman @NDStrahilevitz WDYT?

Member Author

Well, this brings a significant improvement in CPU time, although it will cause some concurrency issues in the future. I'm in favor of merging it (giving more information in the comment) to be a starting point for discussion about internal event segregation. If that's not feasible, we can easily fall back to the lock-contention way of retrieving the definition.
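
The segregation idea raised in this thread could look roughly like the sketch below. It is not Tracee's implementation: `ID`, `Definition`, `registry`, `newRegistry`, `Add` and `Get` are all hypothetical names chosen for illustration. The assumption is that syscall definitions are fixed at startup, so they can live in a map that is never written after construction and can be read without a lock, while definitions added at runtime stay behind a `sync.RWMutex`.

```go
package main

import (
	"fmt"
	"sync"
)

// ID and Definition are hypothetical stand-ins, not Tracee's real types.
type ID int32

type Definition struct {
	Name string
}

// registry keeps immutable definitions apart from runtime-mutable ones.
type registry struct {
	static map[ID]Definition // filled once in newRegistry, never written again

	mu      sync.RWMutex
	dynamic map[ID]Definition // guarded by mu
}

// newRegistry copies the static definitions so callers cannot mutate them later.
func newRegistry(static map[ID]Definition) *registry {
	frozen := make(map[ID]Definition, len(static))
	for id, def := range static {
		frozen[id] = def
	}
	return &registry{static: frozen, dynamic: make(map[ID]Definition)}
}

// Add registers a runtime definition; static IDs cannot be overridden.
func (r *registry) Add(id ID, def Definition) error {
	if _, ok := r.static[id]; ok {
		return fmt.Errorf("id %d is a static definition", id)
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	r.dynamic[id] = def
	return nil
}

// Get reads the static map without locking (safe because it is never
// written after construction) and falls back to the locked dynamic map.
func (r *registry) Get(id ID) (Definition, bool) {
	if def, ok := r.static[id]; ok {
		return def, true
	}
	r.mu.RLock()
	defer r.mu.RUnlock()
	def, ok := r.dynamic[id]
	return def, ok
}

func main() {
	r := newRegistry(map[ID]Definition{0: {Name: "read"}, 1: {Name: "write"}})
	_ = r.Add(1000, Definition{Name: "custom_signature"})

	for _, id := range []ID{1, 1000, 42} {
		def, ok := r.Get(id)
		fmt.Println(id, def.Name, ok)
	}
}
```

The lock-free path is only sound while nothing writes to the static map after construction; Go maps do not tolerate a write concurrent with reads, which is the data race concern raised above.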

Contributor

rscampos commented Aug 8, 2024

I replicated both benchmarks; these are the results:

The parseSyscall function improves by around 11%:

% GOOS=linux CC=clang CGO_CFLAGS="-I/vagrant/dist/libbpf -I/vagrant/3rdparty/libbpf/include/uapi" CGO_LDFLAGS="-L/vagrant/dist/libbpf/obj -lbpf " go test -benchmem -run=^$ -bench '^(Benchmark(_parseSyscallWarm|_parseSyscallOld|_parseSyscall))$' github.com/aquasecurity/tracee/pkg/events -benchtime=1000000x

goos: linux
goarch: amd64
pkg: github.com/aquasecurity/tracee/pkg/events
cpu: Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz
Benchmark_parseSyscallOld-8      1000000             27383 ns/op             896 B/op         31 allocs/op
Benchmark_parseSyscall-8         1000000             24572 ns/op             896 B/op         31 allocs/op
PASS
ok      github.com/aquasecurity/tracee/pkg/events       74.236s
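
To isolate the cost of the lock itself from the rest of parseSyscall, a standalone comparison can help. The sketch below is an assumption-laden toy, not the benchmark used in this PR: `defs`, `lookupLocked`, `lookupDirect` and `bench` are hypothetical names, and the map holds three made-up entries. It uses `testing.Benchmark`, which can run a benchmark function outside of `go test`.

```go
package main

import (
	"fmt"
	"sync"
	"testing"
)

// defs is a toy read-only table standing in for the event definitions.
var defs = map[int]string{0: "read", 1: "write", 2: "open"}

// mu guards defs in the locked variant only.
var mu sync.RWMutex

// lookupLocked takes a read lock on every access.
func lookupLocked(id int) (string, bool) {
	mu.RLock()
	defer mu.RUnlock()
	name, ok := defs[id]
	return name, ok
}

// lookupDirect reads the map with no lock; this is safe only while
// no goroutine writes to defs.
func lookupDirect(id int) (string, bool) {
	name, ok := defs[id]
	return name, ok
}

// bench runs the lookup b.N times and returns the measured result.
func bench(lookup func(int) (string, bool)) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			lookup(i % len(defs))
		}
	})
}

func main() {
	fmt.Println("locked:", bench(lookupLocked))
	fmt.Println("direct:", bench(lookupDirect))
}
```

This single-goroutine loop measures only the uncontended lock overhead; contention from parallel readers and writers would need `b.RunParallel` or a similar setup, and the numbers will vary between machines and runs.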

For ParseArgs, I was not able to see a big improvement in my environment, but the refactored code is cleaner.

GOOS=linux CC=clang GOARCH=amd64 CGO_CFLAGS="-I/vagrant/dist/libbpf -I/vagrant/3rdparty/libbpf/include/uapi" CGO_LDFLAGS="-L/vagrant/dist/libbpf/obj -lbpf " go test -benchmem -run=^$ -bench '^(Benchmark(ParseArgsWarm|ParseArgsOld|ParseArgsNew))$'  github.com/aquasecurity/tracee/pkg/events -benchtime=20000000x
goos: linux
goarch: amd64
pkg: github.com/aquasecurity/tracee/pkg/events
cpu: Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz
BenchmarkParseArgsOld-8         20000000              1440 ns/op               0 B/op          0 allocs/op
BenchmarkParseArgsNew-8         20000000              1472 ns/op               0 B/op          0 allocs/op
PASS
ok      github.com/aquasecurity/tracee/pkg/events       81.756s

Regarding the size of the function: the new version is 3455 bytes (22.33%) smaller, but the size of the Tracee binary increases by 736 bytes (probably because you created another file with the helper functions).

New: 1ab4e20      12016 T github.com/aquasecurity/tracee/pkg/events.ParseArgs
Old: 1ab5160      15471 T github.com/aquasecurity/tracee/pkg/events.ParseArgs

Contributor

@rscampos rscampos left a comment

LGTM

Member Author

geyslan commented Aug 9, 2024

Regarding the size of the function: the new version is 3455 bytes (22.33%) smaller, but the size of the Tracee binary increases by 736 bytes (probably because you created another file with the helper functions).

Exactly. Thanks for bringing that data.

@geyslan geyslan merged commit 6034617 into aquasecurity:main Aug 23, 2024
30 checks passed