Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use struct with pid and Go routine addr for Go BPF maps #1182

Merged
merged 18 commits into from
Sep 24, 2024
Merged

Conversation

marctc
Copy link
Contributor

@marctc marctc commented Sep 19, 2024

After merging #1158, we change behaviour for Go tracer to be shared among different processes.
Currently we are using the goroutine address as key to store data in BPF maps.
This PR changes the key to use the goroutine address and PID of the process in order to avoid
colissions when storing data to maps.

Fixes #1159

@codecov-commenter
Copy link

codecov-commenter commented Sep 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.05%. Comparing base (bd8952e) to head (627cff4).

Additional details and impacted files
@@             Coverage Diff             @@
##             main    #1182       +/-   ##
===========================================
+ Coverage   65.50%   81.05%   +15.54%     
===========================================
  Files         135      136        +1     
  Lines       11492    11501        +9     
===========================================
+ Hits         7528     9322     +1794     
+ Misses       3295     1646     -1649     
+ Partials      669      533      -136     
Flag Coverage Δ
integration-test 57.04% <ø> (?)
k8s-integration-test 58.75% <ø> (-0.07%) ⬇️
oats-test 35.81% <ø> (ø)
unittests 53.17% <ø> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mariomac
Copy link
Contributor

I'm missing something. Shouldn't be each goroutine address unique in the system? I would have expected that the goroutine pointer contains a unique virtual memory address

Copy link
Contributor

@grcevski grcevski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great Marc! There are few other maps we need to fix, but please let me know if you don't want to use this PR and you want to follow-up with another instead and I'll flip this to Approve.

@@ -32,60 +32,80 @@ char __license[] SEC("license") = "Dual MIT/GPL";
// Then it is retrieved in the return uprobes and used to know the HTTP call duration as well as its
// attributes (method, path, and status code).

typedef struct goroutine_key {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to rename this data structure to something like go_addr_key. Because we'll be using it for more than goroutines. I'll show some examples below. Essentially, we need to replace all maps in the Go support where we use a Go address (or value) straight, to be keyed by the PID too.

@@ -56,21 +56,21 @@ struct {

struct {
__uint(type, BPF_MAP_TYPE_LRU_HASH);
__type(key, void *); // key: goroutine
__type(value, void *); // the transport *
__type(key, goroutine_key_t); // key: goroutine
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to do the same for ongoing_grpc_transports and ongoing_streams.

__type(value, grpc_client_func_invocation_t);
__uint(max_entries, MAX_CONCURRENT_REQUESTS);
} ongoing_grpc_client_requests SEC(".maps");

struct {
__uint(type, BPF_MAP_TYPE_LRU_HASH);
__type(key, void *); // key: pointer to the request goroutine
__type(key, goroutine_key_t); // key: pointer to the request goroutine
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ongoing_grpc_header_writes needs to be fixed up too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for grpc_framer_invocation_map too.

@@ -36,8 +36,8 @@ struct {

struct {
__uint(type, BPF_MAP_TYPE_LRU_HASH);
__type(key, void *); // goroutine
__type(value, topic_t); // topic info
__type(key, goroutine_key_t); // goroutine
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to wrap the key for produce_traceparents and ongoing_produce_messages.

bpf/go_redis.h Outdated
__type(key, void *); // key: goroutine id
__type(value, void *); // the *Conn
__type(key, goroutine_key_t); // key: goroutine id
__type(value, void *); // the *Conn
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to do the same for http2_req_map and framer_invocation_map.

@@ -25,8 +25,8 @@ struct {

struct {
__uint(type, BPF_MAP_TYPE_LRU_HASH);
__type(key, void *); // key: goroutine id
__type(value, u32); // correlation id
__type(key, goroutine_key_t); // key: goroutine id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to do the same to kafka_requests.

@marctc
Copy link
Contributor Author

marctc commented Sep 24, 2024

Looks great Marc! There are few other maps we need to fix, but please let me know if you don't want to use this PR and you want to follow-up with another instead and I'll flip this to Approve.

I see, thanks for the feedback. Yeah, I think it'd be better in a different PR

@grcevski
Copy link
Contributor

grcevski commented Sep 24, 2024

I'm missing something. Shouldn't be each goroutine address unique in the system? I would have expected that the goroutine pointer contains a unique virtual memory address

You are right, but Go can grown and shrink the allocated memory, so theoretically, can't we have stale data from arena of the virtual address space that belonged to process A and then that same memory now is part of process B? Like one process reduces the available heap and frees that page, then another ends up taking it. Since we have a single tracer now, we can potentially access stale data, since a lot of our maps are LRU. I mean it's very unlikely, but it might happen. Same goes for process ending, and new one reclaiming the memory or part of it, e.g restart.

We must do it for the streamID keyed variables, this is a bug anyway even today.

Copy link
Contributor

@grcevski grcevski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@rafaelroquetto
Copy link
Contributor

Please don't merge this yet, it seems to be tracking/comitting binary .o files.

@marctc marctc merged commit fb456f1 into main Sep 24, 2024
12 checks passed
@marctc marctc deleted the use_struct_go_maps branch September 24, 2024 16:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use PIDs + address as key of maps in Go tracer.
5 participants