Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

safekeeper: fan out from single wal reader to multiple shards #10190

Merged
merged 25 commits into from
Jan 15, 2025

Conversation

VladLazar
Copy link
Contributor

@VladLazar VladLazar commented Dec 18, 2024

Problem

Safekeepers currently decode and interpret WAL for each shard separately.
This is wasteful in terms of CPU memory usage - we've seen this in profiles.

Summary of changes

Fan-out interpreted WAL to multiple shards.
The basic is that wal decoding and interpretation happens in a separate tokio task and senders
attach to it. Senders only receive batches concerning their shard and only past the Lsn they've last seen.

Fan-out is gated behind the wal_reader_fanout safekeeper flag (disabled by default for now).

When fan-out is enabled, it might be desirable to control the absolute delta between the
current position and a new shard's desired position (i.e. how far behind or ahead a shard may be).
max_delta_for_fanout is a new optional safekeeper flag which dictates whether to create a new
WAL reader or attach to the existing one. By default, this behaviour is disabled. Let's consider enabling
it if we spot the need for it in the field.

Testing

Tests passed here with wal reader fanout enabled
as of 34f6a71.

Related: #9337
Epic: #9329

@VladLazar VladLazar force-pushed the vlad/sk-fanout branch 3 times, most recently from 0ddecad to fb0c47c Compare December 19, 2024 12:20
@VladLazar VladLazar changed the title Vlad/sk fanout safekeeper: fan out from single wal reader to multiple shards Dec 19, 2024
Copy link

github-actions bot commented Dec 19, 2024

7304 tests run: 6933 passed, 0 failed, 371 skipped (full report)


Flaky tests (4)

Postgres 17

Postgres 16

  • test_physical_replication_config_mismatch_max_locks_per_transaction: release-arm64

Postgres 15

Code coverage* (full report)

  • functions: 33.8% (8418 of 24906 functions)
  • lines: 49.2% (70367 of 142895 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
98b57a5 at 2025-01-15T12:19:37.235Z :recycle:

@VladLazar VladLazar force-pushed the vlad/sk-fanout branch 3 times, most recently from 34377a1 to f9956dc Compare January 7, 2025 18:25
@VladLazar VladLazar changed the base branch from main to vlad/fan-out-wal-decoder January 7, 2025 18:26
@VladLazar VladLazar force-pushed the vlad/sk-fanout branch 2 times, most recently from ad551e6 to 34f6a71 Compare January 8, 2025 18:33
@VladLazar VladLazar marked this pull request as ready for review January 8, 2025 18:36
@VladLazar VladLazar requested a review from a team as a code owner January 8, 2025 18:36
@VladLazar VladLazar requested review from erikgrinaker and removed request for a team January 8, 2025 18:36
Copy link
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, this came out a lot simpler than I had feared.

LGTM, except for the shutdown/cancellation logic which I haven't reviewed in detail. Does it need more work, re: TODOs, or is this good to go?

safekeeper/src/metrics.rs Outdated Show resolved Hide resolved
safekeeper/src/test_utils.rs Show resolved Hide resolved
safekeeper/src/send_wal.rs Show resolved Hide resolved
safekeeper/src/send_wal.rs Show resolved Hide resolved
safekeeper/src/send_wal.rs Outdated Show resolved Hide resolved
safekeeper/src/wal_reader_stream.rs Outdated Show resolved Hide resolved
safekeeper/src/send_interpreted_wal.rs Show resolved Hide resolved
safekeeper/src/send_interpreted_wal.rs Show resolved Hide resolved
safekeeper/src/send_interpreted_wal.rs Show resolved Hide resolved
safekeeper/src/send_interpreted_wal.rs Outdated Show resolved Hide resolved
Base automatically changed from vlad/fan-out-wal-decoder to main January 15, 2025 11:11
@github-actions github-actions bot requested a review from a team as a code owner January 15, 2025 11:11
Each protocol gets its version of wal sender state.
Future commits will add internal state here, so define an internal
version of the type to be used within the safekeeper crate.
We want to suppport resetting the wal stream at arbitrary wal
positions. This commit reworks the existing streaming wal reader
to suppport this functionality.
The basic idea for wal reader fanout is that wal decoding and
interpretation happens in a separate tokio task and senders
attach to it. Senders only receive batches concerning their shard
and only past the Lsn they've last seen.

When the wal reading task is spawned, it returns a reference through
which it can be controlled. This reference is shared by multiple wal
senders. When all wal senders go out of scope, the reference goes out of
scope and the task is aborted.
The same shard may be attached to two or more different pageservers,
so the fan out reader needs to deal with this scenario too.

Add a `SenderId` to differentiate between senders belonging to the
same shard. Doesn't change the code flow much, but adds some extra
complexity.
@VladLazar VladLazar added this pull request to the merge queue Jan 15, 2025
Merged via the queue into main with commit dbebede Jan 15, 2025
88 checks passed
@VladLazar VladLazar deleted the vlad/sk-fanout branch January 15, 2025 15:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants