safekeeper: decode and interpret for multiple shards in one go #10201

Merged
merged 19 commits into from
Jan 15, 2025

Conversation

VladLazar
Contributor

Problem

Currently, we call InterpretedWalRecord::from_bytes_filtered
from each shard. To serve multiple shards at the same time,
the API needs to allow for enquiring about multiple shards.

Summary of changes

This commit tweaks it in a pretty brute-force way. Naively, we could
just compute the shard for a key, but pre- and post-split shards
may be subscribed at the same time, so doing it efficiently is more
complex.
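
To illustrate why a single "compute the shard for this key" lookup is not enough, here is a minimal sketch of the brute-force idea. It is not the code from this PR: `ShardId`, `owns`, and `fan_out` are hypothetical names, and the modulo placement rule is a toy stand-in for the real hashing and striping logic. The point is only that each key is tested against every subscribed shard, so shards with different shard counts (before and after a split) can both receive the same key.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified shard identity: shard `number` out of `count` shards.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ShardId {
    pub number: u8,
    pub count: u8,
}

impl ShardId {
    /// Toy placement rule (an assumption for illustration only):
    /// a key belongs to the shard whose number equals `key % count`.
    /// A count of 0 or 1 is treated as unsharded and owns every key.
    pub fn owns(&self, key: u64) -> bool {
        self.count <= 1 || key % u64::from(self.count) == u64::from(self.number)
    }
}

/// Brute-force fan-out: check every key against every subscribed shard.
/// Cost is O(keys * shards), but it stays correct when the subscribed
/// shards come from different shard counts.
pub fn fan_out(keys: &[u64], shards: &[ShardId]) -> HashMap<ShardId, Vec<u64>> {
    // One output bucket per distinct subscribed shard.
    let mut out: HashMap<ShardId, Vec<u64>> =
        shards.iter().map(|s| (*s, Vec::new())).collect();
    for &key in keys {
        for (shard, bucket) in out.iter_mut() {
            if shard.owns(key) {
                bucket.push(key);
            }
        }
    }
    out
}
```

With a pre-split shard 0 of 2 and post-split shards 0 of 4 and 2 of 4 subscribed together, key 4 lands in both the 0-of-2 and the 0-of-4 bucket, which a single key-to-shard mapping could not express.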


github-actions bot commented Dec 19, 2024

7293 tests run: 6928 passed, 0 failed, 365 skipped (full report)


Code coverage* (full report)

  • functions: 32.7% (8048 of 24644 functions)
  • lines: 47.7% (66877 of 140119 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
4512aaa at 2025-01-13T13:59:11.390Z :recycle:

@VladLazar VladLazar marked this pull request as ready for review December 19, 2024 13:39
@VladLazar VladLazar requested review from a team as code owners December 19, 2024 13:39
Contributor

@erikgrinaker erikgrinaker left a comment


We're adding a fair number of heap allocations here. I'll leave it to you to decide whether we need to optimize these now or leave it for later (benchmarks would be good).

libs/wal_decoder/src/decoder.rs (outdated, resolved)
libs/wal_decoder/src/serialized_batch.rs (outdated, resolved)
libs/wal_decoder/src/decoder.rs (outdated, resolved)
libs/wal_decoder/src/serialized_batch.rs (outdated, resolved)
pageserver/src/import_datadir.rs (outdated, resolved)
@VladLazar
Contributor Author

> We're adding a fair number of heap allocations here. I'll leave it to you to decide whether we need to optimize these now or leave it for later (benchmarks would be good).

Yeah, I felt quite naughty writing this stuff. I'll dust off my benchmark to see how much we need to optimise here.

@VladLazar VladLazar removed the request for review from lubennikovaav December 19, 2024 16:30
@VladLazar VladLazar force-pushed the vlad/fan-out-wal-decoder branch from 8a04783 to 5fbec70 Compare December 20, 2024 12:59
@VladLazar VladLazar requested a review from erikgrinaker January 6, 2025 11:41
The stripe size cannot be changed dynamically. For the stripe size to change,
the shard count needs to change too.
decode-interpret-wal/unsharded
                        time:   [439.01 ms 439.23 ms 439.48 ms]
                        thrpt:  [291.25 MiB/s 291.42 MiB/s 291.56 MiB/s]
decode-interpret-wal/8/8-shards
                        time:   [934.73 ms 935.45 ms 936.41 ms]
                        thrpt:  [136.69 MiB/s 136.83 MiB/s 136.94 MiB/s]
decode-interpret-wal/4/8-shards
                        time:   [646.55 ms 646.72 ms 646.89 ms]
                        thrpt:  [197.87 MiB/s 197.92 MiB/s 197.97 MiB/s]
decode-interpret-wal/2/8-shards
                        time:   [487.38 ms 487.56 ms 487.76 ms]
                        thrpt:  [262.43 MiB/s 262.53 MiB/s 262.63 MiB/s]
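
As a quick sanity check on the numbers above, the sketch below derives two figures from the median time and throughput of each run: the input size they imply, and the slowdown relative to the unsharded run. The helper names are hypothetical and not part of the benchmark. Reading the `N/8-shards` labels as "N shards served out of 8" is an assumption.

```rust
/// Input size in MiB implied by a runtime (ms) and a throughput (MiB/s).
pub fn implied_input_mib(time_ms: f64, thrpt_mib_per_s: f64) -> f64 {
    time_ms / 1000.0 * thrpt_mib_per_s
}

/// Slowdown factor of a run relative to a baseline, both in milliseconds.
pub fn slowdown(baseline_ms: f64, run_ms: f64) -> f64 {
    run_ms / baseline_ms
}

/// Median (time in ms, throughput in MiB/s) pairs copied from the output above.
pub const UNSHARDED: (f64, f64) = (439.23, 291.42);
pub const SHARDS_8_OF_8: (f64, f64) = (935.45, 136.83);
pub const SHARDS_4_OF_8: (f64, f64) = (646.72, 197.92);
pub const SHARDS_2_OF_8: (f64, f64) = (487.56, 262.53);
```

Each run works out to roughly 128 MiB of input, and the slowdowns relative to the unsharded run come to about 2.13x, 1.47x, and 1.11x for the 8, 4, and 2 variants respectively.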
@VladLazar VladLazar force-pushed the vlad/fan-out-wal-decoder branch from 22d6e91 to 8e96693 Compare January 7, 2025 18:18
Contributor

@erikgrinaker erikgrinaker left a comment


Just nits, take them or leave them.

libs/pageserver_api/src/shard.rs (outdated, resolved)
libs/pageserver_api/src/shard.rs (outdated, resolved)
libs/wal_decoder/src/decoder.rs (outdated, resolved)
libs/wal_decoder/src/decoder.rs (outdated, resolved)
libs/wal_decoder/src/decoder.rs (resolved)
libs/wal_decoder/src/decoder.rs (outdated, resolved)
libs/wal_decoder/src/serialized_batch.rs (resolved)
libs/wal_decoder/src/serialized_batch.rs (outdated, resolved)
libs/wal_decoder/Cargo.toml (outdated, resolved)
@VladLazar
Contributor Author

Doing a test run with the changes from #10190 here before merging.

@VladLazar VladLazar added this pull request to the merge queue Jan 15, 2025
Merged via the queue into main with commit 1577430 Jan 15, 2025
84 checks passed
@VladLazar VladLazar deleted the vlad/fan-out-wal-decoder branch January 15, 2025 11:11
2 participants