Multilevel cache #1064

Merged

vladem merged 2 commits into awslabs:main from vladem:multilevel-cache

Nov 6, 2024

Contributor

vladem commented Oct 15, 2024 (edited)

Description of change

Allow using both caches when the --cache-express <bucket> and --cache <directory> options are specified together. The local cache is queried first.
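The lookup order described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `BlockCache` trait, `MemoryCache`, and `TwoLevelCache` are hypothetical names, and copying a shared-cache hit back into the local cache is an assumption about how such a cache might behave.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified cache interface (not Mountpoint's actual trait).
pub trait BlockCache {
    fn get_block(&self, key: &str, block_idx: u64) -> Option<Vec<u8>>;
    fn put_block(&mut self, key: &str, block_idx: u64, data: Vec<u8>);
}

/// In-memory stand-in used for both levels in this sketch.
#[derive(Default)]
pub struct MemoryCache {
    blocks: HashMap<(String, u64), Vec<u8>>,
}

impl BlockCache for MemoryCache {
    fn get_block(&self, key: &str, block_idx: u64) -> Option<Vec<u8>> {
        self.blocks.get(&(key.to_owned(), block_idx)).cloned()
    }

    fn put_block(&mut self, key: &str, block_idx: u64, data: Vec<u8>) {
        self.blocks.insert((key.to_owned(), block_idx), data);
    }
}

/// Two-level cache: `local` (e.g. disk) is queried first, `shared` second.
pub struct TwoLevelCache<L: BlockCache, S: BlockCache> {
    pub local: L,
    pub shared: S,
}

impl<L: BlockCache, S: BlockCache> TwoLevelCache<L, S> {
    pub fn get_block(&mut self, key: &str, block_idx: u64) -> Option<Vec<u8>> {
        // 1. Local cache first: cheapest lookup.
        if let Some(data) = self.local.get_block(key, block_idx) {
            return Some(data);
        }
        // 2. Fall back to the shared cache; `?` returns None on a miss.
        let data = self.shared.get_block(key, block_idx)?;
        // Assumption: a shared-cache hit is copied into the local cache.
        self.local.put_block(key, block_idx, data.clone());
        Some(data)
    }

    pub fn put_block(&mut self, key: &str, block_idx: u64, data: Vec<u8>) {
        // Write the same block to both levels.
        self.local.put_block(key, block_idx, data.clone());
        self.shared.put_block(key, block_idx, data);
    }
}
```

In this sketch a miss at both levels returns `None`, and the caller would then fetch the block from the source bucket and call `put_block`.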

Relevant issues: No

Does this change impact existing behavior?

No.

Does this change need a changelog entry in any of the crates?

Yes, it will be added in a future PR.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and I agree to the terms of the Developer Certificate of Origin (DCO).

vladem temporarily deployed to PR integration tests

October 15, 2024 14:22

— with

GitHub Actions Inactive

vladem commented

mountpoint-s3/src/data_cache/express_data_cache.rs (outdated, resolved)

vladem requested review from passaro and muddyfish

October 15, 2024 15:05

muddyfish reviewed

mountpoint-s3-client/src/mock_client.rs (outdated, resolved)

mountpoint-s3/src/cli.rs (outdated, resolved)

mountpoint-s3/src/cli.rs (outdated, resolved)

mountpoint-s3/src/cli.rs (outdated, resolved)

mountpoint-s3/src/cli.rs (outdated, resolved)

mountpoint-s3/src/data_cache/express_data_cache.rs (outdated, resolved)

mountpoint-s3/src/data_cache/multilevel_cache.rs (outdated, resolved)

mountpoint-s3/src/data_cache/multilevel_cache.rs (outdated, resolved)

mountpoint-s3/src/data_cache/multilevel_cache.rs (outdated, resolved)

mountpoint-s3/src/data_cache/multilevel_cache.rs (resolved)

passaro reviewed

mountpoint-s3-client/src/mock_client.rs (resolved)

mountpoint-s3/src/cli.rs (outdated, resolved)

mountpoint-s3/src/cli.rs (outdated, resolved)

mountpoint-s3/src/data_cache/express_data_cache.rs (outdated, resolved)

mountpoint-s3/src/data_cache/express_data_cache.rs (outdated, resolved)

mountpoint-s3/src/data_cache/multilevel_cache.rs (outdated, resolved)

mountpoint-s3/src/data_cache/multilevel_cache.rs (outdated, resolved)

mountpoint-s3/src/data_cache/multilevel_cache.rs (outdated, resolved)

mountpoint-s3/src/data_cache/multilevel_cache.rs (resolved)

vladem force-pushed the multilevel-cache branch from 8d9efd6 to 6a8e202

October 31, 2024 15:56

vladem temporarily deployed to PR integration tests

October 31, 2024 15:56

— with

GitHub Actions Inactive

vladem requested review from passaro and muddyfish

October 31, 2024 16:11

muddyfish reviewed

mountpoint-s3/src/data_cache/multilevel_cache.rs (outdated, resolved)

mountpoint-s3/src/data_cache/multilevel_cache.rs (outdated, resolved)

mountpoint-s3/src/data_cache/multilevel_cache.rs (resolved)

mountpoint-s3/src/data_cache/multilevel_cache.rs (outdated, resolved)

passaro reviewed

mountpoint-s3/src/cli.rs (outdated, resolved)

mountpoint-s3/src/data_cache/multilevel_cache.rs (outdated, resolved)

mountpoint-s3/src/data_cache/multilevel_cache.rs (resolved)

mountpoint-s3/src/data_cache/multilevel_cache.rs (outdated, resolved)

mountpoint-s3/src/data_cache/multilevel_cache.rs (outdated, resolved)

mountpoint-s3/src/data_cache/multilevel_cache.rs (outdated, resolved)

mountpoint-s3/src/data_cache/multilevel_cache.rs (resolved)

vladem temporarily deployed to PR integration tests

November 1, 2024 18:03

— with

GitHub Actions Inactive

vladem had a problem deploying to PR integration tests

November 1, 2024 18:03

— with

GitHub Actions Failure

vladem temporarily deployed to PR integration tests

November 3, 2024 16:45

— with

GitHub Actions Inactive

vladem requested review from passaro and muddyfish

November 3, 2024 17:09

passaro reviewed

Contributor

passaro left a comment

Just a couple of comments.

mountpoint-s3/src/cli.rs (outdated, resolved)

mountpoint-s3/src/cli.rs (resolved)

mountpoint-s3/src/data_cache/express_data_cache.rs (outdated, resolved)

mountpoint-s3/src/data_cache/multilevel_cache.rs (resolved)


Commit a232279: Multilevel cache

Signed-off-by: Vlad Volodkin <vlaad@amazon.com>

vladem force-pushed the multilevel-cache branch from 69d9063 to a232279

November 5, 2024 17:27

vladem temporarily deployed to PR integration tests

November 5, 2024 17:27

— with

GitHub Actions Inactive

vladem temporarily deployed to PR integration tests

November 5, 2024 17:28

— with

GitHub Actions Inactive

vladem requested a review from passaro

November 5, 2024 19:24

passaro reviewed

mountpoint-s3/src/data_cache/multilevel_cache.rs (resolved)

mountpoint-s3/src/data_cache/multilevel_cache.rs Outdated

+    MultilevelDataCache<DiskCache, ExpressCache, Runtime>
+{
+    pub fn new(disk_cache: Arc<DiskCache>, express_cache: ExpressCache, runtime: Runtime) -> Self {
+        // Method `MultilevelDataCache::block_size` relies on block sizes of both caches to be equal.

Contributor

passaro Nov 6, 2024

A bit convoluted. We use the same blocks at both levels, so they need to have the same size. I'd mention it as a requirement in this method's rustdoc.
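One way the suggestion could look is sketched below: the equal-block-size requirement is stated in the constructor's rustdoc and checked on construction. The `HasBlockSize` trait, `FixedCache`, and `TwoLevel` are hypothetical stand-ins rather than the types in this PR, and whether the real constructor should panic on a mismatch is an assumption.

```rust
/// Hypothetical minimal interface exposing a cache's block size in bytes.
pub trait HasBlockSize {
    fn block_size(&self) -> u64;
}

/// Stand-in cache with a fixed block size, used only for this sketch.
pub struct FixedCache {
    pub block_size: u64,
}

impl HasBlockSize for FixedCache {
    fn block_size(&self) -> u64 {
        self.block_size
    }
}

/// Hypothetical two-level cache holding a disk cache and an express cache.
pub struct TwoLevel<D, E> {
    pub disk: D,
    pub express: E,
}

impl<D: HasBlockSize, E: HasBlockSize> TwoLevel<D, E> {
    /// Combine a disk cache and an express cache into one cache.
    ///
    /// # Requirements
    ///
    /// Both caches must use the same block size, because the same blocks
    /// are stored at both levels.
    ///
    /// # Panics
    ///
    /// Panics if the two block sizes differ.
    pub fn new(disk: D, express: E) -> Self {
        assert_eq!(
            disk.block_size(),
            express.block_size(),
            "block sizes of both caches must be equal"
        );
        Self { disk, express }
    }

    /// Block size shared by both levels (guaranteed equal by `new`).
    pub fn block_size(&self) -> u64 {
        self.disk.block_size()
    }
}
```

With the requirement documented on `new`, `block_size` can return either level's value without an explanatory comment inside the constructor body.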

mountpoint-s3/src/cli.rs Outdated

@@ -298,10 +309,10 @@ pub struct CliArgs {
     #[cfg(feature = "block_size")]
     #[clap(
         long,
-        help = "Size of a cache block in KiB [Default: 1024 (1 MiB) for disk cache, 512 (512 KiB) for S3 Express cache]",
+        help = "Size of a cache block in KiB [Default: 1024 (1 MiB) for disk cache and for S3 Express cache]",

Contributor

passaro Nov 6, 2024

Suggested change

-        help = "Size of a cache block in KiB [Default: 1024 (1 MiB) for disk cache and for S3 Express cache]",
+        help = "Size of a cache block in KiB [Default: 1024 (1 MiB)]",

mountpoint-s3/src/data_cache/express_data_cache.rs (resolved)


Commit bc1fa29: Comments

Signed-off-by: Vlad Volodkin <vlaad@amazon.com>

vladem temporarily deployed to PR integration tests

November 6, 2024 13:53

— with

GitHub Actions Inactive

vladem requested a review from passaro

November 6, 2024 14:38

passaro approved these changes

Contributor

passaro left a comment

LGTM

mountpoint-s3/src/data_cache/express_data_cache.rs

@@ -89,6 +91,7 @@ where
     }
     buffer.extend_from_slice(&body);
+    // Ensure the flow-control window is large enough.
     result.as_mut().increment_read_window(self.block_size as usize);

Contributor

passaro Nov 6, 2024

This seems unnecessary now, doesn't it?

Contributor

passaro Nov 6, 2024

Happy to keep it for now and review when we optimize for the single chunk case (see TODO above).

Contributor Author

vladem Nov 6, 2024

We'll need to account for the case where the block object is larger than block_size for some reason. If we simply remove this line, the read may freeze. If we keep it as it is now, MP may attempt to read an unbounded amount of data into RAM.

This requires a bit more thinking, so I agree that it's better to address it in a follow-up PR.
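To make the trade-off concrete, here is one possible direction, sketched against a simulated stream. It is not what this PR or any follow-up implements, and `WindowedStream`, `ReadError`, and `read_block` are hypothetical names, not the mountpoint-s3-client API. The idea is to re-open the window only by the number of bytes already consumed and to fail once the object exceeds `block_size`, so the read neither stalls nor buffers an unbounded amount of data.

```rust
use std::collections::VecDeque;

/// Simulated flow-controlled body stream (hypothetical, for illustration only).
pub struct WindowedStream {
    chunks: VecDeque<Vec<u8>>,
    /// Number of bytes the producer is still allowed to deliver.
    window: usize,
}

impl WindowedStream {
    pub fn new(chunks: Vec<Vec<u8>>, initial_window: usize) -> Self {
        Self { chunks: chunks.into(), window: initial_window }
    }

    pub fn increment_read_window(&mut self, len: usize) {
        self.window += len;
    }

    pub fn is_finished(&self) -> bool {
        self.chunks.is_empty()
    }

    /// Returns the next chunk if it fits in the window, `None` otherwise.
    pub fn next_chunk(&mut self) -> Option<Vec<u8>> {
        let fits = self.chunks.front().map_or(false, |c| c.len() <= self.window);
        if !fits {
            return None;
        }
        let chunk = self.chunks.pop_front()?;
        self.window -= chunk.len();
        Some(chunk)
    }
}

#[derive(Debug, PartialEq)]
pub enum ReadError {
    /// The object holds more data than one cache block.
    TooLarge,
    /// Data remains but the window is too small to deliver it.
    Stalled,
}

/// Read one block, never buffering more than `block_size` bytes.
pub fn read_block(stream: &mut WindowedStream, block_size: usize) -> Result<Vec<u8>, ReadError> {
    let mut buffer = Vec::with_capacity(block_size);
    loop {
        match stream.next_chunk() {
            Some(chunk) => {
                if buffer.len() + chunk.len() > block_size {
                    // Stop instead of growing the buffer without bound.
                    return Err(ReadError::TooLarge);
                }
                buffer.extend_from_slice(&chunk);
                // Re-open the window only by what was consumed.
                stream.increment_read_window(chunk.len());
            }
            None if stream.is_finished() => return Ok(buffer),
            None => return Err(ReadError::Stalled),
        }
    }
}
```

In this model the `Stalled` variant corresponds to the "read may freeze" case, and `TooLarge` replaces the unbounded read with an explicit error that the caller could treat as a cache miss.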

vladem added this pull request to the merge queue

Merged via the queue into awslabs:main with commit 53197c9

23 checks passed

vladem deleted the multilevel-cache branch

November 6, 2024 17:17
