disk cache: store a data integrity header for non-CAS blobs
The header is made up of three fields:
1) Little-endian int32 (4 bytes) representing the REAPIv2
   DigestFunction.
2) Little-endian int64 (8 bytes) representing the number
   of bytes in the blob.
3) The hash bytes from the digest, length determined by
   the particular DigestFunction.
   (32 for SHA256, 20 for SHA1, 16 for MD5).

Note that we currently only support SHA256, however.

This header is simple to parse, and does not require buffering the
entire blob in memory if you just want the data.

To distinguish blobs with and without this header, we use new
directories for the affected blobs: ac.v2/ instead of ac/ and
similarly for raw/.

We do not use this header to actually verify data yet, and we
still os.File.Sync() after file writes (#67).

This also includes a slightly refactored version of PR #123
(load the items from disk concurrently) by @bdittmer.
mostynb committed Feb 26, 2020
1 parent 1d85a0a commit 594719e
Showing 5 changed files with 843 additions and 149 deletions.
2 changes: 2 additions & 0 deletions cache/disk/BUILD.bazel
@@ -4,6 +4,7 @@ go_library(
name = "go_default_library",
srcs = [
"disk.go",
"load.go",
"lru.go",
],
importpath = "github.com/buchgr/bazel-remote/cache/disk",
@@ -29,5 +30,6 @@ go_test(
"//cache:go_default_library",
"//cache/http:go_default_library",
"//utils:go_default_library",
"@com_github_bazelbuild_remote_apis//build/bazel/remote/execution/v2:go_default_library",
],
)