Implement optional persistent registry lookup cache and better 429 handling #13
Conversation
This adds a `--cache` flag to builds which will read from a cache file and write back out to it a set of images that are ~safe to cache (and because it's just a file, we can clear cache entries trivially by deleting the relevant entries from the file).

This also adds better 429 behavior that's twofold:

1. cap our maximum registry requests at ~500/min with an initial burst of 100 unless we hit a 429 (which should, in theory, keep us closer to staying under the Hub abuse rate limits)
2. actually retry requests when they give us a 429 response (capping both new requests and retries at the same ~500/min, which is ~8/sec)

As a final bonus, I added/generated an appropriate cache file for our local test suite, which brings the total number of actual Hub/registry requests our test suite makes down to one single image lookup.
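To make the rate-limiting scheme above concrete, here's a minimal Go sketch, assuming a token-bucket limiter from `golang.org/x/time/rate`; `registryLimiter` and `doRegistryRequest` are illustrative names, not this PR's actual identifiers:

```go
package registry

import (
	"context"
	"net/http"
	"time"

	"golang.org/x/time/rate"
)

// ~500 requests/min (~8.33/sec) with an initial burst of 100.
var registryLimiter = rate.NewLimiter(rate.Limit(500.0/60.0), 100)

// doRegistryRequest waits for a limiter token before every attempt and
// retries on 429, so retries draw from the same ~500/min budget as new
// requests instead of piling on top of it. (Assumes body-less requests
// like GET/HEAD, which are safe to re-send as-is.)
func doRegistryRequest(ctx context.Context, req *http.Request) (*http.Response, error) {
	for {
		if err := registryLimiter.Wait(ctx); err != nil {
			return nil, err
		}
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			return nil, err
		}
		if resp.StatusCode != http.StatusTooManyRequests {
			return resp, nil
		}
		resp.Body.Close()
		// brief pause before looping; Wait above re-applies the rate cap
		select {
		case <-ctx.Done():
			return nil, ctx.Err()
		case <-time.After(time.Second):
		}
	}
}
```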
```diff
 )
-var concurrency = 50
+var concurrency = 1000
```
How does this interact with the rate limiter?
This was implemented initially as a poor man's rate limiter, so now that we have a "real" rate limiter in place, we could technically remove this entirely and let the concurrency be unbounded, but I figured making it high (instead of unlimited) was probably better/safer.

In practice the rate limiter is likely to block many of these goroutines anyhow (and having them wait to make a request vs. waiting to start seems totally fair, since the overhead is really low and they can potentially do other things in the meantime). Also, with the cache implemented, many of the goroutines won't block at all, so it's just "free" concurrency. 👍
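As a rough sketch of how the two mechanisms compose (illustrative names and placeholder lookup logic, not this PR's code): the semaphore only bounds how many lookups are in flight, the limiter bounds how fast uncached lookups actually hit the registry, and cache hits skip the limiter entirely.

```go
package main

import (
	"context"
	"sync"

	"golang.org/x/time/rate"
)

var (
	limiter = rate.NewLimiter(rate.Limit(500.0/60.0), 100)
	cached  = map[string]string{} // image -> digest, loaded from the --cache file
)

// lookup is a placeholder: cache hits return immediately; only misses
// wait on the limiter and touch the registry.
func lookup(ctx context.Context, img string) string {
	if digest, ok := cached[img]; ok {
		return digest
	}
	_ = limiter.Wait(ctx)
	return "sha256:..." // stand-in for the real registry lookup
}

func main() {
	images := []string{"debian:bookworm", "alpine:3.20"}
	sem := make(chan struct{}, 1000) // the high concurrency cap
	var wg sync.WaitGroup
	for _, img := range images {
		img := img
		wg.Add(1)
		sem <- struct{}{} // acquire a slot: bounds in-flight goroutines
		go func() {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			_ = lookup(context.Background(), img)
		}()
	}
	wg.Wait()
}
```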
```go
saveCacheMutex.Lock()
defer saveCacheMutex.Unlock()

if saveCache == nil || cacheFile == "" {
```
Could this be moved before locking the mutex?
The `cacheFile` test could, but the test for `saveCache` cannot, because we cannot safely compare that value without a mutex (even a simple comparison like `nil`).

(This is also effectively the test for "did the user specify `--cache`", and we're specifying `--cache` in all the places we care about now, so this test is effectively as good as a no-op now.)
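A minimal sketch of the distinction being made here, assuming `saveCache` is written concurrently by lookup goroutines while `cacheFile` is set exactly once at startup (`flushCache` and the variable types are hypothetical):

```go
package cache

import "sync"

var (
	cacheFile      string            // set once from --cache before any goroutines start
	saveCache      map[string]string // written concurrently by lookup goroutines
	saveCacheMutex sync.Mutex        // guards saveCache
)

func flushCache() {
	// Safe before the lock: cacheFile never changes after startup.
	if cacheFile == "" {
		return
	}
	saveCacheMutex.Lock()
	defer saveCacheMutex.Unlock()
	// NOT safe before the lock: saveCache is written concurrently, so even
	// a simple nil comparison outside the mutex would be a data race.
	if saveCache == nil {
		return
	}
	// ... serialize saveCache out to cacheFile ...
}
```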
```go
rArches := cache.Arches

if !wasCached {
	fmt.Fprintf(os.Stderr, "NOTE: lookup %s -> %s\n", img, r.Desc.Digest)
```
Is this a debugging statement or intended to remain?
It's a bit of both -- I added it for debugging, but realized that knowing when lookups actually happen was useful enough to leave it in. 😅