Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(iroh): don't hit network for complete blobs&collections #2576

Closed
wants to merge 7 commits into from

Conversation

dignifiedquire
Copy link
Contributor

@dignifiedquire dignifiedquire commented Aug 2, 2024

Description

Two changes to the downloader:

  • Never try to download from ourselves. If the only provider node added is our own node, fail with error "no providers".
  • Check if we have the blob already available locally before starting to open any connections

The second point turned out to be a bit more complicated:

  • For hashseqs (collections) we have to check that we have all children completely available to not start a download
  • We should still emit the FoundLocal events so that the state tracking at the client works the same no matter if an actual download was performed or not.

Both these things are now implemented, through a new function check_local_with_progress_if_complete in iroh_blobs that is called before starting the download. It checks if the blob is fully available locally (and all children too for hashseqs) and if so returns true, and emits the FoundLocal events. No events are emitted if the blob is not completely available, as in that case we start the actual download machinery, which will emit the FoundLocal events too on the go.

Adds three tests:

  • Downloading a missing blob from the local node fails without trying to connect to ourselves
  • Downloading an existing blob succeeds without trying to download
  • Downloading an existing collection succeeds without trying to download

Closes #2575

Notes and open questions

The API around the new function and the exisitng get_to_db could be improved, and we could skip the double-checking, but that would be a more involved change to a public API that I didn't want to make.

Breaking changes

None, only an API addition to the public API of iroh_blobs: iroh_blobs::get::check_local_with_progress_if_complete

@dignifiedquire dignifiedquire added this to the v0.22.0 milestone Aug 2, 2024
Copy link

github-actions bot commented Aug 2, 2024

Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/2576/docs/iroh/

Last updated: 2024-08-02T14:38:15Z

…nd do not attempt to download from ourselves
@Frando Frando force-pushed the fix-download-self branch from 22d5acd to 8845025 Compare August 2, 2024 10:03
@dignifiedquire dignifiedquire marked this pull request as ready for review August 2, 2024 10:16
@dignifiedquire
Copy link
Contributor Author

lgtm, can't approve my own pr 😅

@dignifiedquire dignifiedquire changed the title [WIP] fix(iroh): downloading queued from oneself fix(iroh): downloading queued from oneself Aug 2, 2024
iroh/src/client/blobs.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@matheus23 matheus23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I have limited experience in this part of the code.

One note is that we should document the additions as breaking changes in the Dialer and Getter traits.

@Frando Frando force-pushed the fix-download-self branch from 527bb93 to cc9fa77 Compare August 2, 2024 14:28
@Frando
Copy link
Member

Frando commented Aug 2, 2024

As discussed in Discord, the previous fix was incorrect as it would report a succeeding download for collections if the root blob is complete even if we had no or only some of the children available locally.

I added two commits that fix this:

  • We now check if a blob is complete (with all children for hashseqs) before starting a download. We still emit the expected FoundLocal events in that case.
  • A test for downloading a collection which exists locally, succeeding without hitting the network

Also updated the PR description.

Please re-review :)

@Frando Frando force-pushed the fix-download-self branch from cc9fa77 to ccf15d4 Compare August 2, 2024 14:32
@Frando Frando requested a review from matheus23 August 2, 2024 14:34
@Frando Frando changed the title fix(iroh): downloading queued from oneself fix(iroh): don't hit network for complete blobs&collections Aug 2, 2024

let mut children: Vec<Hash> = vec![];
while let Some(hash) = hash_seq.next().await? {
children.push(hash);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why collect first, instead of directly doing the db checks?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because with the PR as-is, we may only emit events here if everything is complete. Otherwise the same events would be emitted again when actually starting the download.

This is suboptimal for sure. See #2586 for a revised approach that does not need to collect events by using a generator and turning get_to_db into a two-step process.

},
)
.await?
.await
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should probably check the events in these tests, to make sure they look like we expect them, especially in the collection case

@dignifiedquire
Copy link
Contributor Author

closing in favor of #2586

github-merge-queue bot pushed a commit that referenced this pull request Aug 5, 2024
…are complete (#2586)

## Description

Two changes to the downloader:

* Never try to download from ourselves. If the only provider node added
is our own node, fail with error "no providers".
* The actual download request flow is turned into a generator (while
keeping API compatibility for the existing `get_to_db` public function).
A new `get_to_db_in_steps` function either runs to completion if the
requested data is fully available locally, or yields a `NeedsConn`
struct at the point where it needs a network connection to proceed. The
`NeedsConn` has an `async proceed(self, conn: Connection)`, which must
be called with a connection for the actual download to start.

This two-step process allows the downloader to check if we should dial
nodes at all, or are already done without doing anything, while emitting
the exact same flow of events (because we run the same loop) to the
client.

To achieve this, `get_to_db` now uses a genawaiter generator internally.
This means that the big loop that is the iroh-blobs protocol request
flow does not have to be changed at all, only that instead of a closure
we yield and resume, which makes this much easier to integrate into an
external state machine like the downloader.

The changes needed for this for the downloader are a bit verbose because
the downloader itself is generic over a `Getter`, with impls for the
actual impl and a test impl that does not use networking; therefore the
new `NeedsConn` state has to be modeled with an additional associated
type and trait here.

This PR adds three tests:

* Downloading a missing blob from the local node fails without trying to
connect to ourselves
* Downloading an existing blob succeeds without trying to download
* Downloading an existing collection succeeds without trying to download

Closes #2575
Replaced #2576

## Notes and open questions



## Breaking changes

None, only an API addition to the public API of iroh_blobs:
`iroh_blobs::get::check_local_with_progress_if_complete`

---------

Co-authored-by: dignifiedquire <me@dignifiedquire.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Queued blob download from oneself fails
3 participants