Currently, we use the following steps for retrieving data from Filecoin when we lack a CID in the local cache:
1. Query the indexer/Estuary.
2. For every result returned, query each individual provider in parallel, but wait for all responses to return.
3. Retrieve sequentially, in the order given by a sorting function.
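For reference, the current flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Candidate`, `QueryResponse`, `queryAll`, `retrieveSorted`, and the price-only sort are hypothetical stand-ins, not the project's actual types or sorting function.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"sync"
)

// Candidate and QueryResponse are hypothetical stand-ins for the real types.
type Candidate struct {
	Provider string
}

type QueryResponse struct {
	Candidate Candidate
	Price     uint64 // assumed total price; 0 means free
	Err       error  // non-nil if the query itself failed
}

var errAllFailed = errors.New("all retrievals failed")

// queryAll queries every candidate in parallel and blocks until all of them
// have answered, which is the waiting step the issue wants to shorten.
func queryAll(cands []Candidate, query func(Candidate) QueryResponse) []QueryResponse {
	out := make([]QueryResponse, len(cands))
	var wg sync.WaitGroup
	for i, c := range cands {
		wg.Add(1)
		go func(i int, c Candidate) {
			defer wg.Done()
			out[i] = query(c) // each goroutine writes only its own slot
		}(i, c)
	}
	wg.Wait()
	return out
}

// retrieveSorted drops failed queries, sorts the rest cheapest first (an
// assumed sorting function), and tries providers one at a time.
func retrieveSorted(resps []QueryResponse, retrieve func(Candidate) error) (string, error) {
	ok := make([]QueryResponse, 0, len(resps))
	for _, r := range resps {
		if r.Err == nil {
			ok = append(ok, r)
		}
	}
	sort.SliceStable(ok, func(a, b int) bool { return ok[a].Price < ok[b].Price })
	for _, r := range ok {
		if err := retrieve(r.Candidate); err == nil {
			return r.Candidate.Provider, nil
		}
	}
	return "", errAllFailed
}

func main() {
	cands := []Candidate{{"f01"}, {"f02"}, {"f03"}}
	prices := map[string]uint64{"f01": 5, "f02": 0, "f03": 2}
	query := func(c Candidate) QueryResponse {
		return QueryResponse{Candidate: c, Price: prices[c.Provider]}
	}
	retrieve := func(c Candidate) error {
		if c.Provider == "f02" {
			return errors.New("unreachable") // simulate a failed retrieval
		}
		return nil
	}
	winner, err := retrieveSorted(queryAll(cands, query), retrieve)
	fmt.Println(winner, err)
}
```

Note that no retrieval can start until the slowest provider has answered its query, even when a free provider responded first.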
There are a couple of ways we can optimize this:
- The Filecoin indexer should, at minimum, include in its results whether the deal is verified. We can treat a verified deal as a proxy for "likely free" and skip the query. The results also contain the PieceCID, which we could pass along so the provider does not have to look it up.
- If we start getting query responses that meet the best criteria in our sorting function (anything that's free, for example), we could kick off our first retrieval right away and sort the remaining responses as they come in.
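The second idea could look something like the sketch below: responses are consumed from a channel as they arrive, a free response triggers a retrieval immediately, and everything else is held back and sorted only if needed. `QueryResponse`, `retrieveEager`, and the "price is zero" criterion are assumptions for illustration, not existing code.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// QueryResponse is a hypothetical, simplified query result.
type QueryResponse struct {
	Provider string
	Price    uint64 // assumed total price; 0 means free
}

var errAllFailed = errors.New("all retrievals failed")

// retrieveEager reads responses as they arrive. A free response is tried
// right away; paid responses are kept in pending and only sorted and tried
// once the channel is closed and no free retrieval has succeeded.
func retrieveEager(resps <-chan QueryResponse, retrieve func(provider string) error) (string, error) {
	var pending []QueryResponse
	for r := range resps {
		if r.Price == 0 {
			if retrieve(r.Provider) == nil {
				return r.Provider, nil // done; later responses are ignored
			}
			continue // free provider failed, keep listening
		}
		pending = append(pending, r)
	}
	sort.SliceStable(pending, func(a, b int) bool { return pending[a].Price < pending[b].Price })
	for _, r := range pending {
		if retrieve(r.Provider) == nil {
			return r.Provider, nil
		}
	}
	return "", errAllFailed
}

func main() {
	// A buffered channel sized to the number of candidates lets query
	// goroutines finish sending even after an early return.
	resps := make(chan QueryResponse, 3)
	resps <- QueryResponse{"f01", 5}
	resps <- QueryResponse{"f02", 0}
	resps <- QueryResponse{"f03", 2}
	close(resps)

	var tried []string
	winner, err := retrieveEager(resps, func(p string) error {
		tried = append(tried, p)
		return nil
	})
	fmt.Println(winner, err, tried)
}
```

In this simplified version the loop does not read new responses while a retrieval is in flight; a fuller design might run the retrieval in its own goroutine so that sorting continues in the background.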
One other thing to factor in is how we want to abstract the additional data returned by the indexer that Estuary doesn't provide (I think). We should think about this problem in general, since, for example, Estuary can have a different "Root CID" while the index is always the same.
We could add a field to `RetrievalCandidate`, `PossiblyFree bool` or something along those lines. Since `RetrievalCandidate` is returned by the indexer impl, it would be no issue to write endpoint-specific behavior. If Estuary isn't able to provide the info, the Estuary endpoint could simply always set `PossiblyFree` to false, while the indexer endpoint impl would be able to set it properly.
We could immediately attempt retrievals on all of the `PossiblyFree == true` candidates with pre-assumed retrieval params, and only if all of those fail, fall back to query + retrieval as is currently done.
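A minimal sketch of that proposal, under assumptions: only `RetrievalCandidate` and `PossiblyFree` come from the discussion above. The `Provider` field, the `fromEstuary` and `fromIndexer` helpers, and `retrieveWithFallback` are hypothetical names, and trying the free candidates one at a time is just one possible choice.

```go
package main

import (
	"errors"
	"fmt"
)

// RetrievalCandidate here carries only what the sketch needs. The Provider
// field is an assumption; the real struct may look different.
type RetrievalCandidate struct {
	Provider     string
	PossiblyFree bool
}

var errAllFailed = errors.New("all retrievals failed")

// fromEstuary models an endpoint that cannot tell whether a deal is verified.
func fromEstuary(provider string) RetrievalCandidate {
	return RetrievalCandidate{Provider: provider, PossiblyFree: false}
}

// fromIndexer models an endpoint that knows the verified flag and uses it
// as a proxy for "likely free".
func fromIndexer(provider string, verified bool) RetrievalCandidate {
	return RetrievalCandidate{Provider: provider, PossiblyFree: verified}
}

// retrieveWithFallback first tries every PossiblyFree candidate without a
// query (direct stands for a retrieval with pre-assumed params). Only if all
// of those fail does it hand the full list to the query + retrieval path.
func retrieveWithFallback(
	cands []RetrievalCandidate,
	direct func(RetrievalCandidate) error,
	queryThenRetrieve func([]RetrievalCandidate) (string, error),
) (string, error) {
	for _, c := range cands {
		if !c.PossiblyFree {
			continue
		}
		if direct(c) == nil {
			return c.Provider, nil
		}
	}
	return queryThenRetrieve(cands)
}

func main() {
	cands := []RetrievalCandidate{
		fromEstuary("f01"),
		fromIndexer("f02", true),
		fromIndexer("f03", false),
	}
	direct := func(c RetrievalCandidate) error { return nil }
	slowPath := func([]RetrievalCandidate) (string, error) { return "", errAllFailed }

	winner, err := retrieveWithFallback(cands, direct, slowPath)
	fmt.Println(winner, err)
}
```

When no candidate is marked `PossiblyFree`, or all direct attempts fail, the behavior is the same as the existing query + retrieval flow.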