Optimizing Filecoin Retrieval TTFB #102

hannahhoward · 2022-07-19T23:59:06Z

Currently, we use the following steps for retrieving data from Filecoin when we lack a CID in the local cache:

1. Query the indexer/Estuary.
2. For every result returned, query each individual provider in parallel, but wait for all results to return.
3. Retrieve sequentially based on a sorting function.

There are a couple of ways we can optimize this:

- The Filecoin indexer should, at minimum, include in its results whether each deal is verified. We can treat a verified deal as a proxy for "likely free" and skip the query. The results also contain the PieceCID, which we could pass along so the provider does not have to look it up.
- If query responses start coming back that meet the best criteria in our sorting function (anything that's free, for example), we could kick off our first retrieval right away and sort the remaining responses as they come in.
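
A rough sketch of the second idea, to make the control flow concrete. Everything here is hypothetical: `QueryResponse` is a simplified stand-in (not the real query response type), "best case" is assumed to mean a price of zero, and `firstGoodEnough` is an illustrative name. It only shows how we could stop waiting for all responses once one is good enough.

```go
package main

import "sort"

// QueryResponse is a hypothetical, simplified stand-in for a provider's
// answer to a retrieval query. The real type carries more fields.
type QueryResponse struct {
	Provider     string
	PricePerByte uint64 // assumption: 0 means the provider serves the data for free
}

// isBestCase reports whether a response already meets the top criterion of
// the sorting function. Here that is assumed to be "free".
func isBestCase(r QueryResponse) bool {
	return r.PricePerByte == 0
}

// sortByPrice orders responses cheapest first, keeping arrival order for ties.
func sortByPrice(rs []QueryResponse) {
	sort.SliceStable(rs, func(i, j int) bool {
		return rs[i].PricePerByte < rs[j].PricePerByte
	})
}

// firstGoodEnough reads responses as they arrive and returns as soon as one
// is a best case, instead of waiting for every provider to answer.
// Responses seen before that point come back in pending, sorted cheapest
// first, so the caller can fall back to them if the first retrieval fails.
// Responses that have not been read yet stay on the channel.
// ok is false if the channel closed without any best-case response.
func firstGoodEnough(responses <-chan QueryResponse) (best QueryResponse, pending []QueryResponse, ok bool) {
	for r := range responses {
		if isBestCase(r) {
			sortByPrice(pending)
			return r, pending, true
		}
		pending = append(pending, r)
	}
	sortByPrice(pending)
	return QueryResponse{}, pending, false
}
```

The caller would start retrieving from `best` immediately and keep draining the channel in the background, merging late responses into `pending` in case the first attempt fails.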

One other thing to factor in is how we want to abstract the additional data returned by the indexer that doesn't come from Estuary (I think). Honestly, we should think about this problem in general, since, for example, Estuary can have a different "Root CID" while the index is always the same.

elijaharita · 2022-07-22T14:55:53Z

we could add a field to RetrievalCandidate, PossiblyFree bool or something along those lines. since RetrievalCandidate is returned by the indexer impl, it would be no issue to write endpoint-specific behavior. if estuary isn't able to provide the info, it would be as simple as having the estuary endpoint always set PossiblyFree to false. the indexer endpoint impl would be able to set it properly.

we could immediately just attempt retrievals on all of the PossiblyFree == true candidates with pre-assumed retrieval params, and only if all of those fail, fall back to query + retrieval like what's currently done.
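
roughly what i have in mind, as a sketch only. the struct below is a trimmed-down, hypothetical version of RetrievalCandidate (the real one has different fields), and `retrieveFunc`, `retrieveOptimistically` and `errAllFailed` are made-up names. it also tries candidates one at a time for simplicity, where the real thing would probably attempt them concurrently.

```go
package main

import "errors"

// RetrievalCandidate is a simplified, hypothetical version of the real
// struct, reduced to what this sketch needs.
type RetrievalCandidate struct {
	Provider     string
	PossiblyFree bool // proposed field: set by the indexer endpoint, always false for estuary
}

// retrieveFunc attempts a retrieval from one candidate and reports failure
// through its error. Both strategies below share this shape.
type retrieveFunc func(c RetrievalCandidate) error

var errAllFailed = errors.New("all retrieval attempts failed")

// retrieveOptimistically first tries every PossiblyFree candidate with
// direct, which is assumed to skip the query and use pre-assumed retrieval
// params. Only if all of those fail does it fall back to
// queryThenRetrieve, the existing query + retrieval path, over all
// candidates. It returns the candidate that succeeded.
func retrieveOptimistically(
	candidates []RetrievalCandidate,
	direct, queryThenRetrieve retrieveFunc,
) (RetrievalCandidate, error) {
	// Pass 1: optimistic attempts without a query round trip.
	for _, c := range candidates {
		if !c.PossiblyFree {
			continue
		}
		if err := direct(c); err == nil {
			return c, nil
		}
	}
	// Pass 2: fall back to the current behavior for every candidate.
	for _, c := range candidates {
		if err := queryThenRetrieve(c); err == nil {
			return c, nil
		}
	}
	return RetrievalCandidate{}, errAllFailed
}
```

the nice part is that the fallback path stays exactly what it is today, so a wrong PossiblyFree guess only costs one failed attempt.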

hannahhoward added this to Bedrock Tornado Team OLD Jul 22, 2022

hannahhoward assigned rvagg Jul 22, 2022

hannahhoward moved this to Backlog in Bedrock Tornado Team OLD Jul 22, 2022

hannahhoward moved this from Backlog to In Progress in Bedrock Tornado Team OLD Aug 5, 2022

hannahhoward moved this from In Progress to Backlog in Bedrock Tornado Team OLD Aug 6, 2022
