feat: implement search_transactions_before and search_transactions_after #13621

Open · wants to merge 11 commits into main

Conversation


@caglaryucekaya caglaryucekaya commented Jan 2, 2025

Will close #13499

@caglaryucekaya caglaryucekaya (Contributor, Author) commented

@mattsse I have some questions. In the issue you wrote that we should enforce a page size limit of something like 100 blocks. However, according to the documentation the page size argument is not a number of blocks but a number of transactions, so we can put the limit on that instead.

One problem with my current implementation is that it processes blocks one by one until all the traces are fetched, because we don't know in advance in which block we will reach page_size transactions. Since the blocks are not processed in parallel, if the user requests a search starting from the genesis block and the searched addresses only appear much later, the search can take hours.

I will also look into your suggestion to use AccountReader for optimization.
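For reference, a minimal sketch of the sequential shape described above; every name here (Tx, trace_block_for_address) is a hypothetical placeholder, not the actual reth API:

type Tx = u64; // stand-in for a traced transaction

// Placeholder for running the block tracer and filtering by address.
async fn trace_block_for_address(_block: u64, _address: &str) -> Vec<Tx> {
    Vec::new()
}

// Walk forward one block at a time; we cannot know in advance which block
// will finally bring the result count up to page_size, and with no
// parallelism a search that starts at genesis can take hours.
async fn search_sequential(address: &str, start_block: u64, tip: u64, page_size: usize) -> Vec<Tx> {
    let mut results = Vec::new();
    let mut block = start_block;
    while block <= tip && results.len() < page_size {
        results.extend(trace_block_for_address(block, address).await);
        block += 1;
    }
    results
}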

@mattsse mattsse (Collaborator) left a comment

cool, this is a great start

I left a few questions because the page settings are a bit confusing to me

page size argument is not blocks but the number of transactions

I see, then we should perhaps try to perform tracing of multiple blocks in parallel by spawning jobs?

if the user e.g. request the search beginning from the genesis block and the searched addresses appear far later, the search can take hours.

yeah we can definitely look into processing block tracing in parallel after we have the initial draft

let mut txs_with_receipts = TransactionsWithReceipts {
    txs: Vec::default(),
    receipts: Vec::default(),
    first_page: false,

mattsse (Collaborator)

unclear what first_page means

caglaryucekaya (Contributor, Author)

First page means it's the page with the most recent transactions, meaning we have traced up to the tip of the chain.
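A hedged reading of that flag as a tiny helper (names are illustrative, not taken from the codebase):

// The "first page" is the newest one: the search has already traced up to
// the chain tip, so there are no more recent transactions to return.
fn is_first_page(highest_block_traced: u64, chain_tip: u64) -> bool {
    highest_block_traced >= chain_tip
}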

Comment on lines +341 to +342
    from_block: None,
    to_block: None,

mattsse (Collaborator)

I think this is where we'd need to configure the block_number and perhaps the page_size?

because worst case this would trace the entire chain
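A sketch of that concern, assuming a simplified stand-in for the filter type (the real reth type and any fields beyond from_block/to_block are not shown): leaving both bounds as None spans the whole chain, while filling them in per batch keeps each tracing pass bounded.

struct BlockRangeFilter {
    from_block: Option<u64>,
    to_block: Option<u64>,
}

// Bound one batch: trace at most `batch_len` blocks starting at `batch_start`.
fn filter_for_batch(batch_start: u64, batch_len: u64) -> BlockRangeFilter {
    BlockRangeFilter {
        from_block: Some(batch_start),
        to_block: Some(batch_start + batch_len - 1),
    }
}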

caglaryucekaya (Contributor, Author)

So do you think we should put a limit on the number of blocks traced, or on the number of transactions? If we limit the number of transactions, there's still a chance of tracing the entire chain, e.g. if the account has 5 transactions in total and the user requests 10.

mattsse (Collaborator)

unfortunately, we don't have another way to determine this, so this will always be possible. We should do some chunking instead so we limit how many blocks we trace at once, something like a https://docs.rs/futures/latest/futures/stream/struct.FuturesUnordered.html with limited capacity where we push new block tasks
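A minimal sketch of that idea, assuming a hypothetical trace_block helper (none of these names are the actual reth APIs): keep a bounded number of block-tracing futures in a FuturesUnordered, refill it as tasks complete, and stop scheduling once enough matches have been collected. Results arrive in completion order, so a real implementation would also need to keep them ordered by block number.

use futures::stream::{FuturesUnordered, StreamExt};

const MAX_IN_FLIGHT: usize = 16; // illustrative concurrency limit

// Placeholder for spawning the real block-tracing job.
async fn trace_block(_block: u64) -> Vec<u64> {
    Vec::new()
}

async fn search_with_bounded_tracing(start: u64, end: u64, page_size: usize) -> Vec<u64> {
    let mut results = Vec::new();
    let mut blocks = start..=end;
    let mut in_flight = FuturesUnordered::new();

    // Prime the set with the first MAX_IN_FLIGHT blocks.
    for block in blocks.by_ref().take(MAX_IN_FLIGHT) {
        in_flight.push(trace_block(block));
    }

    while let Some(matches) = in_flight.next().await {
        results.extend(matches);
        if results.len() >= page_size {
            break;
        }
        // Refill: schedule the next block as soon as one finishes.
        if let Some(block) = blocks.next() {
            in_flight.push(trace_block(block));
        }
    }
    results
}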

@caglaryucekaya caglaryucekaya (Contributor, Author) Jan 6, 2025

What you mean is processing blocks in batches of something like 100 or 1000, right? E.g. processing 1000 blocks, waiting until they're all complete, and continuing with another 1000 blocks if we haven't reached the page size yet. In that case using try_join_all looks like a better idea since we have to wait for all 1000 blocks to complete anyway. It's also how it's done in trace_filter.
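A rough sketch of that batch shape with try_join_all, using placeholder names (not the actual trace_filter code): trace a fixed-size chunk of blocks concurrently, wait for the whole chunk, then decide whether to continue.

use futures::future::try_join_all;

// Placeholder for the real per-block tracing future.
async fn trace_block(block: u64) -> Result<Vec<u64>, ()> {
    let _ = block;
    Ok(Vec::new())
}

async fn search_in_batches(start: u64, end: u64, page_size: usize) -> Result<Vec<u64>, ()> {
    const BATCH: u64 = 1000;
    let mut results = Vec::new();
    let mut from = start;
    while from <= end && results.len() < page_size {
        let to = (from + BATCH - 1).min(end);
        // Every block in the batch is traced concurrently and awaited together.
        let batch = try_join_all((from..=to).map(trace_block)).await?;
        results.extend(batch.into_iter().flatten());
        from = to + 1;
    }
    Ok(results)
}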

@caglaryucekaya caglaryucekaya (Contributor, Author) Jan 6, 2025

On second thought, FuturesUnordered will be useful

caglaryucekaya (Contributor, Author)

Okay, I changed the implementation to use FuturesUnordered and it works much faster now. I used 1000-block batches for now but we can change that. I think we can also put a limit on page_size to prevent requesting an unreasonable number of transactions. What do you think would be a good number for that?

@github-actions github-actions bot added the S-stale This issue/PR is stale and will close with no further activity label Jan 31, 2025
@github-actions github-actions bot closed this Feb 7, 2025
@mattsse mattsse added C-enhancement New feature or request A-rpc Related to the RPC implementation M-prevent-stale Prevents old inactive issues/PRs from being closed due to inactivity and removed S-stale This issue/PR is stale and will close with no further activity labels Feb 7, 2025
@mattsse mattsse reopened this Feb 7, 2025

sealer3 commented Feb 11, 2025

@mattsse I have some questions. In the issue you wrote that we should enforce a page size limit of something like 100 blocks. However, according to the documentation the page size argument is not a number of blocks but a number of transactions, so we can put the limit on that instead.

One problem with my current implementation is that it processes blocks one by one until all the traces are fetched, because we don't know in advance in which block we will reach page_size transactions. Since the blocks are not processed in parallel, if the user requests a search starting from the genesis block and the searched addresses only appear much later, the search can take hours.

I will also look into your suggestion to use AccountReader for optimization.

Note the gotcha from the Otterscan RPC API docs on this one:

There is a small gotcha regarding pageSize. If there are fewer results than pageSize, they are just returned as is.

But if there are more than pageSize results, they are capped by the last found block. For example, let's say you are searching for Uniswap Router address with a pageSize of 25, and it already found 24 matches. It then looks at the next block containing this address's occurrences and there are 5 matches inside the block. They are all returned, so it returns 30 transaction results. The caller code should be aware of this.

And feel free to reference the Anvil implementation, which was correct (at least the last time I went through the code): https://github.com/foundry-rs/foundry/blob/master/crates/anvil/src/eth/otterscan/api.rs

The default Otterscan page size is 25, so make sure not to set a limit below that.
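For illustration only, a tiny sketch of the capping rule described in the gotcha above (not taken from any of the implementations mentioned): the loop only stops at a block boundary, so the block that crosses page_size is still returned in full and the response may contain more than page_size transactions.

// `blocks` holds, per block, the matching transactions found in it.
fn collect_with_gotcha(blocks: &[Vec<u64>], page_size: usize) -> Vec<u64> {
    let mut results = Vec::new();
    for block_matches in blocks {
        if results.len() >= page_size {
            break; // stop only between blocks, never in the middle of one
        }
        results.extend(block_matches.iter().copied());
    }
    results
}

With page_size = 25 and 24 matches already collected, a next block containing 5 matches is still appended whole, so the caller gets more than 25 results back.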

@caglaryucekaya caglaryucekaya (Contributor, Author) commented

Note the gotcha from the Otterscan RPC API docs on this one:

I was aware of the gotcha and implemented accordingly.

And feel free to reference the Anvil implementation, which was correct (at least the last time I went through the code): https://github.com/foundry-rs/foundry/blob/master/crates/anvil/src/eth/otterscan/api.rs

The default Otterscan page size is 25, so make sure not to set a limit below that.

I wasn't aware of the Foundry implementation; I will check it and make changes if necessary. And I will use a limit above 25 transactions. Thank you very much!

@caglaryucekaya caglaryucekaya (Contributor, Author) commented

@sealer3 Do you know why the Anvil implementation only considers post-fork blocks?

…plement-search-transactions-before-and-after

sealer3 commented Feb 13, 2025

According to the original pull request foundry-rs/foundry#5414, the stated reason is to be able to iterate through all blocks in memory for ots_searchTransactionsBefore, ots_searchTransactionsAfter, ots_getContractCreator, and other Otterscan RPC methods that would otherwise require special indexing from the node.

Since Anvil is designed to work on any backend node (for forking), it would make sense for them not to have to make potentially thousands of RPC requests to a backend node that might be far away on a remote machine to gather the required information for a response.

@caglaryucekaya caglaryucekaya (Contributor, Author) commented

@mattsse I also added the implementation for search_transactions_before. This is waiting for your review now.

Labels
A-rpc Related to the RPC implementation
C-enhancement New feature or request
M-prevent-stale Prevents old inactive issues/PRs from being closed due to inactivity
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement searchTransactionsBefore and searchTransactionsAfter
3 participants