Add TrieProof library #5826
Co-authored-by: Hadrien Croubois <hadrien.croubois@gmail.com>
Co-authored-by: Arr00 <13561405+arr00@users.noreply.github.com>
Co-authored-by: Ernesto García <ernestognw@gmail.com>
I went ahead with this comment and removed the |
Looks good, but I think we need to discuss one last thing.
There's this PoC that shows it's possible to construct two RLP items such that keccak256(rlp.encode(r1)) == bytes32(rlp.encode(r2)), where r1 is a long item (length >= 33) and r2 is a short item (length < 32). This means that a proof can be forged for r1 if r2 exists, and vice versa. Although the attack is very unlikely, since it apparently requires a branch node close to the leaf's key, it may justify exposing a variant of verify called secureVerify that rehashes the key. This is what the SecureMerkleTrie library from Optimism does:
function secureVerify(
    bytes memory value,
    bytes32 root,
    bytes memory key,
    bytes[] memory proof
) internal pure returns (bool) {
    return verify(value, root, abi.encodePacked(keccak256(key)), proof);
}

As far as I understand, both the verify and secureVerify variants would be useful: the former would serve to verify against a transactionsRoot or a receiptsRoot, while storageRoot and stateRoot use the secure variant (this is why we hash the storage address and the slot in the tests), so it seems worth keeping both.
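For illustration, here is a minimal sketch of the key preparation that the secure variant implies. It assumes the state trie is keyed by the keccak256 hash of the 20-byte account address and each storage trie by the keccak256 hash of the 32-byte slot. The library name TrieKeySketch and its function names are hypothetical and not part of this PR.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical helper, for illustration only: builds the hashed keys
/// that "secure" tries (state and storage) are assumed to use.
library TrieKeySketch {
    /// Key for an account in the state trie: keccak256 of the 20-byte address.
    function accountKey(address account) internal pure returns (bytes memory) {
        return abi.encodePacked(keccak256(abi.encodePacked(account)));
    }

    /// Key for a slot in an account's storage trie: keccak256 of the 32-byte slot.
    function storageKey(bytes32 slot) internal pure returns (bytes memory) {
        return abi.encodePacked(keccak256(abi.encodePacked(slot)));
    }
}
```

Either result would be passed as the key argument to verify, which is equivalent to passing the unhashed bytes to the secureVerify wrapper quoted above.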
My understanding is that the choice of key is not ours to make. It's really the tree-building process that decides what key is used, and the verifier must follow the same mechanism; otherwise they won't be able to generate a proof.
In all cases it's the same function that is used, just with different input (that you prepare differently). I'm not sure I see the point in having wrappers that do that preparation. If the preparation is not done correctly, there should not be a proof, and the verification should always fail. If having a |
To be clear, I initially thought the However, I found some research suggesting that the keys are hashed for storage proofs and account proofs to avoid arbitrary disk I/O, so rehashing isn't really a concern for on-chain verification. Still, for transaction or event proofs I understand RLP forgery is theoretically possible; it's just that the keys are too low and sequential.
Yeah, I was trying to avoid users making a mistake rather than "preparing" the verification, but on second thought I agree there's no point if there's no relevant security consideration.
Having specific verify functions may be a good idea but I only see benefit in documenting how to obtain and process each type of proof. For that case I'd rather make a guide in the docs 🤔 |
For the transaction and receipt tries, the keys are just the index in the block (starting at 0). That makes for overall "shorter" (as in not too deep) trees. We know all values from 0 to N are populated, so there is no serious "unbalancing of the tree" here.

For the storage (within an account), the slots are controlled by the "user" (the compiler, actually). It's easy to populate many consecutive slots at a pseudo-random location. That is what happens when you write a long string to storage. Hashing the storage slot randomizes all that in the tree, making it more balanced.

For the state tree, I guess the logic is similar to the storage tree, though I don't really understand why. Addresses are hashes already, so I'm not sure what this is preventing. Maybe it's for dealing with consecutive/close addresses that are the result of vanity generation, but an attacker could just as well use a vanity generator that makes the images of addresses "close" to try to unbalance the tree. Still, doing the hash makes the state tree have a depth of 32 bytes instead of 20 bytes, which feels counterproductive. Maybe a core dev could explain to us why it's done that way.
We don't even know how to generate the proofs for transactions and events. AFAIK the easiest way might be to fetch the entire block (with all txs and events) and rebuild the trees from that. I feel like whoever is going to use this function to prove events likely knows more about this process than we do, so having mid-documentation (as in documentation that is not perfect) is probably useless. I'd just point to the EIP that standardizes that (or the yellow paper).
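A minimal sketch of how the index keys mentioned above could be built, assuming (per the yellow paper) that the transaction and receipt tries are keyed by the RLP encoding of the index, with no hashing. The library name IndexKeySketch and the function name indexKey are hypothetical and not part of this PR.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical helper, for illustration only: RLP-encodes a transaction
/// or receipt index so it can be used as a (non-hashed) trie key.
library IndexKeySketch {
    function indexKey(uint256 index) internal pure returns (bytes memory) {
        // RLP encodes the integer 0 as the empty byte string, i.e. 0x80.
        if (index == 0) return hex"80";
        // A single byte below 0x80 is its own RLP encoding.
        if (index < 0x80) return abi.encodePacked(uint8(index));

        // Count the bytes of the minimal big-endian representation.
        uint256 len = 0;
        for (uint256 tmp = index; tmp != 0; tmp >>= 8) ++len;

        // Short-string prefix (0x80 + length) followed by the big-endian bytes.
        bytes memory out = new bytes(len + 1);
        out[0] = bytes1(uint8(0x80 + len));
        for (uint256 i = 0; i < len; ++i) {
            // Explicit uint8 conversion truncates to the lowest byte.
            out[len - i] = bytes1(uint8(index >> (8 * i)));
        }
        return out;
    }
}
```

Under these assumptions, index 0 maps to 0x80, index 1 to 0x01, index 128 to 0x8180, and index 256 to 0x820100, which illustrates why these keys stay short and sequential.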
Co-authored-by: Ernesto García <ernestognw@gmail.com>
Co-authored-by: Hadrien Croubois <hadrien.croubois@gmail.com>
Follow up from #5680
PR Checklist
npx changeset add)