From d1110f735f2ab6cf5454d37092e2d60a5af9b561 Mon Sep 17 00:00:00 2001
From: entreprenerd
Date: Tue, 13 May 2025 10:21:41 +0200
Subject: [PATCH] feat: asm and testing

---
 src/evm/assembly_contracts.md | 203 ++++++++++++++++++++++++++++++++++
 src/testing/initial.MD        |  42 +++++++
 2 files changed, 245 insertions(+)
 create mode 100644 src/evm/assembly_contracts.md
 create mode 100644 src/testing/initial.MD

diff --git a/src/evm/assembly_contracts.md b/src/evm/assembly_contracts.md
new file mode 100644
index 0000000..f6a06b0
--- /dev/null
+++ b/src/evm/assembly_contracts.md
@@ -0,0 +1,203 @@
+# Writing pure assembly smart contracts
+
+Solidity is great
+
+Yul with Solidity is pretty straightforward
+
+However, in order to advance our reverse engineering capabilities we need to go straight to the source
+
+Writing assembly is not particularly difficult per se
+
+However, writing, debugging, deploying and testing pure assembly smart contracts is a daunting task
+
+These are notes I'm taking as I'm learning about the underlying functionality of the EVM as a means to help us write better
+
+# Best Tools
+
+As of today the best tool available is evm.codes
+
+In the future we may release a better tool, which preserves comments, and replays execution, allowing for a more debuggable experience
+
+Fundamental issues with evm.codes
+- Inability to preview deployment state -> Harder to debug deployment
+- Inability to test calls -> Harder to debug dispatch and functions
+- Inability to preserve comments when converting between bytecode and mnemonic
+
+# Rough Idea
+
+Writing a contract requires:
+- Writing the contract logic
+- Initializing and returning the data
+
+## Example Contract
+
+```
+// Set the state
+PUSH5 0x47616c6c6f
+PUSH1 0x00
+MSTORE
+PUSH1 0x20
+PUSH1 0x00
+RETURN
+```
+
+## Initialization Code Pattern
+
+```
+// Correct initialization code pattern
+PUSH1 // Size of runtime code
+PUSH1 // Offset where runtime code begins in the full bytecode
+PUSH1 // Destination offset in memory (0)
+CODECOPY // Copy runtime code to memory
+
+PUSH1 // Size for return
+PUSH1 // Offset for return
+RETURN // Return bytes from memory as the runtime code
+```
+
+
+## On Writing assembly
+
+Annoying issues:
+- You have to pass arguments in reverse
+- Everything is an unsigned value
+- Something about padding / encoding, which I don't really know about | This is probably a massive gotcha waiting to happen
+
+---
+
+## Writing our first Contract
+
+Let's write a contract that returns something
+
+The key opcode we want to target is `RETURN`
+
+This opcode requires 2 parameters, the offset and the size
+
+We'd write the operation in Yul in this way:
+
+```solidity
+    return(offset, size)
+```
+
+The EVM is a stack machine: it pops items off the stack as it uses them
+
+The items are passed to various opcodes as parameters
+
+The gotcha is that because the EVM is a stack machine, the item at the top of the stack is treated as the first argument of the opcode
+
+Meaning we have to push arguments in reverse order
+
+
+We can start by getting some random string, let's say "Recon"
+
+We can convert it to UTF-8 and get some bytes
+52 65 63 6F 6E
+
+That's 5 bytes, which give us the following value to use in the mnemonic form
+
+0x5265636F6E
+
+If we instead wanted to work with direct bytecode we'd remove the 0x, as bytecode is implicitly written in hex
+
+There's probably some additional set of conversions that are done by default by our OS (since the characters 01 don't actually mean 01 in binary); I'm going to ignore this
+
+But this is a good reason to test contracts, as the wrong encoding can result in bugs even if you follow first principles
+
+As discussed we need to return(offset, size)
+
+This means we have to put the value `0x5265636F6E` into memory first
+
+Let's do that
+
+### Putting Recon into Memory
+
+[TODO] I'm not fully clear as to how memory works in the EVM as of now
+
+My understanding is that memory is a sequential store of data
+
+In which `MSTORE` stores one word (32 bytes) at a time
+
+So even though we only want to store 5 bytes, we'll have to use a full word
+
+
+The code we want to write looks like this:
+
+```
+mstore(0, 0x5265636F6E) // mstore(0, "Recon")
+```
+
+We have to reverse the order so:
+Push to Stack 0x5265636F6E
+Push to Stack 0x00
+MSTORE
+
+We can write it as follows:
+```
+PUSH5 0x5265636F6E
+PUSH1 0x00
+MSTORE
+```
+
+### Returning Recon from Memory
+
+Given that we have set the value "Recon" in memory
+
+```
+return(0, 0x20) // return(offset 0, one full 32-byte word)
+```
+
+We can now return it
+
+```
+PUSH1 0x20
+PUSH1 0x00
+RETURN
+```
+
+Since mstore stores an entire word, we return the full word (32 bytes)
+
+The value sits in the last 5 bytes of the word, left-padded with zeros, which is how ABI encoding lays out a `uint256` (not a `string`)
+
+[TODO] Although I'm not 100% confident in this
+
+-------
+
+## Testing the contract |TODO
+
+We can test this contract by using evm.codes
+
+The UX is pretty good at this point
+
+## Deploying
+
+Deploying is easy, but deploying what we have so far is going to fail
+
+We need to add initialization code
+
+## Writing initialization code
+
+You need to calculate the length of the initialization code
+
+Offset by that length, since the runtime code starts right after the initialization code
+
+Calculate the length of the contract (runtime) code
+
+Add that
+
+Then decide which memory region to store it in
+
+Then return that
+
+Basically you're doing some stuff, and then returning the entire length of the contract bytecode
+
+
+
+# TODO: Experiments
+
+- Call to a random address, that requires no parameter and returns nothing, that does change storage so we can prove it worked
+
+- Real / Proper way to work with strings (padding, null terminators)
+
+- Call an ERC20 and transfer it | All hardcoded for simplicity
+
+- Working with Immutable parameters passed to the constructor
\ No newline at end of file
diff --git a/src/testing/initial.MD b/src/testing/initial.MD
new file mode 100644
index 0000000..7035a8a
--- /dev/null
+++ b/src/testing/initial.MD
@@ -0,0 +1,42 @@
+# Testing
+
+The most important aspect of testing is speed of iteration
+
+Do everything you can to make that faster
+
+- Use cached values (e.g. fork-block)
+- Use the faster tool
+
+Always ask yourself: "Am I doing this in the easiest way possible?"
+
+Do not accept overly complex practices; slowing your speed of iteration down will kill your results
+
+
+## Fork Testing with Foundry
+
+```solidity
+    function setUp() public virtual {
+        // Select a specific block number to maintain the same state; when we want to test different scenarios, we can have various forkIds
+        string memory RPC_URL = vm.envString("RPC_URL"); // RPC URL to fork, ideally Anvil with an instance of Berachain testnet
+        console2.log("Forking from: %s", RPC_URL);
+        bool USE_RPC_CACHE = vm.envOr("RPC_CACHE_BLOCK", false);
+        if (USE_RPC_CACHE) {
+            uint256 RPC_BLOCK = vm.envUint("RPC_BLOCK");
+            console2.log("At block %s", RPC_BLOCK);
+            forkId = vm.createSelectFork(RPC_URL, RPC_BLOCK);
+        } else {
+            forkId = vm.createSelectFork(RPC_URL);
+        }
+    }
+```
+
+## Writing Assertions
+
+Always write an assertion with an accompanying description string
+
+No matter how simple the assertion is, these can stack onto each other
+
+Figuring out which assertion failed when you have 50 is an exercise in futility
+
+Add a unique string, no matter how basic (e.g. `deposit 1`)
+So that you can grep the codebase
\ No newline at end of file
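A minimal sketch of the labelled-assertion practice from the "Writing Assertions" notes, assuming a Foundry project with forge-std installed. `MockVault` and the test names are hypothetical, invented only for illustration; the only library call used is forge-std's `assertEq` overload that takes a trailing message string.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

import {Test} from "forge-std/Test.sol";

// Hypothetical contract used only for illustration; it is not part of the patch.
contract MockVault {
    mapping(address => uint256) public balanceOf;
    uint256 public totalDeposits;

    function deposit(uint256 amount) external {
        balanceOf[msg.sender] += amount;
        totalDeposits += amount;
    }
}

contract LabeledAssertionsTest is Test {
    MockVault vault;

    function setUp() public {
        vault = new MockVault();
    }

    function test_deposit() public {
        vault.deposit(100);
        // Every message is unique, so a failing assertion can be found
        // by searching the codebase for its exact string.
        assertEq(vault.balanceOf(address(this)), 100, "deposit 1");
        assertEq(vault.totalDeposits(), 100, "deposit 2");

        vault.deposit(50);
        assertEq(vault.balanceOf(address(this)), 150, "deposit 3");
        assertEq(vault.totalDeposits(), 150, "deposit 4");
    }
}
```

If one of these fails, the failure output should include the label, and something like `grep -rn "deposit 3" test/` (assuming tests live under `test/`) points straight at the line.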