EVM byte code ops guide
A guide to EVM byte code operations
The Ethereum Virtual Machine (EVM) has its own instruction set: a list of opcodes, each encoded as a single byte. Knowing them is useful if you need to decompile a smart contract and figure out what it does.
How EVM works
- The EVM is stack-based: instructions push and pop values on a last in, first out (LIFO) stack.
- EVM bytecode is the machine language that gets executed in the EVM.
- Solidity compiles to EVM bytecode.
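To make the "last in, first out" point concrete, here is a minimal sketch of a stack machine in Python. It is only an illustration, not the real EVM: the `run` function and the tuple-based program format are made up for this example, and only a handful of opcodes are modelled (no gas, memory, or storage). It does follow two real EVM rules: stack items are 256-bit words, and binary operations take their first operand from the top of the stack.

```python
WORD = 2**256  # EVM stack items are 256-bit words, so arithmetic wraps at 2**256


def run(program):
    """Execute a list of (opcode, *args) tuples and return the final stack.

    Hypothetical toy interpreter: only a few opcodes, no gas or memory.
    """
    stack = []
    for op, *args in program:
        if op == "PUSH":
            stack.append(args[0] % WORD)
        elif op == "POP":
            stack.pop()
        elif op == "ADD":
            a, b = stack.pop(), stack.pop()  # a was on top of the stack
            stack.append((a + b) % WORD)
        elif op == "SUB":
            a, b = stack.pop(), stack.pop()  # result is top minus second
            stack.append((a - b) % WORD)
        elif op == "MUL":
            a, b = stack.pop(), stack.pop()
            stack.append((a * b) % WORD)
        elif op == "DUP1":
            stack.append(stack[-1])  # duplicate the top item
        elif op == "SWAP1":
            stack[-1], stack[-2] = stack[-2], stack[-1]  # swap top two items
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return stack


# (2 + 3) * 4
print(run([("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",)]))  # [20]
# 10 was pushed last, so it is on top: 10 - 3
print(run([("PUSH", 3), ("PUSH", 10), ("SUB",)]))  # [7]
```

The second example shows why push order matters: the value pushed last is the first one popped.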
Example of EVM bytecode
If you take the following Solidity program and compile it, you can see the byte code.
pragma solidity >=0.7.0 <0.9.0;

contract HelloWorld {
    function greet() public pure returns (string memory) {
        return "Hello, world";
    }
}
And the compiled bytecode output from Remix (which includes some additional info):
{
"functionDebugData": {},
"generatedSources": [],
"linkReferences": {},
"object": "608060405234801561001057600080fd5b5061017c806100206000396000f3fe608060405234801561001057600080fd5b506004361061002b5760003560e01c8063cfae321714610030575b600080fd5b61003861004e565b60405161004591906100c4565b60405180910390f35b60606040518060400160405280600c81526020017f48656c6c6f2c20776f726c640000000000000000000000000000000000000000815250905090565b6000610096826100e6565b6100a081856100f1565b93506100b0818560208601610102565b6100b981610135565b840191505092915050565b600060208201905081810360008301526100de818461008b565b905092915050565b600081519050919050565b600082825260208201905092915050565b60005b83811015610120578082015181840152602081019050610105565b8381111561012f576000848401525b50505050565b6000601f19601f830116905091905056fea2646970667358221220860a91ea597e4dd4f0619cae9e7df93098ed83a3eedf084102a0d84915c4750364736f6c63430008070033",
"opcodes": "PUSH1 0x80 PUSH1 0x40 MSTORE CALLVALUE DUP1 ISZERO PUSH2 0x10 JUMPI PUSH1 0x0 DUP1 REVERT JUMPDEST POP PUSH2 0x17C DUP1 PUSH2 0x20 PUSH1 0x0 CODECOPY PUSH1 0x0 RETURN INVALID PUSH1 0x80 PUSH1 0x40 MSTORE CALLVALUE DUP1 ISZERO PUSH2 0x10 JUMPI PUSH1 0x0 DUP1 REVERT JUMPDEST POP PUSH1 0x4 CALLDATASIZE LT PUSH2 0x2B JUMPI PUSH1 0x0 CALLDATALOAD PUSH1 0xE0 SHR DUP1 PUSH4 0xCFAE3217 EQ PUSH2 0x30 JUMPI JUMPDEST PUSH1 0x0 DUP1 REVERT JUMPDEST PUSH2 0x38 PUSH2 0x4E JUMP JUMPDEST PUSH1 0x40 MLOAD PUSH2 0x45 SWAP2 SWAP1 PUSH2 0xC4 JUMP JUMPDEST PUSH1 0x40 MLOAD DUP1 SWAP2 SUB SWAP1 RETURN JUMPDEST PUSH1 0x60 PUSH1 0x40 MLOAD DUP1 PUSH1 0x40 ADD PUSH1 0x40 MSTORE DUP1 PUSH1 0xC DUP2 MSTORE PUSH1 0x20 ADD PUSH32 0x48656C6C6F2C20776F726C640000000000000000000000000000000000000000 DUP2 MSTORE POP SWAP1 POP SWAP1 JUMP JUMPDEST PUSH1 0x0 PUSH2 0x96 DUP3 PUSH2 0xE6 JUMP JUMPDEST PUSH2 0xA0 DUP2 DUP6 PUSH2 0xF1 JUMP JUMPDEST SWAP4 POP PUSH2 0xB0 DUP2 DUP6 PUSH1 0x20 DUP7 ADD PUSH2 0x102 JUMP JUMPDEST PUSH2 0xB9 DUP2 PUSH2 0x135 JUMP JUMPDEST DUP5 ADD SWAP2 POP POP SWAP3 SWAP2 POP POP JUMP JUMPDEST PUSH1 0x0 PUSH1 0x20 DUP3 ADD SWAP1 POP DUP2 DUP2 SUB PUSH1 0x0 DUP4 ADD MSTORE PUSH2 0xDE DUP2 DUP5 PUSH2 0x8B JUMP JUMPDEST SWAP1 POP SWAP3 SWAP2 POP POP JUMP JUMPDEST PUSH1 0x0 DUP2 MLOAD SWAP1 POP SWAP2 SWAP1 POP JUMP JUMPDEST PUSH1 0x0 DUP3 DUP3 MSTORE PUSH1 0x20 DUP3 ADD SWAP1 POP SWAP3 SWAP2 POP POP JUMP JUMPDEST PUSH1 0x0 JUMPDEST DUP4 DUP2 LT ISZERO PUSH2 0x120 JUMPI DUP1 DUP3 ADD MLOAD DUP2 DUP5 ADD MSTORE PUSH1 0x20 DUP2 ADD SWAP1 POP PUSH2 0x105 JUMP JUMPDEST DUP4 DUP2 GT ISZERO PUSH2 0x12F JUMPI PUSH1 0x0 DUP5 DUP5 ADD MSTORE JUMPDEST POP POP POP POP JUMP JUMPDEST PUSH1 0x0 PUSH1 0x1F NOT PUSH1 0x1F DUP4 ADD AND SWAP1 POP SWAP2 SWAP1 POP JUMP INVALID LOG2 PUSH5 0x6970667358 0x22 SLT KECCAK256 DUP7 EXP SWAP2 0xEA MSIZE PUSH31 0x4DD4F0619CAE9E7DF93098ED83A3EEDF084102A0D84915C4750364736F6C63 NUMBER STOP ADDMOD SMOD STOP CALLER ",
"sourceMap": "34:119:0:-:0;;;;;;;;;;;;;;;;;;;"
}
The Remix output includes the raw values (under object), and also those raw values mapped to the opcodes.
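You might notice that there is no plain-text ‘Hello, world’ string visible. It is still in there, just hex-encoded as ASCII bytes: 48656c6c6f2c20776f726c64, which is the start of the PUSH32 value in the opcodes.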
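The string is in the bytecode as ASCII bytes written in hex, padded with zero bytes to fill the 32-byte PUSH32 value. A short Python sketch to decode it (the variable names here are just for illustration):

```python
# The 32-byte value pushed by PUSH32 in the opcodes above:
# 12 bytes of ASCII text followed by 20 zero bytes of padding.
word_hex = "48656c6c6f2c20776f726c64" + "00" * 20

raw = bytes.fromhex(word_hex)
text = raw.rstrip(b"\x00").decode("ascii")  # drop the zero padding

print(len(raw))   # 32
print(text)       # Hello, world
print(len(text))  # 12, which matches the PUSH1 0xC just before the PUSH32
```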
How to decompile Solidity bytecode into something more readable
A quick way to go from bytecode to opcodes is to paste the bytecode into the tool at https://ethervm.io/decompile
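Putting the bytecode through that tool gives us this output. Rather than mirroring the Solidity source, the part shown here describes the deployment code at the start of the bytecode: it reverts if any ETH is sent, then copies the contract's runtime code into memory and returns it.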
contract Contract {
    function main() {
        memory[0x40:0x60] = 0x80;
        var var0 = msg.value;
        if (var0) { revert(memory[0x00:0x00]); }
        memory[0x00:0x017c] = code[0x20:0x019c];
        return memory[0x00:0x017c];
    }
}
(the memory part isn’t valid Solidity, shown here as pseudo code)
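The line memory[0x00:0x017c] = code[0x20:0x019c] says that 0x17c bytes of code, starting at offset 0x20, get copied into memory and returned. Those bytes are the runtime code that ends up stored on chain; the first 0x20 bytes are the deployment code. A rough Python sketch of that split follows. The `extract_runtime` helper is hypothetical, and the runtime bytes are a stand-in filler so the example stays short (the real ones are in the object string above).

```python
# First 32 bytes of the "object" string above: the deployment code.
INIT_PREFIX = bytes.fromhex(
    "608060405234801561001057600080fd5b5061017c806100206000396000f3fe"
)
OFFSET = 0x20  # where the runtime code starts (PUSH2 0x20 before CODECOPY)
SIZE = 0x17C   # how many bytes are copied (PUSH2 0x17C before CODECOPY)


def extract_runtime(creation_code: bytes, offset: int, size: int) -> bytes:
    """Return the slice of the creation code that CODECOPY would copy."""
    return creation_code[offset:offset + size]


# Stand-in runtime code (assumption: filler bytes, NOT the real contract code).
placeholder_runtime = bytes([0xAA]) * SIZE
creation_code = INIT_PREFIX + placeholder_runtime

runtime = extract_runtime(creation_code, OFFSET, SIZE)

print(len(INIT_PREFIX) == OFFSET)  # True: deployment code is exactly 0x20 bytes
print(hex(OFFSET + SIZE))          # 0x19c, the end offset in the pseudocode
print(runtime == placeholder_runtime)  # True
```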
Each of the hex values in that original string (60806040523480156100105...) maps to an equivalent opcode (PUSH1 0x80 PUSH1 0x40 MSTORE CALLVALUE...) or to a structure in that pseudocode.
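Each opcode is a single byte (an 8-bit unsigned int), so it is quite easy to map from the bytecode to opcodes: read a byte and look up its name. The one thing to watch for is PUSH1 to PUSH32, where the next 1 to 32 bytes are the value being pushed rather than opcodes.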
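As a rough illustration of that mapping, here is a small disassembler sketch in Python. The `disassemble` function is made up for this example, and its opcode table is deliberately partial (just enough for the deployment code at the start of the bytecode above). PUSH, DUP and SWAP are handled as ranges because their byte values are contiguous. Printing unknown bytes as raw hex is an assumption about formatting, chosen to resemble the Remix output.

```python
from typing import List

# Partial opcode table: only what the example below needs.
OPCODES = {
    0x00: "STOP", 0x01: "ADD", 0x15: "ISZERO", 0x34: "CALLVALUE",
    0x39: "CODECOPY", 0x50: "POP", 0x51: "MLOAD", 0x52: "MSTORE",
    0x56: "JUMP", 0x57: "JUMPI", 0x5B: "JUMPDEST",
    0xF3: "RETURN", 0xFD: "REVERT", 0xFE: "INVALID",
}


def disassemble(hex_code: str) -> List[str]:
    """Map a hex bytecode string to a list of opcode mnemonics."""
    code = bytes.fromhex(hex_code)
    out = []
    pc = 0  # program counter: index of the next byte to read
    while pc < len(code):
        op = code[pc]
        pc += 1
        if 0x60 <= op <= 0x7F:    # PUSH1..PUSH32: next n bytes are data
            size = op - 0x5F
            data = code[pc:pc + size]
            pc += size
            out.append(f"PUSH{size} 0x{int.from_bytes(data, 'big'):X}")
        elif 0x80 <= op <= 0x8F:  # DUP1..DUP16
            out.append(f"DUP{op - 0x7F}")
        elif 0x90 <= op <= 0x9F:  # SWAP1..SWAP16
            out.append(f"SWAP{op - 0x8F}")
        else:                     # fall back to raw hex for unlisted bytes
            out.append(OPCODES.get(op, f"0x{op:02X}"))
    return out


# First 32 bytes of the "object" string from the Remix output.
deploy_code = "608060405234801561001057600080fd5b5061017c806100206000396000f3fe"
print(" ".join(disassemble(deploy_code)))
```

The printed line should match the start of the opcodes string in the Remix output, from PUSH1 0x80 up to the first INVALID.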