What is the new "Ethereum Object Format" (EOF)?
A guide to how the new Ethereum Object Format (EOF) will work, and what advantages it brings to deploying smart contracts on the EVM
What is the Ethereum Object Format (EOF)?
There is much talk right now (November 2022) about the upcoming Ethereum Object Format (also known as “EOF”).
It will introduce one of the largest changes to the EVM so far, and add some significant long-term features to the Ethereum blockchain.
Currently deployed smart contracts on EVM have no predefined structure, validation happens at runtime (and not at deployment), and there is no easy way to add versioning (such as when EVM adds/deprecates a feature).
EOF is going to add a new way to send the bytecode when deploying a smart contract. If this bytecode starts with 0xEF, it will be treated as an Ethereum Object Format deployment.
This has a different data structure and allows validation at deploy time to check the contract is valid. This has a few advantages - including a more optimized runtime as validation at runtime is not needed for some actions (because validation ran at deployment).
- If the contract does not pass validation, the deployment will be rejected
- It is (mostly) backwards compatible - after EOF is introduced you can still deploy contracts as you do now (validation only happens if the deployed code starts with the 0xEF byte)
It also introduces a way to easily separate code and data - this has many benefits for L2s.
And because validation happens at deploy time, some newly proposed jump opcodes need no validation at runtime, which makes the runtime more optimized.
The main features introduced in Ethereum Object Format (EOF) include:
- This proposal will allow stricter checks to be done at deploy time, instead of at runtime
- This means the interpreter can run faster, and we can catch issues before a contract even hits the blockchain.
- Static jumps can be introduced with relative addresses
- Introducing jump tables, which can be validated at deploy time & are more optimized than the JUMPDESTs that we currently have.
  - It may also mean JUMPDEST can be removed (although I’m not sure if/when that will happen).
- Can add a JUMPDEST table (which can be analyzed at deploy time to look for errors)
  - In theory this means that JUMPDEST can be fully removed if we have this table (not sure it will be though)
- Adds potential for supporting things such as account abstraction in the future (proposal)
Associated EIPs summary
Here is a summary of the main EIPs associated with EOF. If you want a more in-depth look into the structure of deploying contracts with EOF, keep reading past this section.
EIP-3540 - the specs for the EOF format
This is the EIP that defines how the container is organized.
A quick summary (read further down for more details):
- 2-byte 0xEF00 magic header
- 1-byte version (starts at version 1)
- Then there are 1 or more section headers, each of which contains:
  - a uint8 ‘section_kind’ (two types are introduced in this EIP: 1 (code) and 2 (data)),
  - and the uint16 section_size for the size of the section
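To make that layout concrete, here is a minimal Python sketch that assembles a container from a code section and an optional data section. The function and constant names are my own, and the big-endian encoding of section_size and the 0x00 terminator byte are assumptions based on my reading of EIP-3540 at the time of writing, so treat it as an illustration rather than a reference encoder.

```python
import struct

EOF_MAGIC = b"\xEF\x00"   # 2-byte magic header
EOF_VERSION = 1           # 1-byte version, starts at 1
KIND_CODE = 1
KIND_DATA = 2


def build_eof_container(code: bytes, data: bytes = b"") -> bytes:
    """Assemble magic + version + section headers + terminator + section bodies."""
    if not code:
        raise ValueError("the code section must not be empty")
    sections = [(KIND_CODE, code)]
    if data:
        sections.append((KIND_DATA, data))

    header = EOF_MAGIC + bytes([EOF_VERSION])
    for kind, body in sections:
        if len(body) > 0xFFFF:
            raise ValueError("section does not fit in a uint16 size field")
        # ">BH" = uint8 section_kind followed by big-endian uint16 section_size
        header += struct.pack(">BH", kind, len(body))
    header += b"\x00"  # section headers terminator (assumed, see lead-in)
    return header + b"".join(body for _, body in sections)


# A one-byte code section (INVALID) plus two bytes of data.
print(build_eof_container(b"\xFE", b"\xAA\xBB").hex())
```

The data section is simply skipped when it is empty, because a section_size of zero is not allowed.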
EIP-3541 - rejects deploying of new contracts that start with 0xEF (live already)
As of May 2021, there were no deployed contracts starting with the 0xEF byte. (Even if some had been deployed, they would not have worked anyway.)
EIP-3541 was introduced to prevent any new contracts from being deployed with this byte at the start of the code. This went live in the London Ethereum upgrade.
If we didn’t have this one, there might be some (pre-EOF) contracts that get interpreted as being in the EOF format.
Then when EOF goes live, we know that no contracts will exist with the 0xEF header. This is the only EIP listed in this post that is in the final stage - all others are still proposals.
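The rule itself is tiny. As a rough Python sketch (the function name is hypothetical, and a real client applies this check inside its contract creation logic rather than as a standalone helper):

```python
def passes_eip3541(new_code: bytes) -> bool:
    """Return False for new contract code whose first byte is 0xEF."""
    return not new_code.startswith(b"\xEF")


print(passes_eip3541(bytes.fromhex("6001600055")))  # ordinary bytecode
print(passes_eip3541(bytes.fromhex("ef0001")))      # starts with 0xEF
```

Only the first byte matters here, so code that contains 0xEF anywhere else is unaffected by this rule.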
EIP-3670 Validating when deploying new contracts, that they pass validation
EIP-3670 is the EIP that will use the EOF container data (described in EIP-3540) and gives a mechanism to validate if code (about to be deployed) is valid (such as having valid jumps, the code terminates in an expected way, etc), and does not use deprecated features.
During validation, deployment will be rejected if the code contains things such as:
- undefined instructions
  - note: before EOF, the opcode 0xEF was undefined. When EOF goes live, 0xEF is accepted when it is the first byte of the code (the start of the 2-byte 0xEF00 magic).
- code ending with an incomplete PUSH instruction
- a last opcode (terminating instruction) that is not STOP (0x00), RETURN (0xF3), REVERT (0xFD), INVALID (0xFE) or SELFDESTRUCT (0xFF).
- section headers (defined in EIP-3540) that have 0 headers in them (there must be at least one).
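Here is a small Python sketch of the first three checks, applied to the bytes of a single code section. The names are my own, and DEMO_OPCODES is a deliberately tiny, made-up subset of the instruction set used only for this example. A real validator would use the full opcode table of the active fork.

```python
TERMINATING = {0x00, 0xF3, 0xFD, 0xFE, 0xFF}  # STOP, RETURN, REVERT, INVALID, SELFDESTRUCT
PUSH1, PUSH32 = 0x60, 0x7F

# Illustrative subset only: ADD, PUSH1..PUSH32 and the terminating opcodes.
DEMO_OPCODES = TERMINATING | set(range(PUSH1, PUSH32 + 1)) | {0x01}


def validate_code(code: bytes, defined_opcodes: set) -> None:
    """Raise ValueError if the code section breaks one of the rules listed above."""
    if not code:
        raise ValueError("empty code section")
    pos = 0
    last_opcode = None
    while pos < len(code):
        opcode = code[pos]
        if opcode not in defined_opcodes:
            raise ValueError(f"undefined instruction 0x{opcode:02X} at offset {pos}")
        # PUSH1..PUSH32 are followed by 1..32 bytes of immediate data.
        immediate = opcode - PUSH1 + 1 if PUSH1 <= opcode <= PUSH32 else 0
        if pos + 1 + immediate > len(code):
            raise ValueError("code ends with an incomplete PUSH instruction")
        last_opcode = opcode
        pos += 1 + immediate
    if last_opcode not in TERMINATING:
        raise ValueError("last instruction is not a terminating instruction")


validate_code(bytes([0x60, 0x01, 0x00]), DEMO_OPCODES)  # PUSH1 0x01, STOP: accepted
```

Note that the walk skips over PUSH data, so `60 00` (PUSH1 0x00) is rejected even though its last byte is 0x00: the last instruction is PUSH1, not STOP.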
This will be introduced at the same time as EIP-3540.
note: this validation only happens when deploying a new contract that starts with the 0xEF byte.
If it does not start with this byte (like all current (pre-EOF) contracts), then validation does not run.
note: this validation will happen for both CREATE and CREATE2
EIP-4200 - add static control jumps
There are two new opcodes in this proposal - RJUMP and RJUMPI. These are like a cheaper version of the regular JUMP opcodes, because they do not do any runtime validation (this happens at deployment). They take one operand, which is where to jump to relative to the op
Currently, the EVM only supports dynamic jumps, which have the benefit of being very flexible. However, they make static code analysis much harder and also require the use of a jump destination marker (JUMPDEST). The proposed changes in this EIP add RJUMP and RJUMPI, which encode the jump destination as a signed relative value.
Unlike the existing jumps, which take an absolute destination from the stack, the new RJUMP/RJUMPI use relative offsets (signed int16s) encoded directly in the code.
Part of EIP-3670 will be to validate that the RJUMP/RJUMPI have valid values (pointing to a valid instruction - and not pointing to things like data of a PUSH opcode, or outside of code bounds).
Note: Although not needed, they can point to a JUMPDEST marker.
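As a rough illustration of how a relative jump resolves, here is a Python sketch. It assumes the offset is a 2-byte big-endian signed value stored right after the opcode, and that it is counted from the instruction following the jump. Both are my reading of EIP-4200, so check the EIP before relying on them. The 0x5C opcode byte is just a placeholder for the demo, and the function names are hypothetical.

```python
RJUMP = 0x5C  # placeholder opcode byte, used only to build the demo bytecode


def rjump_target(pc: int, code: bytes) -> int:
    """Destination of the relative jump whose opcode sits at index pc."""
    # Assumed encoding: 1-byte opcode, then a 2-byte big-endian signed offset.
    offset = int.from_bytes(code[pc + 1 : pc + 3], "big", signed=True)
    # Assumed base: the instruction after the jump (opcode + 2 immediate bytes).
    return pc + 3 + offset


def is_valid_target(target: int, instruction_starts: set) -> bool:
    """Deploy-time check: the target must be the first byte of an instruction."""
    return target in instruction_starts


# RJUMP +1 skips the INVALID at index 3 and lands on the STOP at index 4.
code = bytes([RJUMP, 0x00, 0x01, 0xFE, 0x00])
target = rjump_target(0, code)
print(target, is_valid_target(target, {0, 3, 4}))
```

Targets inside PUSH data or past the end of the code are simply absent from `instruction_starts`, which is how the two invalid cases mentioned above get rejected in this sketch.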
Using these new opcodes is cheaper than the existing jumps:
- RJUMP will cost 5 gas (JUMP costs 8 gas),
- and RJUMPI will cost 7 gas (JUMPI costs 10 gas)
- The gas is cheaper because the check that the jumps are valid is done at deploy time, not at runtime.
EIP-4200 will be introduced in the EVM at the same time as EIP-3540.
EIP-4750 EVM functions/subroutines
A couple more opcodes are proposed in this EIP - CALLF and RETF.
These can be used like subroutines (aka “EVM functions”).
It has been proposed that by combining this and EIP-4200 (RJUMP/RJUMPI), dynamic jumps (JUMP/JUMPI) could be fully replaced.
This proposal will also help with static analysis, as it won’t require JUMPDEST analysis (which is quite expensive).
EIP-5450 Deploy-time validation of stack usage for EOF functions.
This proposal will introduce a way to validate at deploy time the maximum stack size & check if an underflow will happen. (Deployment will fail if it expects an underflow). The max stack size can then be used to check for overflows, which will improve interpreter speed a little.
The current (pre-EOF) EVM implementation checks for stack overflow/underflow, checks for gas, and more. EIP-5450 will introduce changes to reduce the need for those checks at runtime, by verifying at deploy time that these issues cannot happen. (And it will reject deployment of a new contract if it does not pass these validation checks).
Note: although the maximum stack size is computed at deploy time, EIP-5450 cannot fully guarantee that an overflow will never happen, so there are still going to be runtime checks for stack overflows.
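The core idea can be sketched in a few lines of Python. This version only handles straight-line code (no jumps or function calls), and it describes each instruction as a hypothetical (pops, pushes) pair rather than decoding real bytecode, so it is a simplification of what the EIP proposes.

```python
def max_stack_height(instructions) -> int:
    """Return the highest stack height reached, or raise ValueError on underflow.

    instructions is a list of (pops, pushes) pairs, one per instruction.
    """
    height = 0
    highest = 0
    for index, (pops, pushes) in enumerate(instructions):
        if height < pops:
            raise ValueError(f"stack underflow at instruction {index}")
        height += pushes - pops
        highest = max(highest, height)
    return highest


# PUSH, PUSH, ADD, POP expressed as (pops, pushes) pairs.
program = [(0, 1), (0, 1), (2, 1), (1, 0)]
print(max_stack_height(program))  # 2
```

A deployment would be rejected when the function raises, and the returned height is the number an interpreter could compare against the 1024-item stack limit.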
What is the EOF container data structure, and how it works in more detail
The EOF container data structure is a binary format.
It starts with:
- the first two bytes are the magic string (value of 0xEF00).
  - note: The 0xEF byte was chosen because it resembles Executable Format. It was also chosen as it was an undefined opcode (so should be quite safe to introduce here - back in May 2021 analysis was done and there were 0 deployed contracts on mainnet which started with 0xEF)
- Then there is 1 byte (value of 0x01-0xFF) for the EOF version number (which starts at 1).
Then there is at least one section header. There are two fields in each section header.
- section_kind is a 1-byte field (a uint8) with a value of 0x01-0xFF. (note: cannot be 0x00).
  - value of 0 is invalid (reserved)
  - value of 1 is for code
  - value of 2 is for data
  - value of 3 (introduced in EIP-4750) for a function
- section_size is a 2-byte field (uint16) with a value between 0x0001 and 0xFFFF. (note: cannot be zero)
You can have multiple section headers (minimum of 1 though). Once all headers are done, there is the section headers terminator byte (0x00).
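Putting the pieces above together, a header parser could look roughly like this in Python. The function name is my own, and I am assuming section_size is stored big-endian (as I understand EIP-3540). The sketch only reads the header and does not check that the section bodies that follow match the declared sizes.

```python
import struct


def parse_eof_header(container: bytes):
    """Return (version, [(section_kind, section_size), ...]) or raise ValueError."""
    if container[:2] != b"\xEF\x00":
        raise ValueError("missing 0xEF00 magic")
    if len(container) < 3 or container[2] == 0:
        raise ValueError("missing or invalid version byte")
    version = container[2]

    sections = []
    pos = 3
    while True:
        if pos >= len(container):
            raise ValueError("header ended before the terminator byte")
        kind = container[pos]
        if kind == 0x00:  # section headers terminator
            break
        if pos + 3 > len(container):
            raise ValueError("truncated section header")
        (size,) = struct.unpack_from(">H", container, pos + 1)  # big-endian uint16
        if size == 0:
            raise ValueError("section_size must not be zero")
        sections.append((kind, size))
        pos += 3
    if not sections:
        raise ValueError("at least one section header is required")
    return version, sections


# magic, version 1, code header (size 1), data header (size 2), terminator, bodies
print(parse_eof_header(bytes.fromhex("ef000101000102000200feaabb")))
```

Because section_kind can never be 0x00, the parser can treat the first zero byte in kind position as the terminator without any ambiguity.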
Is EOF Backwards compatible?
Technically this is not backwards compatible, as once this goes live you cannot deploy code that starts with 0xEF (unless it follows the EOF spec).
But a contract that had 0xEF as the first byte would not be a valid smart contract anyway, so the impact is minimal and the EOF proposal is largely considered backwards compatible (even if technically it isn’t).
When will the Ethereum Object Format (EOF) go live?
The initial version of EOF is expected to go live in the Shanghai upgrade, which is pencilled in for September 2023.
Read more
Interested in reading more about EOF? Here are some links I found useful:
- Video on Ethereum Object Format
- The initial EIP-3540 is a great read if you want a more in-depth introduction
- Notes on how account abstraction will be possible via EOF
- EIP-3541
- EIP-3670: EOF Code validation
- EIP-4200: Static relative jumps
- EIP-4750: EOF Functions
- EIP-5450: EOF Stack validation
- There is a great introduction thread by @lightclients on Twitter.
- Magicians forum link
- Nice guide on Ethereum.org