Solidity assembly guide

Created on August 2022 • Tags: ethereum solidity guides

A guide to assembly in Solidity

You can add assembly language code inline in your Solidity contracts. It is sometimes useful when you want to make very precise gas-saving optimisations that the built-in optimiser will not do for you, or when you need features that aren’t supported yet in the main Solidity language.

For example, in older versions of Solidity the only way to use the CREATE2 opcode to deploy a smart contract was to write assembly (recent versions of Solidity have this built in, so you don’t have to touch assembly language).
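For comparison, here is a rough sketch of the built-in way of doing a CREATE2 deployment without any assembly. Child, Factory and the parameter names are made up for this example, and it assumes a compiler version that supports the new Contract{salt: ...}(...) syntax.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Child and Factory are hypothetical names used only for this sketch
contract Child {
    uint256 public value;

    constructor(uint256 initialValue) {
        value = initialValue;
    }
}

contract Factory {
    // Passing a salt makes the compiler use CREATE2 instead of CREATE.
    // The same salt and constructor argument always give the same address
    // when deployed from this factory.
    function deploy(bytes32 salt, uint256 initialValue) external returns (address) {
        Child child = new Child{salt: salt}(initialValue);
        return address(child);
    }
}
```

If the deployment fails (for example because the address is already taken), the new expression reverts, so no manual check is needed here.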

Example of using inline assembly in Solidity:

Here is an example of using the CREATE2 opcode from assembly in Solidity.

It sets the variable addr to the address of the deployed contract.

Then it checks that a contract was actually deployed (that there is code at addr) - if there is no code then it’s an error, so it calls revert.

contract YourContract {
  function deployAContract(bytes memory contractCode, bytes32 salt) public returns (address addr) {
    assembly {
      // contractCode points at the length word, so the code itself starts 0x20 bytes later
      addr := create2(0, add(contractCode, 0x20), mload(contractCode), salt)
      // no code at addr means the deployment failed
      if iszero(extcodesize(addr)) {
        revert(0, 0)
      }
    }
  }
}

The two types of assembly language in Solidity

Inline Solidity assembly - this is what this blog post is about, and it is what almost everyone means when they talk about Solidity assembly (you will also see it called EVM assembly). It is used ‘inline’ within your .sol files, and the language you write there is really “Yul”.
Standalone Solidity assembly - not discussed here. It acts as an intermediate language for the Solidity compiler before the code gets converted to bytecode. It is not used with assembly {...}.

How to write inline assembly in Solidity

To write inline assembly, you just wrap it in assembly { ... }. Inside the {...} block you can add your Yul code (explained below).

Note: this is really Yul, which I talk about further down this blog post. I will use Yul and assembly interchangeably in this article.

This is really handy as you can write easy-to-read (and easy-to-write) normal Solidity code, but when you need low-level assembly code you can use it, and it has easy access to your variables, can call functions etc.

One important thing to note: using Solidity assembly means you get low-level access to the EVM, bypassing some of the security and safety features built into Solidity. Only use assembly if you really know what you’re doing.

The language that I keep referring to as inline assembly is really Yul (used to be known as JULIA or IULIA). It is an intermediate language that gets compiled to the bytecode for the EVM.

Note on accessing variables from a different assembly block

Note: if you have two separate blocks of assembly { ... }, they do not share/access variables from another assembly block (described in the docs as ‘different inline assembly blocks share no namespace’).

For example the following will not work:

assembly { 
  let height := 2
}

assembly {
  // this will cause an error, as it cannot access height variable
  let heightAgain := height
}
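If you do need to pass a value from one assembly block to the next, one way (a minimal sketch, with made-up contract and function names) is to go through a normal Solidity local variable, as both blocks can see it:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// SharedValue is a hypothetical contract for this sketch
contract SharedValue {
    function demo() public pure returns (uint256 heightAgain) {
        // declared in Solidity, so it is visible to every assembly block in this function
        uint256 height;

        assembly {
            height := 2
        }

        assembly {
            // works, because height is a Solidity variable and not a Yul `let` variable
            heightAgain := height
        }
    }
}
```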

Features of assembly in Solidity

assembly-local variables
- let x := add(2, 3) let y := mload(0x40) x := add(x, y)
access external variables
- function f(uint x) public { assembly { x := sub(x, 1) } }
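labels (legacy)
- let x := 10 repeat: x := sub(x, 1) jumpi(repeat, eq(x, 0))
- labels and jumps only work with old compiler versions. Solidity 0.5.0 and later reject them in inline assembly, so use loops, if, switch or functions instead.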
loops
- you don’t normally see loops in assembly language, but you do get them in Yul. They work like the following for loop:

assembly {
  let value
  let max := 5

  for { let counter := 0 } lt(counter, max) { counter := add(counter, 1) } {
    value := add(value, counter)
  }
}

if statements
- if slt(x, 0) { x := sub(0, x) }
- there are no else statements in Yul. You have to either do multiple if statements, or a switch block with a default option (see next bullet point).
switch statements
- switch x case 0 { y := mul(x, 2) } default { y := 0 }
function calls
- function f(x) -> y { switch x case 0 { y := 1 } default { y := mul(x, f(sub(x, 1))) } }
functional-style opcodes
- mul(1, add(2, 3)) (instead of something like push1 3 push1 2 add push1 1 mul)
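To see a few of these features together in something you can compile, here is a small sketch. The contract and function names are my own, and note that arithmetic in assembly wraps on overflow instead of reverting.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// YulFeatures is a hypothetical contract for this sketch
contract YulFeatures {
    // `if` without an else: flip the sign only when x is negative
    function absolute(int256 x) public pure returns (int256 result) {
        assembly {
            result := x
            if slt(x, 0) {
                result := sub(0, x)
            }
        }
    }

    // a Yul function using switch/default in place of if/else
    function factorial(uint256 n) public pure returns (uint256 result) {
        assembly {
            function fact(x) -> y {
                switch x
                case 0 { y := 1 }
                default { y := mul(x, fact(sub(x, 1))) }
            }
            result := fact(n)
        }
    }
}
```

Calling absolute(-5) gives 5, and factorial(5) gives 120.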

Example of using assembly to get better performance than regular Solidity code

The following example (from the Solidity docs) shows how you can write your own assembly code to get more optimized bytecode.

note: the sumSolidity function could be optimised by wrapping in unchecked {...}

// SPDX-License-Identifier: GPL-3.0
pragma solidity >=0.4.16 <0.9.0;


library VectorSum {
    // This function is less efficient because the optimizer currently fails to
    // remove the bounds checks in array access.
    function sumSolidity(uint[] memory data) public pure returns (uint sum) {
        for (uint i = 0; i < data.length; ++i)
            sum += data[i];
    }

    // We know that we only access the array in bounds, so we can avoid the check.
    // 0x20 needs to be added to an array because the first slot contains the
    // array length.
    function sumAsm(uint[] memory data) public pure returns (uint sum) {
        for (uint i = 0; i < data.length; ++i) {
            assembly {
                sum := add(sum, mload(add(add(data, 0x20), mul(i, 0x20))))
            }
        }
    }

    // Same as above, but accomplish the entire code within inline assembly.
    function sumPureAsm(uint[] memory data) public pure returns (uint sum) {
        assembly {
            // Load the length (first 32 bytes)
            let len := mload(data)

            // Skip over the length field.
            //
            // Keep temporary variable so it can be incremented in place.
            //
            // NOTE: incrementing data would result in an unusable
            //       data variable after this assembly block
            let dataElementLocation := add(data, 0x20)

            // Iterate until the bound is not met.
            for
                { let end := add(dataElementLocation, mul(len, 0x20)) }
                lt(dataElementLocation, end)
                { dataElementLocation := add(dataElementLocation, 0x20) }
            {
                sum := add(sum, mload(dataElementLocation))
            }
        }
    }
}
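As mentioned in the note about sumSolidity, you can get part of the way there without assembly by using unchecked. This is only a sketch (the library and function names are mine), and it needs Solidity 0.8.0 or later. Keep in mind that unchecked removes the overflow checks on the arithmetic, not the array bounds check.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// VectorSumUnchecked is a hypothetical name for this sketch
library VectorSumUnchecked {
    function sumUnchecked(uint256[] memory data) internal pure returns (uint256 sum) {
        // no overflow checks on `sum += ...` or `++i` inside this block,
        // so the sum wraps around instead of reverting if it overflows
        unchecked {
            for (uint256 i = 0; i < data.length; ++i) {
                sum += data[i];
            }
        }
    }
}
```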

How to use other variables in your Solidity assembly code

You can access (read and write) variables quite easily in the assembly code.

When you assign a new value to a local variable that refers to memory, no memory management takes place. All that happens is that the variable now points to a new memory address, and the data it used to point to is left untouched.

When you assign to a local variable that refers to a statically sized calldata array or a calldata struct, it works in a similar way - only the pointer (the offset into calldata) is updated, and the underlying data is not changed.
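A quick way to convince yourself of the pointer behaviour for memory variables is a sketch like the following (contract and function names are made up). After the assignment, a and b refer to the same array in memory and the original 3 element array is simply orphaned.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// PointerDemo is a hypothetical contract for this sketch
contract PointerDemo {
    function repoint() public pure returns (uint256 lenA, uint256 lenB) {
        uint256[] memory a = new uint256[](3);
        uint256[] memory b = new uint256[](5);

        assembly {
            // only the pointer held in `a` changes, nothing is copied or freed
            a := b
        }

        lenA = a.length; // now 5, as `a` points at the same memory as `b`
        lenB = b.length; // 5
    }
}
```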

Creating an empty variable with the default value (with let)

If you need to create a variable without setting its value, you can just use let:

assembly {
  let age // age = 0, as it was uninitialized
  age := 45 // but you can set it later on...
}

When you use let it creates a new stack slot. This stack slot will live for as long as the current block is being run. This is why you cannot access the variables inside the assembly { ... } block from outside the block.

Assigning values in inline Solidity assembly

You can use := to set the value in assembly. Use it with the let keyword to create a variable and set it at the same time (let varName := yourValue).

It is similar to normal Solidity, except Solidity uses just = and inline assembly uses :=.

assembly {
  // creates a local variable
  let something := 4
  // ...
}

Strings in inline assembly

String literals can be at most 32 bytes long (so 32 ASCII characters), as they have to fit in a single 32 byte word.

assembly {
  let thisSite := "cryptoguide.dev"
}

Important functions and opcodes…

There are quite a few Solidity assembly functions, and they often directly map to EVM opcodes. Here are some of the more common functions you might come across while writing Solidity inline assembly.

add(a, b)

The add function…wait for it… adds two values. Unlike + in Solidity 0.8 and later, it does not revert on overflow - the result wraps around.

function addition(uint a, uint b) public pure {
    assembly {
        let sum := add(a, b)
        // ...
    }
}

div(a,b)

Divides two numbers - a/b. Dividing by zero returns 0 rather than reverting. There is also sdiv(a,b) for signed numbers in two’s complement.

mul(a,b)

Multiplies two numbers - a*b

mod(a,b)

Modulus - a%b

exp(a,b)

a to the power of b
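Here is a small sketch that puts the arithmetic functions side by side (the contract and function names are made up for illustration). The main thing to notice is that none of these revert: dividing by zero gives 0, and overflows wrap around modulo 2**256.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// YulArithmetic is a hypothetical contract for this sketch
contract YulArithmetic {
    function demo(uint256 a, uint256 b)
        public
        pure
        returns (uint256 quotient, uint256 remainder, uint256 product, uint256 power)
    {
        assembly {
            quotient := div(a, b)  // integer division, 0 when b is 0
            remainder := mod(a, b) // 0 when b is 0
            product := mul(a, b)   // wraps on overflow
            power := exp(a, b)     // a to the power of b, wraps on overflow
        }
    }
}
```

For example demo(17, 5) returns (3, 2, 85, 1419857).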

Comparisons

lt(x,y) - returns 1 if x < y, else it returns 0
slt(x,y) - same as lt(x,y) but for signed ints in two’s complement
gt(x,y) - returns 1 if x > y, else it returns 0
sgt(x,y) - same as gt(x,y) but for signed ints in two’s complement
eq(x,y) - returns 1 if x == y, else it returns 0
iszero(x) - returns 1 if x == 0, else it returns 0

bitwise operations

not(x) - bitwise ‘not’ of x
and(x, y) - bitwise “and” of x and y
or(x, y) - bitwise “or” of x and y
xor(x, y) - bitwise “xor” of x and y
shl(x, y) - logical shift left y by x bits
shr(x, y) - logical shift right y by x bits
sar(x, y) - signed arithmetic shift right y by x bits
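The argument order of the shift functions is easy to get backwards, so here is a short sketch (with made-up names) showing masking and shifting. The shift amount comes first, then the value being shifted.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// YulBitwise is a hypothetical contract for this sketch
contract YulBitwise {
    function demo(uint256 value)
        public
        pure
        returns (uint256 lowByte, uint256 timesFour, uint256 dividedByFour)
    {
        assembly {
            lowByte := and(value, 0xff)    // keep only the lowest 8 bits
            timesFour := shl(2, value)     // shift value left by 2 bits
            dividedByFour := shr(2, value) // shift value right by 2 bits
        }
    }
}
```

For example demo(0x1234) returns (0x34, 0x48d0, 0x48d).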

keccak256(starting_memory_location, size)

You can use this to generate the keccak256 hash of a region of memory. It has two params, and it works like this: keccak(mem[starting_memory_location…(starting_memory_location+size)))
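As a minimal sketch of how that looks in practice (names are my own), you first write the value to memory and then hash that region. Here I use the scratch space at 0x00, and the result should match what keccak256(abi.encode(x)) gives you in plain Solidity.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// YulHash is a hypothetical contract for this sketch
contract YulHash {
    function hashWord(uint256 x) public pure returns (bytes32 h) {
        assembly {
            mstore(0x00, x)            // put the 32 byte value in scratch space
            h := keccak256(0x00, 0x20) // hash 32 bytes starting at 0x00
        }
    }

    // the same hash written in plain Solidity, for comparison
    function hashWordSolidity(uint256 x) public pure returns (bytes32) {
        return keccak256(abi.encode(x));
    }
}
```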

pc() for current position in code

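If you need the current position in code, use pc(). Note that newer compiler versions (Solidity 0.7.0 and later) no longer allow pc() in inline assembly.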

pop(x)

Use pop(x) to discard value x

mload(position)

Use mload to load data (32 bytes) from memory. For example let foo := mload(0x40) will load 32 bytes from memory position 0x40. See the section below for the significance of 0x40.

mstore(starting_memory_location, value_to_store)

If you need to store data in memory, you need the start location and the item to store.

The example below shows how to store a uint (with value of 12) in memory location 0x0:

assembly {
    let result := add(4, 8)   
    mstore(0x0, result)
}

get size of memory with msize()

Use msize() to get size of memory, i.e. largest accessed memory index

smart contract specific functions

Some functions are not typical low level programming language functions, as they relate to EVM specific things:

chainid() - ID of the executing chain (EIP-1344)
basefee() - current block’s base fee (EIP-3198 and EIP-1559)
origin() - transaction sender
gasprice() - gas price of the transaction
blockhash(b) - hash of block nr b - only for last 256 blocks excluding current
coinbase() - current mining beneficiary
timestamp() - timestamp of the current block in seconds since the epoch
number() - current block number
difficulty() - difficulty of the current block
gaslimit() - block gas limit of the current block
gas() - gas still available to execution
address() - address of the current contract / execution context
balance(a) - wei balance at address a
selfbalance() - equivalent to balance(address()), but cheaper
caller() - call sender (excluding delegatecall)
callvalue() - wei sent together with the current call
calldataload(p) - call data starting from position p (32 bytes)
calldatasize() - size of call data in bytes
calldatacopy(t, f, s) - copy s bytes from calldata at position f to mem at position t
codesize() - size of the code of the current contract / execution context
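A few of these have direct Solidity equivalents, which makes them easy to try out. The sketch below (contract and function names are made up) reads some of them and returns them next to each other.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// YulContext is a hypothetical contract for this sketch
contract YulContext {
    function context()
        public
        view
        returns (address sender, uint256 chain, uint256 blockTime, uint256 blockNumber)
    {
        assembly {
            sender := caller()       // msg.sender
            chain := chainid()       // block.chainid
            blockTime := timestamp() // block.timestamp
            blockNumber := number()  // block.number
        }
    }
}
```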

creating (deploying contracts) with create and create2

There are a couple of ways to create a contract. See my guide on using CREATE2 here.

create(v, p, n) - create new contract with code mem[p…(p+n)) and send v wei and return the new address; returns 0 on error
create2(v, p, n, s) - create new contract with code mem[p…(p+n)) at address keccak256(0xff . this . s . keccak256(mem[p…(p+n))) and send v wei and return the new address, where 0xff is a 1 byte value, this is the current contract’s address as a 20 byte value and s is a big-endian 256-bit value; returns 0 on error

return(starting_memory_location, num_bytes_to_return)

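When you need to return data, you can use return(a,b). Note that this ends execution of the whole call straight away (it does not just return from the current Solidity function), and the bytes are returned exactly as they are in memory, so you have to ABI-encode them yourself.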

The two params:

first is the starting memory location
and the second is the number of bytes of memory to return

Example of using return in Solidity inline code:

assembly {
 // say you had existing code that stored an 
 // 8 byte value in memory address 0x0...
 
 // ...
 
 return(0x0, 8)
}
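For a complete function you can call, here is a minimal sketch (made-up names) that returns a uint256. A single uint256 is ABI-encoded as one 32 byte word, so writing the value to memory and returning 0x20 bytes is enough for the caller to decode it.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// YulReturn is a hypothetical contract for this sketch
contract YulReturn {
    function fortyTwo() public pure returns (uint256) {
        assembly {
            mstore(0x00, 42)   // write the value into scratch space
            return(0x00, 0x20) // end the call, returning those 32 bytes
        }
    }
}
```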

stop()

This will stop execution - it has the same effect as return(0, 0)

Want a list of all Solidity assembly language functions?

For a full list (I’ve only covered the main ones you will probably use) check out https://docs.soliditylang.org/en/v0.8.13/yul.html#evm-dialect

0x40 (free memory pointer) and reserved memory

There are a few special memory address locations, and 0x40 is one of those.

It is the free memory pointer, pointing to the end of the currently allocated memory.

You should remember to keep 0x40 in sync and update it after you write to memory.

When your smart contract starts executing, the first four 32 byte slots (128 bytes) of memory are reserved. This is why 0x40 is always available as the free memory pointer. Initially it is set to 0x80 (128 in decimal), which is the first byte after the reserved area.

first 64 bytes (0x00 - 0x3f) are scratch space
next 32 bytes (0x40 - 0x5f) are the free memory pointer
next 32 bytes (0x60 - 0x7f) are the ‘zero slot’

From https://docs.soliditylang.org/en/v0.8.13/internals/layout_in_memory.html#layout-in-memory:

Scratch space can be used between statements (i.e. within inline assembly). The zero slot is used as initial value for dynamic memory arrays and should never be written to (the free memory pointer points to 0x80 initially). Solidity always places new objects at the free memory pointer and memory is never freed (this might change in the future).
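Here is a rough sketch of what “keeping 0x40 in sync” looks like when you allocate memory yourself (contract and function names are my own). It builds a 32 byte bytes value by hand: read the pointer, write the length and the data, then move the pointer past what was written.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// YulAllocate is a hypothetical contract for this sketch
contract YulAllocate {
    function allocate() public pure returns (bytes memory out) {
        assembly {
            out := mload(0x40)            // current free memory pointer
            mstore(out, 0x20)             // first word: length of the bytes (32)
            mstore(add(out, 0x20), 0x2a)  // second word: the data (42)
            mstore(0x40, add(out, 0x40))  // bump the free memory pointer by 64 bytes
        }
    }
}
```

If you skip the last mstore, the next thing Solidity allocates would be written over your data.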

How to add comments in inline Solidity assembly code

Same as normal solidity:

assembly {
 // this is a comment
 
 /* this is 
    a multiline comment */
}

More resources to learn about assembly language in Solidity

Here are some great resources to learn assembly in Solidity.

the official docs: https://solidity-kr.readthedocs.io/ko/latest/assembly.html
Yul docs https://docs.soliditylang.org/en/latest/yul.html
https://jeancvllr.medium.com/solidity-tutorial-all-about-assembly-5acdfefde05c
evm.codes playground https://www.evm.codes/playground
a nice multi part series: https://mirror.xyz/0xB38709B8198d147cc9Ff9C133838a044d78B064B/PpA5KdQhrE_2Bf-USfKePROJ5tE-raL7_VGBR8HE39E
a nice video on it: https://www.youtube.com/watch?v=btDOvn8pLkA
look at some examples of Solidity assembly in action https://github.com/Arachnid/solidity-stringutils/blob/master/src/strings.sol
evm bytecodes
some of the snippets were based on pages on https://docs.soliditylang.org/en/v0.8.13/yul.html#evm-dialect

Notes about assembly

The rest of this article is a set of small notes/snippets from various articles online (starting with this one, which I really recommend). One day I’ll get back to updating this article and make it nicer. Hopefully someone will find this useful.

The EVM is stack based
It does not include too many instructions. Instructions can be split into stack instructions (values, moving, swapping on the stack), arithmetic (math) instructions, comparison instructions (compare two values and push 0 or 1 to the stack), bitwise instructions, memory instructions (EVM memory instructions) and context instructions (read/write to storage etc)

Solidity assembly Stack instructions

instructions to move values on the stack
e.g. pop to pop a value from the top of the stack
swap1…swap16 swaps the value at the top of the stack with the value n positions below it (n from 1 to 16)
dup1…dup16 duplicates the nth value from the top of the stack (n from 1 to 16), and pushes the copy to the stack
push1…push32 - pushes a value to the top of the stack. The number is the size (in bytes) of the value (from 1 byte to 32 bytes)

Solidity assembly Arithmetic instructions

These are the math instructions - e.g.:
add pushes the result of adding two values
sub pushes the result of subtracting two values
mul pushes the result of multiplying two values
div pushes the result of dividing two values
you also have sdiv and smod, which are for signed ints.

Solidity assembly Comparison instructions

Pop two values from the stack, do a comparison, then push 1 if it’s true or 0 if it’s false
eq pushes 1 to the stack if the top two values are the same. Otherwise it pushes 0
iszero pushes 1 to the stack if the top stack value is 0. Otherwise it pushes 0
note: iszero is used to invert a bool, like how !someValue is used in JS.
lt pushes 1 to the stack if the top value on the stack is less than the second. Otherwise it pushes 0
gt pushes 1 to the stack if the top value on the stack is greater than the second. Otherwise it pushes 0
slt and sgt are the same as lt and gt, but for signed integers.

eq and iszero can in some cases be used interchangeably (if comparing to 0) - but most code bases prefer iszero.

if iszero(numTokens) { /* ... */ }
if eq(numTokens, 0) { /* ... */ }

Solidity assembly bitwise instructions

These pop one or more values from the stack, then it performs bitwise ops on them

not does a bitwise NOT on the top stack value
and does a bitwise AND on the top two stack values
or does a bitwise OR on the top two stack values
xor does a bitwise EXCLUSIVE OR on the top two stack values
shr / shl does a bit shift right, and bit shift left

Solidity assembly memory instructions

When you want to write to memory, you will use the memory instructions

mstore(offset, value) to store a 256 bit word (32 bytes) in memory. offset is the offset in memory in bytes (starting at 0), value is the 32 byte value to store. In assembly, offset will be the top of the stack, and value will be the 2nd from top. So if you did push1 0x01 push1 0x00 mstore, it will store 1 (0x01 - the value) in position (offset) 0 (0x00).
mload(offset) to load a 256 bit (32 bytes) word from memory. This will load it from offset, which is the offset in the memory (in bytes). A 32 byte value will be pushed to the stack.
mstore8(offset, value) to store an 8 bit (1 byte) value in memory. offset is the offset in memory in bytes, starting at 0. value is the 1 byte value to write in memory (only the lowest byte of value is used).
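The difference between mstore and mstore8 is easier to see with a sketch (the names here are made up). mstore8 writes a single byte at the offset, and because mload reads 32 bytes starting at that offset, the byte ends up as the most significant byte of the loaded word.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// YulMemory is a hypothetical contract for this sketch
contract YulMemory {
    function firstByte() public pure returns (bytes32 word) {
        assembly {
            mstore(0x00, 0)      // clear the 32 bytes at offset 0 (scratch space)
            mstore8(0x00, 0xab)  // write one byte at offset 0
            word := mload(0x00)  // 0xab followed by 31 zero bytes
        }
    }
}
```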

Solidity assembly context instructions

These are ones that read/write to global state and execution context

e.g.:

caller pushes to the top of the stack the address that called the current context. This will be a 20 byte address that did the last call (msg.sender in solidity)
staticcall to make a read only call to another contract. Quite an interesting one, read about it here. This is the same as solidity’s staticcall
timestamp pushes the current block unix timestamp (block.timestamp)
calldataload to load data from calldata (what function/args were called) into current context
sload to load data from current smart contract’s storage
sstore to store to the smart contract’s storage
log1, log2, log3, log4 adds data to transaction logs (from 1 to 4 topics)
call to call another contract
create to deploy a new contract
create2 use create2 to deploy a new contract at a known address
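To give a feel for the call-style instructions, here is a rough sketch of a staticcall made from Yul. Everything named here (YulStaticCall, readUint, and the idea that the target has a getter with no arguments that returns a uint256) is an assumption for illustration, not a pattern taken from a specific library.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// YulStaticCall is a hypothetical contract for this sketch
contract YulStaticCall {
    // Calls a no-argument getter on `target` (identified by `selector`)
    // and returns the first 32 bytes of the result as a uint256
    function readUint(address target, bytes4 selector) public view returns (uint256 result) {
        assembly {
            // bytes4 values are left-aligned, so this puts the selector in bytes 0..3
            mstore(0x00, selector)

            // staticcall(gas, address, inputOffset, inputSize, outputOffset, outputSize)
            let ok := staticcall(gas(), target, 0x00, 0x04, 0x00, 0x20)

            // revert if the call failed or returned less than one word
            if or(iszero(ok), lt(returndatasize(), 0x20)) {
                revert(0, 0)
            }

            result := mload(0x00)
        }
    }
}
```

It reuses the scratch space at 0x00 for both the input and the output, which is fine here as nothing else happens in between.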

How to use Yul

Yul = a low level language. You can write it inline in Solidity code (you can also write it as its own standalone language).
Yul has most EVM operations, which you can call as functions (so it’s more similar to typical programming languages, as opposed to ‘proper’ assembly)
Yul has if statements, but not else statements. If you want to use that, use a switch with case and a default case.

How storage slots work in Assembly/Yul

You can use sload to load from storage slots.

The following example shows how to use sload to load from a storage slot. The data we are accessing is a uint256 (32 bytes) so it takes up the entire slot, which means we can simply sload the entire slot and return that.

In this example we have only one slot, so I’m loading from slot 0. Soon I’ll show how to easily get the slot which you want to load.

pragma solidity >=0.8.4;

contract Example {
    uint256 someData = 1234;

    // this shows how to get storage data, and return it
    // this is a simple example - `someData` is a 256/32 byte word
    // so we can just use sload. 
    // in this example we know its slot 0, so i'm hard coding it to
    // sload from slot 0
    function getData() public view returns(uint) {
        uint data; 
        assembly {
            data := sload(0) // hard coded to load from slot 0
        }
        return data;
    }
}

Ok now time for a more complex example. Let’s say you have multiple storage variables, which might be packed together into the same slot(s). If this is not familiar then see my guide on how storage slots work

You can use .slot and .offset to figure out which slot a variable is in, and what its byte offset within that slot is. The following example shows how. Note that dataA/dataB/dataC are packed together in the same slot (slot 0).

pragma solidity >=0.8.4;

contract Example {
    uint32 dataA = 6; // slot 0
    uint32 dataB = 12; // slot 0
    uint64 dataC = 40; // slot 0
    uint256 dataD = 1234; // slot 1

    // this is incorrect! It will load the slot 0 (dataB.slot == 0)
    // but this contains a uint32 + uint32 + uint64
    function getDataBIncorrectly() public view returns(uint) {
        uint data; 
        assembly {
            data := sload(dataB.slot)  // loads entire slot, which has 3 packed variables
        }
        return data;
    }

    function getDataBCorrectly() public view returns(uint32) {
        uint32 data;  // what we want to return (dataB is a uint32)
        
        assembly {
            // the slot num (which will be 0) 
            let slotNum := dataB.slot
            
            // the offset within the slot (32 bytes) - this is in bytes (4 for dataB)
            let offsetBytes := dataB.offset
            
            let offsetBits := mul(offsetBytes, 8) // turn bytes into bits
            
            let entireSlot := sload(slotNum) // the full (packed) slot

            let shiftedByOffset := shr(offsetBits, entireSlot) // right shift slot by 32 bits
            
            // dataC still sits above dataB after the shift, so mask off
            // everything apart from the lowest 32 bits
            data := and(shiftedByOffset, 0xffffffff)
        }
        return data; // shifted and masked, so this is just dataB (12)
    }
}

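Storing data using sstore works in a similar way. Be careful when doing this - especially if the storage slot is packed, as sstore overwrites the entire slot. With the packed layout from the previous example, the following would overwrite dataB and dataC as well as dataA: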

  function setDataA(uint256 newVal) external {
      assembly {
          sstore(dataA.slot, newVal)
      }
  }

If you are going to sstore to a packed storage slot, use bit masking: load the existing slot, clear only the bits that the variable occupies (which you can find via .offset), OR in the new value, and store the result.
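Here is a minimal sketch of that masking approach, reusing the same packed layout as the earlier example. The contract name and the setDataB/getAll functions are my own, so treat it as one possible way of doing it rather than the only one.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// PackedWrite is a hypothetical contract for this sketch
contract PackedWrite {
    uint32 dataA = 6;  // slot 0, offset 0
    uint32 dataB = 12; // slot 0, offset 4
    uint64 dataC = 40; // slot 0, offset 8

    function setDataB(uint32 newVal) external {
        assembly {
            let slotNum := dataB.slot
            let offsetBits := mul(dataB.offset, 8)

            // 32 one-bits, moved to where dataB lives inside the slot
            let mask := shl(offsetBits, 0xffffffff)

            let current := sload(slotNum)
            let cleared := and(current, not(mask)) // zero out only dataB's bits
            let shifted := shl(offsetBits, and(newVal, 0xffffffff))

            sstore(slotNum, or(cleared, shifted)) // dataA and dataC are untouched
        }
    }

    function getAll() external view returns (uint32, uint32, uint64) {
        return (dataA, dataB, dataC);
    }
}
```

After calling setDataB(99), getAll() should return (6, 99, 40).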

Note, you can sload and sstore in a slot that wasn’t already defined. In other words, in your Solidity contract you may have defined one storage variable (e.g. uint32 dataA = 6;). But you could still try to sload(9000) - storage slot 9000 would be empty, so it would return 0. Likewise, you can also sstore(9000, someValue) - and it will work! It will store it in that slot, even though nothing was defined in your Solidity contract for that slot.
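A tiny sketch of this (with made-up names) is below. The contract declares no storage variables at all, but reading and writing slot 9000 still works.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// ArbitrarySlot is a hypothetical contract for this sketch
contract ArbitrarySlot {
    // no storage variables are declared here

    function write(uint256 someValue) external {
        assembly {
            sstore(9000, someValue)
        }
    }

    function read() external view returns (uint256 value) {
        assembly {
            value := sload(9000) // 0 until write() has been called
        }
    }
}
```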

Types in Yul

There is really only one type in Yul - a 32 byte word.

returning a bytes32 as a string in Yul & Solidity

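If you have some data stored in a bytes32, but you want to return a string, you can convert it using abi.encode like this (note that the resulting string is always 32 bytes long, so a short value like the one below is followed by trailing zero bytes):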

pragma solidity >=0.8.4;

contract Example {
    function returnAString() public pure returns(string memory) {
        bytes32 yourName;
        assembly {
            yourName := "Fred"
        }
        return string(abi.encode(yourName));
    }
}

Further links

Check out my guide on storage in Solidity which has a big section on reading/writing to or from storage in Yul/assembly.
