Calling Functions in Assembly

Pseudo-Instructions

While assembly languages mostly have a 1-1 correspondence to some processor’s machine code, sometimes it’s helpful for the assembly language to have a few convenient features that just make it easier for humans to read and write. The primary such feature in RISC-V assembly is its pseudo-instructions. A pseudo-instruction is an assembly-language instruction that does not actually correspond to any distinct machine-code instruction (with its own opcode and such).

Here are some common pseudo-instructions:

mv rd, rs1: Copy the value of register rs1 into register rd.
li rd, imm: Put the immediate value imm into register rd.
nop: A no-op: do nothing at all.

All three of these pseudo-instructions are equivalent to special cases of the addi instruction:

mv rd, rs1 does the same thing as addi rd, rs1, 0
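li rd, imm is addi rd, x0, imm (as long as imm fits in addi’s 12-bit signed immediate; for larger values, the assembler expands li into a longer sequence, such as lui followed by addi)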
nop is addi x0, x0, 0

Try to convince yourself that these addi instructions do in fact work to implement these pseudo-instructions’ semantics.

The RISC-V assembler translates pseudo-instructions into their equivalent real instructions for you. So you can write li x11, 42 and that will translate to exactly the same machine-code bits as addi x11, x0, 42.
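To see why the assembler can treat these as the same instruction, it can help to compute the bits by hand. The C sketch below packs the fields of addi using the base ISA's I-type layout (opcode 0x13, funct3 0). The helper names encode_addi, encode_li, encode_mv, and encode_nop are made up for this illustration, and encode_li only handles immediates that fit in 12 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Pack an addi instruction using the I-type layout:
   imm[11:0] | rs1 | funct3 | rd | opcode. */
uint32_t encode_addi(unsigned rd, unsigned rs1, int32_t imm) {
    assert(rd < 32 && rs1 < 32);
    assert(imm >= -2048 && imm <= 2047);  /* 12-bit signed immediate */
    uint32_t imm12 = (uint32_t)imm & 0xFFFu;
    return (imm12 << 20) | ((uint32_t)rs1 << 15) | (0x0u << 12)
         | ((uint32_t)rd << 7) | 0x13u;
}

/* Each pseudo-instruction is just addi with particular operands. */
uint32_t encode_li(unsigned rd, int32_t imm)  { return encode_addi(rd, 0, imm); }
uint32_t encode_mv(unsigned rd, unsigned rs1) { return encode_addi(rd, rs1, 0); }
uint32_t encode_nop(void)                     { return encode_addi(0, 0, 0); }
```

With these helpers, encode_li(11, 42) and encode_addi(11, 0, 42) both produce 0x02A00593, and encode_nop() produces 0x00000013.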

Why doesn’t RISC-V implement these pseudo-instructions as real, distinct instructions? Keeping the number of instructions small simplifies the hardware, especially the decode stage, which helps make it smaller, faster, and more efficient.

Functions in Assembly

With branching control flow, we can accomplish a lot in RISC-V assembly. We can “fake” if statements, for loops, and so on. But one thing we can’t do yet is call functions. That’s what this lecture is about.

Here’s an example C program we can work with:


int mult(int mcand, int mlier) {
    int product = 0;
    while (mlier > 0) {
        product = product + mcand;
        mlier = mlier - 1;
    }
    return product;
}

int main() {
    int i, j, k, m;
    // ...
    i = mult(i, k);
    m = mult(i, i);
    // ...
}

You already know how to implement the body of the mult function in RISC-V. But nothing we’ve done so far will let us call that code multiple times with different arguments, as main does in this example.

Calling a function is a multi-step process, and it requires collaboration between both the caller code and the callee code (the function being called). At a high level, every function call needs to follow these steps:

The caller puts arguments in a place where the callee function can access them.
The caller transfers control to the callee (i.e., it jumps to the first instruction in the function).
The function creates a stack frame to hold its own local variables.
The function actually does stuff: i.e., the function body.
The function puts the return value in a place where the caller can access it. It also restores any registers it used to the state the caller expects. And finally, it releases the stack frame that holds its local variables.
The callee returns control to the caller (i.e., jumps to the next instruction in the caller right after the function call).

The caller and callee need to agree on all the details for how this multi-step process works. For example, they must agree on which registers hold the arguments and which registers hold the return value. A standardized protocol for how to implement all these details is called a calling convention. The RISC-V ISA itself defines a particular calling convention, which we will learn about in this lecture. C compilers that generate RISC-V code also use the same calling convention to implement function definitions and function calls—and because it’s standardized, even functions compiled by different C compilers can call each other.

The RISC-V Calling Convention

We’ll break down the components next, but here are the most important parts of the RISC-V calling convention:

Arguments go in registers a0 through a7 (a.k.a. x10 through x17). (In fact, that is why these registers have an alternative name starting with an “a”! It’s for argument.)
Return values also go in registers a0 and a1. (Yes, this means that functions overwrite their arguments with their return values before they return.)
Register ra (a.k.a. x1) holds the return address: the address of the next instruction to run after the function call finishes.
Registers s0 through s11 (a.k.a. x8, x9, and x18 through x27) are callee-saved registers. This means that callers can safely expect that, after they make a call and the call returns, these registers will hold the same values they had before the call. A callee that wants to use one of them must save its old value first and restore it before returning.
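The two sets of register names can be easy to mix up, so here is a small C sketch of the mapping described in the list above. The function name abi_to_xreg is hypothetical, and it only covers the registers mentioned in this section (ra, sp, a0 through a7, and s0 through s11).

```c
#include <assert.h>
#include <string.h>

/* Return the x-register number for an ABI name such as "a0" or "s11",
   or -1 if the name is not one of the registers covered here. */
int abi_to_xreg(const char *name) {
    assert(name != NULL);
    if (strcmp(name, "ra") == 0) return 1;
    if (strcmp(name, "sp") == 0) return 2;
    char kind = name[0];
    if (kind != 'a' && kind != 's') return -1;

    /* Parse the one- or two-digit index after the letter. */
    int n = 0, digits = 0;
    for (const char *p = name + 1; *p != '\0'; p++) {
        if (*p < '0' || *p > '9' || ++digits > 2) return -1;
        n = n * 10 + (*p - '0');
    }
    if (digits == 0) return -1;

    if (kind == 'a') return n <= 7 ? 10 + n : -1;  /* a0..a7 are x10..x17 */
    if (n <= 1) return 8 + n;                      /* s0, s1 are x8, x9 */
    return n <= 11 ? 16 + n : -1;                  /* s2..s11 are x18..x27 */
}
```

Note the gap in the s registers: s1 is x9, but s2 is x18, because a0 through a7 sit in between.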

Control Flow for Call and Return

Let’s start with the basic mechanism for transferring control: jumping from the caller to the callee and then back. The interesting thing is that the branch instructions we’ve seen so far, such as beq, won’t suffice. The problem is that functions, by their very nature, can be called from multiple locations. Like in our example above:


i = mult(i, k);
m = mult(i, i);

Imagine that we implemented both of these calls with a plain unconditional jump, j. Then the calls might look like this:


mv a0, <register containing i>
mv a1, <register containing k>
j mult
mv <register containing i>, a0

mv a0, <register containing i>
mv a1, <register containing i>
j mult
mv <register containing m>, a0

All those mv instructions would take care of setting up the argument registers and consuming the return-value register. We imagine here that mult is an assembly-language label that points to the start of the mult function’s instructions.

There’s a problem. In the implementation of the mult function, how do we know where to jump back to? After each call is done, we need to transfer control to the next instruction after the jump. Even if we put labels on those instructions, mult is implemented by a single block of instructions, so it would have to end with a single j <label> to return. That one jump would somehow need to pick a different label for each call, which is impossible!

The solution is to designate a register to hold the return address for the call. Instead of just using j to call a function, we’ll do two things:

Record the next instruction’s address as the return address, in register ra.
Jump to the first instruction of the called function.

Then, to return, the function just needs to jump to the instruction address in register ra. Regardless of who called the function, doing this will suffice to transfer control to the point right after the call.

RISC-V has instructions to support these strategies: both the call and the return. For the call, you use the jal instruction (the mnemonic stands for jump and link):


jal rd, label

The jal instruction does the two things we need for a call:

Put the address of the next instruction after the jal into register rd.
Unconditionally jump to label.

So our function calls will generally look like jal ra, <function label>. Then, to return from a function, we’ll use the jr instruction (the mnemonic means jump register):


jr rs1

The jr instruction unconditionally jumps to the address stored in the register rs1. So function returns generally look like jr ra.

In fact, this pattern is so common that RISC-V has pseudo-instructions for function calls and returns:

jal label: short for jal ra, label
call label: like the above, but with an extra auipc instruction so it supports larger PC offsets
ret: short for jr ra

(Going one level deeper, it turns out that jr rs1 is itself a pseudo-instruction that is short for jalr x0, 0(rs1). But that’s not really important for learning about function calls.)
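To make the call-and-return mechanism concrete, here is a toy C model of what jal and jr do to the program counter and the register file. This is only a sketch under some assumptions: the Machine struct and the resume_address helper are invented for illustration, jal takes an absolute target address rather than a PC-relative offset, and every instruction is assumed to be 4 bytes long.

```c
#include <assert.h>
#include <stdint.h>

enum { RA = 1 };  /* ra is x1 */

typedef struct {
    uint64_t pc;
    uint64_t x[32];
} Machine;

/* jal rd, target: save the address of the following instruction in rd,
   then jump to target. */
void jal(Machine *m, unsigned rd, uint64_t target) {
    assert(rd < 32);
    if (rd != 0) m->x[rd] = m->pc + 4;  /* writes to x0 are discarded */
    m->pc = target;
}

/* jr rs1: jump to the address held in rs1. */
void jr(Machine *m, unsigned rs1) {
    assert(rs1 < 32);
    m->pc = m->x[rs1];
}

/* Call a function at `callee` from a jal located at `call_site`, have the
   callee return immediately, and report where execution resumes. */
uint64_t resume_address(uint64_t call_site, uint64_t callee) {
    Machine m = { .pc = call_site };
    jal(&m, RA, callee);  /* the call: jal ra, callee */
    jr(&m, RA);           /* the return: jr ra */
    return m.pc;
}
```

Two calls to the same callee from different (made-up) addresses, say resume_address(0x1008, 0x2000) and resume_address(0x1018, 0x2000), resume at 0x100C and 0x101C respectively. The callee runs the same jr ra both times; only the value in ra differs.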

Managing the Stack

Beyond just jumping around, functions also have another important responsibility: they need to keep track of their local variables. As you already know, local variables go in stack frames on the call stack. You also know that the stack is a region in memory that grows downward (from higher memory addresses to lower ones) when we call functions, and it shrinks when function calls return. This section is about the bookkeeping that functions must do to create and use their stack frames.

The central idea is that we must use a register to keep track of the address of our current stack frame. According to the RISC-V calling convention, register sp (a.k.a. x2) contains the address of the bottom (the smallest address) of the current stack frame. Code interacts with sp in three main ways:

At the beginning of the function, it will move sp downward to make space for its own stack frame. Remember, this stack frame will contain the function’s local variables.
During the execution of the function, it will use (positive) offsets on sp to locate each of its local variables. So you’ll see stuff like ld a7, 16(sp) and sd a6, 40(sp) to load and store local variables using offsets from sp.
At the end of the function, before it returns, it will move sp back up to wherever it used to be, “destroying” its stack frame. No memory literally gets destroyed, of course, but adjusting sp back to its pre-call value indicates that we’re done using all our local variables, and it lets the caller locate its own stack frame.

This means that functions usually look like this:


func_label:
  addi sp, sp, -8
  ...
  addi sp, sp, 8
  ret

The addi instructions at the top and bottom of the function “create” and “destroy” the stack frame. The function’s code must know how big its stack frame needs to be: in this case, it’s 8 bytes, so we move the stack pointer down by 8 bytes at the beginning and back up by the same 8 bytes at the end. The stack frame size needs to be big enough to contain the function’s local variables; C compilers compute this stack-frame size for you, roughly by adding up the sizes of all the local variables you declare.
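The bookkeeping above can be sketched as a toy model in C. Here the stack is a 64-byte array and sp is an index into it; the Stack struct and the helpers push_frame, pop_frame, store_local, and load_local are hypothetical names that stand in for addi sp, sp, -size, addi sp, sp, size, sd, and ld respectively.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { STACK_BYTES = 64 };

typedef struct {
    uint8_t mem[STACK_BYTES];  /* toy stack memory, addresses 0..63 */
    uint64_t sp;               /* address of the bottom of the current frame */
} Stack;

/* addi sp, sp, -size: grow the stack downward to create a frame. */
void push_frame(Stack *s, uint64_t size) {
    assert(size <= s->sp);
    s->sp -= size;
}

/* addi sp, sp, size: move sp back up, releasing the frame. */
void pop_frame(Stack *s, uint64_t size) {
    assert(s->sp + size <= STACK_BYTES);
    s->sp += size;
}

/* sd value, offset(sp): store an 8-byte local at a positive offset. */
void store_local(Stack *s, uint64_t offset, int64_t value) {
    assert(s->sp + offset + sizeof value <= STACK_BYTES);
    memcpy(&s->mem[s->sp + offset], &value, sizeof value);
}

/* ld rd, offset(sp): load an 8-byte local back from the frame. */
int64_t load_local(const Stack *s, uint64_t offset) {
    int64_t value;
    assert(s->sp + offset + sizeof value <= STACK_BYTES);
    memcpy(&value, &s->mem[s->sp + offset], sizeof value);
    return value;
}
```

Starting from sp equal to STACK_BYTES, a caller that pushes an 8-byte frame and stores a local at offset 0 will find that value unchanged after a callee pushes, uses, and pops its own frame, because the callee's frame sits at lower addresses and sp is moved back before returning.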

Notes TK:

a more complete example of a leaf function
saving & restoring ra & sp
caller-/callee-saved registers
an even more complete example of a function with a call
recursive functions

CS 3410