CPU Simulation

Instructions: Remember, all assignments in CS 3410 are individual. You must submit work that is 100% your own. Remember to ask for help from the CS 3410 staff in office hours or on Ed! If you discuss the assignment with anyone else, be careful not to share your actual work, and include an acknowledgment of the discussion in a collaboration.txt file along with your submission.

The assignment is due via Gradescope at 11:59pm on the due date indicated on the schedule.

Submission Requirements
Restrictions
Provided Files
Submission
Overview
Assignment Outline
Implementation
Submission
Rubric

Submission Requirements

For this assignment, there are two files to submit:

logic.c with the listed functions all implemented.
tests.txt with all of your test cases in the correct format.

Restrictions

You may not use any additional include directives.
Please only change logic.c. Changes to other files will not be reflected in your submission.

Provided Files

The following files are provided in the release:

logic.c, which includes the five functions you will implement in this assignment.
runner.c, which handles I/O and the structure of the simulator.
hash_table.c, an implementation of a simple hash table.
hash_table.h, which is the above’s associated header file.
sol.h, which includes the signatures of the functions in logic.c and hash_table.c, as well as useful define macros and variable declarations.
Makefile, which will appropriately compile and link the above to produce an executable, runner.
check.s, a simple assembly program to be used as a sanity check.
check.bin, the input to the program, which is the result of assembling check.s.

The only file among these that you will modify is logic.c.

Submission

As always, you will find the starter in your personal repository on GitHub. You will submit your solution on Gradescope.

Overview

In this assignment, you will implement a subset of the RISC-V 64 instruction set. In order to gain a better understanding of control logic, processor architecture, and how assembly language functions, you will simulate the steps—Fetch, Decode, eXecute, Memory, Writeback—of a simple single-cycle processor. You can read more about these steps in the lecture notes.

The program takes assembled RISC-V machine code into standard input. We handle the I/O and break the instructions down into an array of uint32_t values, named instructions. instructions[0] has the 32-bit encoding for the first instruction, and generally, instructions[PC / 4] has the 32-bit encoding for the PC / 4 + 1st instruction (i.e., the instruction at address PC in the input file). The instruction encodings follow the standard that is specified in the RISC-V ISA manual.

After the instructions are fed into the program, while the program counter (divided by 4) is less than the static instruction count, it will continuously, in order, call the functions fetch(), decode(), execute(), memory(), and writeback().

Each of these 5 functions passes information to the next stage. fetch() will pass the current instruction to decode(), which will pass relevant information to execute(), which will pass other information to memory(), which will pass more information to writeback(), which will update the registers and the program counter. The relevant information is stored in a struct called info, which has 4 integers. It is up to you to decide exactly what information to store in the info struct, and not every stage will need all the bits.

The info struct is meant as a container for arbitrary bits. There is no single correct way to use its fields to represent the relevant state. You will use the info struct in entirely different ways for each of the four stage → stage communication steps.

The 32 general-purpose registers are simulated as an array of 32 uint64_ts. The starter code initializes all of these to 0.

Memory is simulated as a hash table, data, that maps from uint64_t to uint64_t. The keys are addresses, and the values are the data stored in memory. We suggest mapping an address to one byte of data, but an alternative such as mapping addresses to four or eight bytes is also acceptable.

An implementation of a hash table is provided in hash_table.c and hash_table.h. All key (address) → value (data) mappings are effectively initialized to 0, as the ht_get() function returns 0 when the key is not found.

Use the little-endian byte order for your simulated memory. For example, when storing an 8-byte value to address a, store the least-significant byte at a and the most-significant byte at address a+7.

Assignment Outline

Work out a high-level plan and implement addi and andi, detailed in Task 0.
Implement the rest of the instruction subset, detailed in Task 1.
Create a thorough test suite that you will submit, specified in Task 2.

Implementation

The release code is found on GitHub.

Task 0: Getting Started in Lab

Task 0.0: Design Plan

As stated in the overview, one of the goals of the assignment is to familiarize yourself with the important steps in a simple five-stage processor. The figure below may be used as reference.

Processor diagram

The five stages of the processor that you simulate are:

Fetch an instruction from main memory. The PC (Program Counter) holds the address of the instruction to fetch.
Decode the instruction into relevant parts that the processor understands, and read the requested register(s). Things to consider: What info is important to extract from an instruction? How should we generate the correct immediate value from the bits in the instruction? How do we single out bits that differentiate instructions—what makes lw different from sw or from sb?
Execute the instruction to determine its result value.
Access memory (here simulated as a hash table) to load/store data, if necessary. Things to consider: How should the stage differentiate bytes vs. words vs. double words? When should this stage sign-extend or zero-extend values when loading and storing?
Write back a new value for the PC, which should—except in the case of a branch—increment by 4 after every cycle, since each instruction is expressed with 4 bytes. Also, write back a newly computed value to the register file, if necessary. Things to consider: When should we write to the register file at all? What should we increment the PC by?

Create a high-level plan for what each function should do and what information it should pass to the next stage. For example, the Memory stage is the only one that accesses memory, and the Decode stage will be the only one that deals with bit-level slicing of the actual instruction word.

While it would certainly be possible to simulate everything in one function, implementations that are not faithful to the purpose of each stage will incur penalties.

Task 0.1: `addi` and `andi`

Now that you have a plan, let’s walk through two instructions.

addi rd, rs1, imm is implemented as:
Registers[rd] = Registers[rs1] + Sign-extend(imm)
andi rd, rs1, imm is implemented as:
Registers[rd] = Registers[rs1] & Sign-extend(imm)

Consult the RISC-V reference card to see the encodings for these instructions. Both addi and andi are I-type instructions, and thus have this encoding:

31 – 20	19 – 15	14 – 12	11 – 7	6 – 0
imm[11:0]	rs1	funct3	rd	opcode

The reference also tells us the values of the opcode and funct3 fields:

Instruction	opcode	funct3
`addi`	`0010011`	`000`
`andi`	`0010011`	`111`

The fetch stage will get the instruction at index PC / 4. Then, for addi and andi instructions, the argument to the decode stage will be a uint32_t whose binary is of one of the two following forms:

0b[XXXXXXXXXXXX][XXXXX][000][XXXXX][0010011]
0b[XXXXXXXXXXXX][XXXXX][111][XXXXX][0010011]

Using bitwise operators, differentiate between the two functions and extract the relevant pieces of information to send to the execute stage.

Hint: Consider using one of the integers in info to communicate which instruction it is. We provide a mapping from instructions to integers via the #define macros in sol.h.

Now, in execute, we will use the operands to compute the result. Since neither addi nor andi should use the memory stage, think about what information the writeback stage will need, and send this to memory, which will be a no-op.

After using the memory stage to send the information from execute to writeback, consider how your writeback stage should update the state of the program to prepare it for the next instruction.

Trying It Out

To test your implementation, we can write a simple assembly program, prog.s, using addi and andi. It could look something like this:

addi ra,zero,0x155
andi sp,ra,0x1b9

In order to obtain the binary to be used as standard input, run either of the two following equivalent commands that assemble prog.s to machine code and copy its contents as raw binary to prog.bin:

Option 1: asbin prog.s
Option 2: as prog.s -o tmp.o && objcopy tmp.o -O binary prog.bin && rm tmp.o

(Option 1 works because we have provided, in the CS 3410 container, a shorthand script asbin that just runs the commands in Option 2.)

Compile your simulator with make, producing an executable named runner. Now you can run the program with prog.bin as standard input with:

qemu runner < prog.bin

Upon successful execution of runner, the values of the 32 general purpose registers will be printed in hexadecimal.

Testing Routine

To summarize, here are the commands to run if you want to execute your simulator on an assembly program:

$ rv make
$ rv asbin your_great_test_program.s
$ rv qemu runner < your_great_test_program.bin

As always, you can use the rv alias to run commands in the official CS 3410 container.

Task 1: Simulating a RISC-V CPU

Now that you have addi and andi working, implement the remainder of the RISC-V 64 subset listed in the table:

Format	Instructions
R-type	ADD, SUB, AND, SLT, SLL, SRA
I-type	ADDI, ANDI, LD, LW, LB
S-type	SD, SW, SB
U-type	LUI
B-type	BEQ

In the official RISC-V ISA manual, these instructions are part of the RV64I Base Integer Instruction Set, a superset of RV32I (Chapters 2 and 4). A table with the encodings is in Chapter 19. You can also use the shorter reference card, which only includes the RV32 instructions, or use the extended handout (recommended) for RV64 instructions.

For the purposes of testing, command line arguments of the form <register number>@<hexadecimal value> set the starting values of individual registers. For example, to set the initial value of register 5 to 0xbeefdeadbeef and the initial value of register 12 to 0xc, the command would be

qemu runner 5@0xbeefdeadbeef 12@0xc < prog.bin

In the release files, we provide a basic test, check.s, and the output of asbin check.s, check.bin. This is also the sanity check that the autograder will run upon submission.

Behavior of BEQ

The RISC-V assembler lets you write beq instructions in two different ways: with labels or write immediate addresses. Because of an assembler quirk, we recommend that you only use labels.

Here’s some more detail. The assembler will convert an instruction of the form beq rs1, rs2, z where z is an immediate address into a sequence of two instructions: bne followed by a jal. This behavior allows assembly programmers to use beq as a pseudoinstruction for jumps beyond what can be done in one actual machine beq instruction. (The addresses of the instructions are not known until linking, so the assembler does not know if the immediate in the beq instruction is within range). We do not expect you to implement bne or jal in this assignment, so we need to write assembly programs to avoid this “convenient” behavior.

Instead, to ensure that the assembler encodes an actual beq instruction, we can use labels with optional offsets. Write your beq instructions in one of these forms:

beq rs1, rs2, L1 where L1 is a label at the instruction you want to jump to.
beq rs1, rs2, start + imm where start is a label at the very start of the program and imm is the offset (in bytes) of the instruction you want to jump to.

The two following assembly programs, for example, are equivalent and use beq in the correct manner:

Option 1	Option 2
`addi t0,zero,1 addi t1,zero,2 equal: addi t0,t0,2 addi t1,t1,1 beq t0,t1,equal add t2,t1,t0`	`start: addi t0,zero,1 addi t1,zero,2 addi t0,t0,2 addi t1,t1,1 beq t0,t1,start + 8 add t2,t1,t0`

The label at equal points to the same location as an offset of 8 bytes (2 instructions) from a label at start.

Task 2: Test Case Submission

Even with this reduced subset of the RISC-V 64 instruction set, there is still plenty of complicated behavior. We suggest writing many test cases to ensure the correctness of your program.

In addition to your implementation in logic.c, you will submit a test suite in tests.txt.

Each test should begin with a line for the additional command-line arguments: CMDS: <arg_0> ... <arg_n>, followed by the assembly for the test case. The last line should have the non-zero outputs in the same format as the command-line arguments: OUTS: <out_0> ... <out_n>.

For example, the following adheres to this format:

CMDS:
addi ra,zero,0x155
andi sp,ra,0x1b9
OUTS: 1@0x155 2@0x111

CMDS: 8@0xbeef 2@0xbee 9@0xef
addi x8,  x8, 9
add x1, x8, x9
add x1, x1, x2
OUTS: 1@0xcbd5 2@0xbee 9@0xef 8@0xbef8

Your tests should cover both basic and edge cases for all of the required instructions. You should have at least 15.

Submission

On Gradescope, submit logic.c and tests.txt.

Rubric

logic.c: 75 points
tests.txt: 25 points