Programming Assignment 4: Cache Simulator
When you designed your MIPS processors, it was assumed that the CPU
accessed memory directly, and that every lw
or sw
instruction resulted in an actual memory read or write with the RAM
component. In reality, accessing memory is a fairly slow operation, and one
or more levels of high-speed cache are used to improve performace. In this
project you will implement a simple one-level cache for a MIPS simulator we
will provide. The purpose is not to make the simulator run faster (in fact,
it will run slower due to the increased overhead of simulating the cache),
but to gain insight into how caching affects the performance of a real
processor.
As a starting point, download the cache
framework. After unpacking the archive, it is ready to build
with the command make
. This produces the program
simulate
; running it without arguments will print a
description of its usage. You can run MIPS programs with the simulator,
and it will print out information about the cache performance of that
execution.
All of the cache simulation code is in cache.c
, which
initially contains just a skeleton of code that allows the simulator to
compile and run and does no simulation. Your task is to fill in the bodies
of all functions in
cache.c
to correctly simulate a cache. You will not need to
(and should not) change any of the code in the other files making up the
simulator; you will submit only your cache.c
. Also note that
the simulator already performs all of the memory operations; because this
project is about statistics gathering (and because within the simulator, the
memory and "cache" are the same speed anyway) your cache simulator will not
actually store cached data, just the overhead bits and counts of reads,
writes, hits, etc.
You will simulate an arbitrary n-way set-associative cache. Your initialization routine will be passed two values: the total number of blocks in the cache, and the set associativity (number of slots in each set). Assume each block is four words, and that the number of sets (i.e., the number of blocks divided by the number of slots in each set) will be an integer power of two (and therefore be representable in a whole number of bits).
Note that a 1-way set associative cache is equivalent to a direct-mapped cache. Also, an N-way set-associative cache containing N total blocks is equivalent to a fully-associative cache. Your simulation will have to correctly handle all valid combinations of the input parameters, including these extremes.
When you receive an address, break it into three parts: the lowest-order bits become the offset within the block, the next bits form the set identifier deciding where the block will be cached, and the remaining upper bits are the block's tag. You must compute the sizes of the respective parts of the address.
When you need to remove elements from the cache, use a least recently used (LRU) eviction policy to determine the appropriate block to evict. Don't worry about implementing a cleverly efficient approximation; just keep track of the order in which the blocks have been accessed and always evict the least-recently-used.
When writing, implement a write-allocate policy, but do not incude cache misses incurred by writes in the miss/hit rate statistics.
How to Get Started
In case you have not previously compiled C programs on Linux, you will find the following commands helpful.
- Extract the project from the tarball:
tar -xvzf pa4.tar.gz
Or you can copy the files out of
/usr/local/cs316/pa4
:
cp -R /usr/local/cs316/pa4/ ~
- Compile the project or test programs:
make
- Delete the executable and object files, leaving your source code intact
(usually not necessary;
make
will selectively rebuild the portions of your project that have changed):
make clobber
- Run the project:
./simulate <filename>
What to Submit
- Your
cache.c
file - A C source file named
direct_v_4way.c
containing a program that works poorly (<1% hits) on a direct-mapped cache but works well (>99% hits) on a 4-way set-associative cache. - A C source file named
2way_v_4way.c
containing a program that works poorly (<1% hits) on a 2-way set-associative cache, but works well (>99% hits) on a 4-way set-associative cache.
direct_v_4way.c
and
2way_v_4way.c
mentioned above, assume all caches contain 256
blocks. Remember to comment your code in all three files.
Help and Hints
Ask the TAs and consultants for help. You can contact us through the course staff mailing list or the class newsgroup. We expect to see most students in office hours during the course of the project. Extra hours will be scheduled as needed.
If you suspect a bug in the cache framework, ask the course staff for help.