When you designed your MIPS processors, it was assumed that the CPU
accessed memory directly, and that every lw
or sw
instruction resulted in an actual memory read or write with the RAM
component. In reality, accessing memory is a fairly slow operation, and one
or more levels of high-speed cache are used to improve performace. In this
project you will implement a simple one-level cache for a MIPS simulator we
will provide. The purpose is not to make the simulator run faster (in fact,
it will run slower due to the increased overhead of simulating the cache),
but to gain insight into how caching affects the performance of a real
processor.
As a starting point, download the cache
framework. After unpacking the archive, it is ready to build
with the command make
. This produces the program
simulate
; running it without arguments will print a
description of its usage. You can run MIPS programs with the simulator,
and it will print out information about the cache performance of that
execution.
All of the cache simulation code is in cache.c
, which
initially contains just a skeleton of code that allows the simulator to
compile and run and does no simulation. Your task is to fill in the bodies
of all functions in
cache.c
to correctly simulate a cache. You will not need to
(and should not) change any of the code in the other files making up the
simulator; you will submit only your cache.c
. Also note that
the simulator already performs all of the memory operations; because this
project is about statistics gathering (and because within the simulator, the
memory and "cache" are the same speed anyway) your cache simulator will not
actually store cached data, just the overhead bits and counts of reads,
writes, hits, etc.
You will simulate an arbitrary n-way set-associative cache. Your initialization routine will be passed three values: the number of words per block, the total number of blocks in the cache, and the set associativity (number of slots in each set). Assume the number of sets (i.e., the number of blocks divided by the number of slots in each set) will be an integer power of two (and therefore be representable in a whole number of bits). Also, assume the number of words per block is a power of two.
Note that a 1-way set associative cache is equivalent to a direct-mapped cache. Also, an N-way set-associative cache containing N total blocks is equivalent to a fully-associative cache. Your simulation will have to correctly handle all valid combinations of the input parameters, including these extremes.
When you receive an address, break it into three parts: the lowest-order bits become the offset within the block, the next bits form the set identifier deciding where the block will be cached, and the remaining upper bits are the block's tag. You must compute the sizes of the respective parts of the address.
When you need to remove elements from the cache, use a least recently used (LRU) eviction policy to determine the appropriate block to evict. Don't worry about implementing a cleverly efficient approximation; just keep track of the order in which the blocks have been accessed and always evict the least-recently-used.
When writing, implement a write-allocate policy. Do not incude cache misses incurred by writes in the hits, misses, and miss/hit rate statistics.
Some Hints
- Remember that memory is byte addressed; i.e., a memory address corresponds to a single byte in memory.
- In cache.c you should not be #including any files. In particular, you should not use math.h. Remember that you will be taking logarithms of perfect powers of 2.
- To help you debug your code here approximate ranges for the hit and
miss counts of the solution on the test programs supplied with the
assignment. These counts are for the default cache size and set
associativity.
Hits
Misses
test1
<5
<5
test2
30-40
<5
test3
8900-9100
<5
test4
20000-22000
1900-2100
test5
100-150
<5
test6
14-15 million
1-2 million
How to Get Started
In case you have not previously compiled C programs on Linux, you will
find the following commands helpful.
- Extract the project from the tarball:
tar -xvzf cache.tar.gz
- Compile the project or test programs:
make
- Delete the executable and object files, leaving your source code intact
(usually not necessary;
make
will selectively rebuild the portions of your project that have changed):
make clobber
- Run the project:
./simulate <filename>
What to Submit
- Your
cache.c
file - A C source file named
direct_v_4way.c
containing a program that works poorly (<1% hits) on a direct-mapped cache but works well (>99% hits) on a 4-way set-associative cache. - A C source file named
4way_v_8way.c
containing a program that works poorly (<1% hits) on a 4-way set-associative cache, but works well (>99% hits) on a 8-way set-associative cache.
Help
Ask the TAs and consultants for help. You can contact us through the course staff mailing list or the class newsgroup. We expect to see most students in office hours during the course of the project. Extra hours will be scheduled as needed.
If you suspect a bug in the cache framework, ask the course staff for help.