CS3410 Fall 2016
Due: 11:59pm, Tuesday, November 22, 2016
Reminder: you must work in a group of two for this project. You don't need to have the same partner as the earlier projects. You should keep the same partner for this and Project 8, however.
In this assignment, you will read over the specs of a dynamic memory allocation library that supports the allocation, freeing, and resizing of blocks of memory. You will write a suite of tests that can be used to test an implementation of a dynamic memory allocation library and determine if it was implemenated to spec, correctly, and robustly. You will need to put your problem-solving skills to work!
You can download the files for this project here.
This section describes the specifications of the dynamic memory allocation library you will write for Project 8. You are not writing it now! Instead, you will write a series of tests that check to see whether an implementation of this library was done correctly. For your enjoyment, we have prepared approximately 13 broken libraries that do not meet the specs in some way. Your task is to write tests that accurately flag these broken versions as broken while at the same time accurately flagging a correct implementation as correct.
By now you are familiar with static vs dynamic allocation. Statically allocated memory is allocated on the stack and exists only for the life of the function in which it was allocated; dynamically allocated memory is allocated on the heap and exists from the moment it is allocated until it is explicitly freed. Support for dynamic memory allocation is part of the Standard C Library. For Project 8, however, you will write your own versions of these functions.
A dynamic memory allocator maintains the memory associated with the
heap. The most basic task will be to keep track of which bytes of the
heap are currently in use (i.e., are allocated) and which bytes
are not in use (i.e., free). On real systems, the heap can grow
with the needs of a process. For this assignment, however, we will
consider the heap to be a fixed-size block of memory. The
functions you implement will take a void *heapptr
argument that points to the beginning of the heap that you must
manage. (Of course the heap here being pointed to
by heapptr
is a simulated version of the actual heap that
the C Standard Library manages.) Your dynamic memory allocator will
consist of the following four functions, which are declared
in heaplib.h
and will be defined (by you)
in heaplib.c
:
int hl_init(void *heapptr, unsigned int heap_size)
Sets up a new heap beginning at
heapptr
and of sizeheap_size
(in bytes). Note that yourhl_init
function does not actually need to allocate or create the heap. It is given a block of memory of the correct size, starting at the location pointed to byheapptr
. (When it comes time for you to test your code, the test cases we provide will show you a few ways in which this block of memory can be created.) Thehl_init
function should be called once before there are any calls tohl_alloc
,hl_release
, orhl_resize
.Your memory allocator must live entirely inside the heap it is managing. This means that all meta-data (pointers, size fields, flags indicating free or in-use) must all live in the heap itself. You may not use any global arrays, variables, or structures. A good test of this is whether your implementation supports multiple
heapptr
s. Your code should naturally support multiple heaps.Do not assume that
heapptr
is 8-byte aligned.If
heap_size
is not large enough to hold the meta-data required to manage the heap (as you have designed it), setup fails, and the function returns 0. Returns non-zero if successful.
void *hl_alloc(void *heapptr, unsigned int block_size)
Allocates a block of memory of size
block_size
bytes from the heap pointed to byheapptr
.Returns a pointer to the block of memory on success; returns 0 if the allocator cannot satisfy the request.
Blocks of memory can be of size 0. The returned address should be aligned at multiples of 8 bytes. The function is allowed to allocate more than the requested size, but never less. The memory "allocated" by this function does not need to be zeroed out.
void hl_release(void *heapptr, void *blockptr)
Frees the block of memory pointed to by
blockptr
(that currently resides in the heap pointed to byheapptr
). Ifblockptr
has the value 0, the function should do nothing. (If it has some non-zero value that was never the return value of a call tohl_alloc
orhl_resize
, the program behavior is undefined. You need not fail gracefully.) For utlization purposes, released memory should be able to be allocated again by subsequent calls tohl_alloc
.The memory released by this function does not need to be zeroed out.
void *hl_resize(void *heapptr, void *blockptr, unsigned int new_block_size)
changes the size of the memory block pointed to byblockptr
(that currently resides in the heap pointed to byheapptr
) from its current size to sizenew_block_size
bytes, returning a pointer to the new block, or 0 if the request cannot be satisfied. the contents of the block should be preserved. if the location of the block changes (because it is not possible to make the block pointed to byblockptr
larger and a new, larger block elsewhere needs to be used), you can copy the contents by usingmemcpy
or castblockptr
to achar *
and copy the contents byte by byte.The new block size might be smaller than the current size of the block. As it was for
hl_alloc
, the return value should be 8-byte aligned. Ifblockptr
has the value 0, the function should behave likehl_alloc
.
Note:
Your functions hl_alloc()
, hl_release()
,
and hl_resize()
correspond to the actual C Standard
Library functions malloc()
, free()
,
and realloc()
. Nowhere in heaplib.c
should you call these three functions.
You have been given the following files as part of this assignment (the ones in bold are the files you will modify and submit):
heaplame.c
-- a very bad implementation. this is
the first file you should look at. fixing one aspect of this file is
your task for Lab 12.heaplib.h
-- contains the interfaces of the
functions you are to test in Project 7 and implement in Project 8.heaplib.c
-- this is the file you will complete for
Project 8. functions you are to test in Project 7 and implement in Project 8.ctests.c
-- a single test file with room for 13 tests.
this is the file you submit for Project 7test_correctness.c
-- the code that calls your ctestsMakefile
-- compiles 3 versions of 4 executables (see below)The Makefile
you have been given will compile the
executable: test_correctness
. The heart of this assignment is your
implementation of the functions declared in heaplib.h
. You
are implementing a library, which does not contain
a main
. The ctests.c
file uses your library; with its
own main
in test_correctness.c
. The test file is not complete
(part of the assignment will be for you to complete it!), but it is a
good start.
How you compile your code will determine whether your tests use the
implementation of heaplib.h
in heaplame.c
or heaplib.c
. Since you
will spend much more time working with heaplib.c
, this is
the default. If you type make help
you will see a
slightly longer version of the following message:
Your custom heaplib.c implementation targets: make : heaplib.c make prof : heaplib.c with profiling symbols make debug : heaplib.c with debugging symbols make print-debug : heaplib.c with debugging symbols and PRINT_DEBUG The lame implementation targets: make lame : heaplame.c make lame-prof : heaplame.c with profiling symbols make lame-debug : heaplame.c with debugging symbols make lame-print-debug : heaplame.c with debugging symbols and PRINT_DEBUG ...
make
or make
lame
produces "production level" versions of all four
executables. This is how we will compile your code when we test it.make debug
or make lame-debug
) is an executable that contains all
the symbols (variable and function names, line numbers, etc.) required
to run your code in gdb.make print-debug
or make lame-print-debug
) is an executable that has both
debugging symbols and has defined the PRINT_DEBUG macro that will
enable all the debugging print statements in your code.Go ahead and peek at the Makefile. It doesn't bite and is fantastically documented. (Look, but don't touch!)
We have provided you with a lame implementation of the assignment
in heaplame.c
.
It begins by defining a structure called block_header_t
as
follows:
typedef struct _block_header_t { unsigned int max_size; unsigned int curr_size; bool in_use_f; } block_header_t ;
This implementation turns the entire heap into a single block that can
be allocated just once. The beginning of the heap is reserved as a
header to this block and maintains information about the block -
there are three fields in this header. The first field holds the size
of the heap. This field will be set by hl_init
and never
changed again. The second field indicates how many bytes the current
block is sized to. The third field states whether the lone block is
in use.
Let's step through this lame implementation of some of the
functions. Figures 1 and 2 show the state of memory before and after a
call to hl_init
with a heap of size 32 beginning at
address 0x100:
Figure 3 shows what a subsequent call to hl_alloc
with
a block size of 4 bytes would change the heap:
Figure 4 shows that the allocated block has been filled with data 'abcd'. Figure 5 shows how a call to resize would change the heap:
These diagrams are not intended to show you the correct behavior of the functions you should implement. Instead they show you an accurate depiction of what is currently implemented so that you may fully understand the code you have been given. Notice, for example, that the current implementation does not allow you to allocate the rest of the heap for new blocks. Furthermore, it doesn't even check if the requested block can even fit in the heap. This is a huge flaw in this approach from a utilization standpoint.
Known Correctness Bugs. In addition to being really unfriendly (allowing only one block to be allocated!), the lame solution
in heaplame.c
is incorrect in several respects:
heapptr
is assumed to be 8-byte aligned
.... and possibly more???
Although the implementation is woefully inadequate as a solution to the assignment, it provides you important examples of heap management, casting, pointer arithmetic, and debugging.
hl_init
and hl_alloc
. Notice that
casting the heapptr
to a block_header_t *
is
a straight forward and very readable way to set and access the fields
of the otherwise unspecified heap. (Aside: Casting can be a very
helpful way to peek at memory. Want to know whether a machine is
little or big endian? Write a number into an integer, create a pointer
to the integer, cast the pointer to a char *
and
dereference it. Now you're looking at the byte of the integer in
memory at the lowest address!)print_block_header
function. Although this function may be
incredibly useful to a programmer when implementing and debugging,
this is not the kind of information that should be printed to the
screen when some program includes and makes calls to
your heaplib.c
library. Novice programmers notoriously
litter their code with print statements that are commented in and out
as they code. This is not good form, especially since some straggling
printf
's almost always remain in the final version of the
code. This is particularly devastating when performance matters (which
it does for this assignment).
One solution to this problem is to create a variable such
as debug_mode_f
(_f
often indicates a
flag in C) and put the print statement inside an if
statement
that checks the variable:
if (print_debug_f) print_heap(heap);
This solution has the nice property that (if tied to a command line
option) the flag can be turned off and on each time you run the
program. The problem with this approach is that even your
"production-level" code will be littered with if
statements that always evaluate to false.
The code you have been given solves the problem by placing the print statement as the controlled text in a conditional group:
#ifdef PRINT_DEBUG print_heap(heap); #endif
The preprocessor (part of the compiler) will only insert the
controlled text if the PRINT_DEBUG
has been defined/set.
This Macro approach requires re-compiling in order to insert the
print statements. This may not be the best strategy for every
scenario, but it is for this assignment. Do not use any print
statements inside your heaplib.c
that are not wrapped in
an #ifdef
flag. Compiling and Running
Your Code discussed how to turn the flag on and off at compile
time.
Pointer arithmetic. To implement this assignment, you will want to use pointer arithmetic. As a basic example, see the following snippet of code:
// warning: this code is buggy int hl_init(void *heapptr, unsigned int heap_size) { block_header_t *heap = (block_header_t *)heapptr; heap->next_free = heap + sizeof(block_header_t); ...
This code initializes the next_free
pointer to point to
the first free byte, which lives past the header. A careful programmer
will not assume to know the size of all types on any machine that
their code might run on. (There might not even be one right answer!)
Instead, we use sizeof
here. The code above, however, is
still flawed. Because heap
is of type block_header_t
*
, adding some number to it, say 16, will actually move the
pointer forward "16 block_header_t
s worth" (in other
words 16 x sizeof(block_header_t)
). Two correct versions
would be:
heap->next_free = heap + 1;
(move ahead 1 block_header_t
worth) or
heap->next_free = (char *)heap + sizeof(block_header_t);
(temporarily cast heap
to be a char *
so
that adding 16 just adds 16 char
s to the address.)
If you did not understand the problem in the above paragraph or why both solutions work, you are not ready to begin this assignment! Go brush up on your pointer arithmetic.
Using Macros. Given the amount of pointer arithmetic
required for this assignment, you may find your code littered with
many small but hard-to-read lines of code that are performing
essentially the same task repeatedly. For example, you may find that
you frequently add some number of bytes to an address that needs to be cast to a
char *
. For readability, you may be tempted to create
short helper functions for these tasks. Knowing what you now know
about the overhead of a function call and knowing that your code will
be tested for speed, you should instead
use macros. A macro is a <keyword, definition>
pair. Any reference to the keyword will be replaced by the
definition at compile time. You may be familiar with simple macros such as
#define ALIGNMENT 8
which is a great way to rid your code of magic numbers. Macros can also be more complex, taking arguments. For example:
/* Useful shorthand: casts a pointer to a (char *) before adding */ #define ADD_BYTES(base_addr, num_bytes) (((char *)(base_addr)) + (num_bytes))
We recommend using macros for any complicated but simple tasks; your code will be much more readable, maintainable, and debuggable at no performance cost.
Using sizeof
. Notice that nowhere in this code are the
sizes of types assumed to
be known. Different types are of different sizes in different
environments. We will test your code on the course VM. Even still,
your library code should have no "magic numbers" that make assumptions
about sizes, even if they are correct on the course VM. One solution
is to call sizeof()
throughout your
program. uintptr_t
in <stdint.h>
is
guaranteed to contain the size of pointer on the machine where the
executable is running. You may find this useful.
Ternary Operators. You will also notice that a ternary operator is used in the print_block_header
function of heaplame. This is very useful as it allows you to have a conditional output without explicitly writing an if statement.
block->in_use_f ? "Yes" : "No"
The value before the ?
is a boolean expression, and if true, evaluates to "Yes"
, otherwise it evaluates to "No"
.
Note that the ternary operator has very low precedence, so be sure to wrap your ternary operator usage in parentheses! i.e.
(x ? 1 : 2)
instead of
x ? 1 : 2
Student implementations of dynamic memory allocators are notoriously ugly and impossible to debug. Yours should not be. First, because you will make your own life a living hell. Second, because we will deduct style points for anything egregious. We have given you sample code with all sorts of tricks in it because these are tricks that expert programmers know about and use regularly. Now that you have been shown these techniques, you will be expected to use them.
Cost of a ridiculously long project writeup? 15 minutes. Shedding your status as a novice and being able to produce code that is readable, debuggable, and super fast? Priceless.
heaplame.c
Fix problems 1, 2, and 3 listed above and any other correctness
problems which you are now able to detect by running
test_correctness
. Do not change the definition of
block_header_t
. Do not create any new typedefs. Once your fix is
working, submit it on CMS.
Fixing the heap utilization problems (the orphaned memory
resulting from hl_resize
and hl_release
)
would require a significantly different approach than the lame
implementation. You will not be asked to fix this problem. Instead,
you will be asked to come up with a completely new approach to a
dynamic memory allocator--one that is robust enough to accomplish the
task.
We will try to give you early feedback about Task #1 and Task #2 sometime after Prelim 2. You will be allowed to re-submit these files based on this feedback. Exact dates and times to be announced later.
Testing is a huge part of writing good code and will be a significant part of your grade for this assignment.
ctests.c
is intended to be a thorough test for the
correctness of any heaplib implementation. At the moment, however, it
only tests three things: whether hl_init
works, whether
hl_alloc
fails when there is not enough space, and whether the
pointer returned by hl_alloc
is 8
byte aligned. There are many more correctness problems
in heaplame.c
. Modify ctests.c
to catch
all of them.
The goal is to convince yourself that your code is
correct. Try to come up with a number of cases that really stress your
allocation functions. It's a good idea to test corner cases as
well. For example, create a test that uses two different
heap
pointers to two different chunks of memory. Make
sure that a user is never given permission to write past
the heap
associated with any allocation request. (Note: a
user could always write past the heap
, but if a user
requests 32 bytes and is given a pointer to the very last byte in
the heap
, this is a correctness problem. Similarly, a
user should never be given permission to write into memory that was
already reserved (and not freed) in a previous call
to hl_alloc
. Think about how you might test whether data
in a block is preserved over time.
You should maintain the structure of the program
testDescriptions
If you wish to add additional print statements, please use the
PRINT_DEBUG
macro.
When you have completed ctests.c
it should report all
the correctness problems with heaplame.c
. You will
submit this file to CMS. This test will be graded on how thoroughly it
tests the correctness of yours and our broken implementations of
heaplame.c
and heaplib.c
. Points will be
deducted for false positives (finding errors where there are none) and
false negatives (not finding errors when they are there).
What is correctness? For the purposes of this assignment,
any error that would occur even in an infinite heap is considered a
correctness error. For example, not guaranteeing pointer alignment is
a correctness error. Not implementing hl_release
will be
considered a utilization error, not a correctness error.
A thorough correctness test will not only earn you points and glory, but it will be invaluable in finishing subsequent tasks for this assignment.
heaplame.c
back and forth. You will benefit immensely
if you allow your implementation to inform your testing and vice
versa.gdb
, the GNU debugger, helpful while
completing this assignment. (Tutorials can be found
on unknown
road). At the very least, the make lame-print-debug option will give you some insightheaplame.c
and then type make
and then ./test_correctness
it may look as though your change
did nothing. In fact, you should have typed make lame
and the change in your code was simply not incorporated into your
executable. This is just one example of the ways in which you may
occasionally confuse yourself. Take a step back and ask
yourself What am I compiling? What am I running?When you have a version of ctests.c
that compiles, please uplaod it
to CMS. We will give you feedback on your tests before the Tuesday
deadline.
Coming soon!
This assignment is the literal descendant of an assignment originally written by Robbert Van Renesse at Cornell. It is the spiritual descendant of the textbook "Computer Systems: A Programmer's Perspective" by Bryant and O'Hallaron, which your instructor thinks is a fantastic book.