Final Project - A Dynamic Memory Allocator

CS 3410 Fall 2018

Schedule Design Doc Meeting by: Saturday, December 1, 2018

Design Doc Meetings: December 2-4, 2018

Due: 4:30pm, Monday, December 10, 2018

There are no slip days or grace period for this deadline.

This assignment teaches you about dynamic memory allocation--the allocation, resizing, and freeing blocks of memory. There are two main parts to this assignment. First, you will write a suite of tests that can check whether an implementation of a dynamic memory allocation library is correct. Then you will write your own implementation of a dynamic memory allocation library. Your implementation will be assessed based on its correctness (according to the specification below) its utilization of the memory it manages, and performance. Finally, we will be comparing correct implementations of malloc to one another in a competition for the best malloc!

What to submit

tests.c: your test suite
heaplib.c: your implementation of malloc
alias.txt: 1-line file containing your Leaderboard Alias

Overview

A dynamic memory allocator maintains the memory associated with the heap. Dynamically allocated memory is allocated on the heap and exists from the moment it is allocated until it is explicitly freed. Support for dynamic memory allocation is part of the Standard C Library. Its most basic task is to keep track of which bytes of the heap are currently in use (i.e., are allocated) and which bytes are not in use (i.e., free). On real systems, the heap can grow with the needs of a process. For this assignment, however, we will consider the heap to be a fixed-size block of memory. Your dynamic memory allocator will always be associated with a void *heap that points to the beginning of the heap that you must manage. (The heap pointed to by heap is a simulated version of the actual heap that the C Standard Library manages.)

A dynamic memory allocator is a bit like an actual library (of books). When someone asks for 512 bytes of memory, the allocator finds 512 free contiguous bytes of memory for the caller and returns the address of the first of these 512 bytes. These 512 bytes are now "checked out" and cannot be given out by any subsequent requests until they are explicitly freed. The bytes are freed when the user explicitly tells the allocator it no longer needs the bytes beginning at that address.

For this assignment, it will be important to understand not only what it means to correctly implement a dynamic memory allocator, but also what it means to correctly use a dynamic memory allocator. If at any point your code seg-faults, you must know whether the fault lies in the library or the user of the library. For example, a user should never write past the fixed-size heap. If they do, the problem could be that the user has written past the chunk of memory they were given (this is the fault of the user) or it could be that the memory allocator gave the user a block of memory which did not actually reside entirely inside the heap (this is the fault of the allocator). To help you learn this distinction, we have you write tests that use and test the dynamic memory allocator. Once you have a test suite that you can trust, you can use it to test your own implementation.

Remember the best way to implement a large task is to break it up into very small, testable subtasks. Code, test, code, test. The tests you have written should help you with your implementation. Also, testing immediately after you have implemented something will help make your tests better.

The Specs

Your dynamic memory allocator will consist of four functions, which are declared in heaplib.h and will be implemented (by you) in heaplib.c. The functions hl_alloc(), hl_release(), and hl_resize() correspond to the actual C Standard Library functions malloc(), free(), and realloc(). You should use your understanding of these functions to guide your implementation, but nowhere in heaplib.c should you call these three functions.

Remember: If preconditions are violated, non-graceful failure is acceptable.

`void hl_init(void *heap, unsigned int heap_size)`

Initializes a new heap beginning at heap of size heap_size (in bytes).
does not allocate the heap. That should be done already by whomever calls this function. (tests.c shows you how to allocate a heap before calling hl_init.)
All meta-data (pointers, size fields, flags indicating free or in-use) that maintain the heap must live in the heap itself. You may not use any global variables or structures. (Defining new structs is fine, having them live in the global space is not.) Every variable or structure must live inside either the heap or in the local function being executed.
should work for any heap size >= 1024 bytes
PRECONDITIONS:

the allocated heap must be heap_size in bytes

heap_size must be >= MIN_HEAP_SIZE (here, 1024)

function should be called exactly once for each heap being managed

`void hl_alloc(void heap, unsigned int block_size)`

Allocates a block of memory of size block_size bytes from the heap starting at heap.
Returns a pointer to the block on success; returns 0 (NULL) if the allocator cannot satisfy the request.
the returned address should be aligned at multiples of 8 bytes, but heap is not required to be 8-byte aligned.
may allocate more than the requested size, but never less
memory "allocated" does not need to be zeroed out
Requested blocks of memory can be of size 0.
PRECONDITIONS:

hl_init must have been called exactly once for this heap

`void hl_release(void heap, void block)`

releases the block of memory pointed to by block (which currently resides in the heap pointed to by heap)
acts as a NOP if block == 0 (NULL)
released memory should be able to be allocated again by subsequent calls to hl_alloc
memory released by this function does not need to be zeroed out
PRECONDITIONS:

block must have been returned to the user from a prior call to hl_alloc or hl_resize

hl_release can only be called ONCE in association with that prior call to hl_alloc or hl_resize

`void hl_resize(void heap, void *block, unsigned int new_size)`

Changes the size of the block pointed to by block (that currently resides in the heap pointed to by heap) from its current size to size new_size bytes, returning a pointer to the new block, or 0 (NULL) if the request cannot be satisfied
contents of the block should be preserved (even if the location of the block changes -- this will happen when it is not possible to increase the size of the current block but there is room elsewhere on the heap to satisfy the request).
If block has the value 0 (NULL), function should behave like hl_alloc.
the new block size can be smaller than the current size of the block
the return value should be 8-byte aligned
FYI : You can copy the contents of a chunk in memory by using memmove or cast block to a char * and copy the contents byte by byte.
PRECONDITIONS:

block must have been returned to the user from a prior call to hl_alloc or hl_resize

Compiling and Running Your Code

You can find the files for this project in your Github repository. (See these git tips.) You have been given the following files; the ones in bold are the files you will modify and submit, the others you may modify at your own peril (since you don't submit them, don't expect us to have your version of them!):

heaplib.h -- the interfaces of the functions to test and implement
heaplib.c -- your implementation
tests.c -- your test suite
test_heaplib.c -- the code that calls your tests
Makefile -- compiles 3 versions of 4 executables (see below)
heaplame.c -- a very bad implementation of a dynamic memory allocator. It has a few bugs, and you may fix them if you would like. See the heaplame section for a case study of how it works.
heaplesslame.c -- a copy of heaplame.c. If you are at all tempted to fix the bugs in heaplame.c fix them here so that you can keep the original on hand. This is a good way to play around with an implementation before diving into your own.

The Makefile you have been given will compile the executable: test_heaplib. The heart of this assignment is your implementation of the functions declared in heaplib.h. You are implementing a library, which does not contain a main. The tests.c file uses your library; with its own main in test_heaplib.c. The test file is not complete. Finishing it is your first task!

How you compile your code will determine whether your tests use the implementation of heaplib.h in heaplame.c or heaplib.c. Since you will spend much more time working with heaplib.c, this is the default. If you type make help you will see a slightly longer version of the following message:

Your custom heaplib.c implementation targets:
  make                   : heaplib.c
  make debug             : heaplib.c with debugging symbols
  make print-debug       : heaplib.c with debugging symbols and PRINT_DEBUG

The lame implementation targets:
  make lame             : heaplame.c
  make lame-debug       : heaplame.c with debugging symbols
  make lame-print-debug : heaplame.c with debugging symbols and PRINT_DEBUG

The less lame implementation targets:
  make ll             : heaplesslame.c
  make ll-debug       : heaplesslame.c with debugging symbols
  make ll-print-debug : heaplesslame.c with debugging symbols and PRINT_DEBUG
...

As you can see, the code base has a lot of programs and a lot of targets. Be sure you are making and executing the correct target and executable at all times!

The first target (built by typing make or make lame produces "production level" versions of both executables. This is how we will compile your code when we test it.
The second target (built by typing make debug or make lame-debug) is an executable that contains all the symbols (variable and function names, line numbers, etc.) required to run your code in gdb. You may find gdb, the GNU debugger, helpful while completing this assignment. (Tutorials can be found on unknown road).
The third target (built by typing make print-debug or make lame-print-debug) is an executable that has both debugging symbols and has defined the PRINT_DEBUG macro that will enable all the debugging print statements in your code.

Go ahead and peek at the Makefile. It is fantastically documented. You do not need to modify the Makefile for this assignment. (If you choose to modify it, keep in mind that we will use the original when we test your code.)

Case Study

Heaplame - the lame functionality ... (click to expand/shrink)

We have provided you with a really lame implementation of the assignment in heaplame.c. This implementation shows you how a simple dynamic memory allocator might work, but beware: (1) this implementation is buggy. In several respects, it does not follow the specs above and (2) this approach is overly simplistic. The overall approach of heaplame.c is to keep track of only N_SUPPORTED_BLOCKS:

#define N_SUPPORTED_BLOCKS 5

This means that the user may only allocate 5 blocks at a time. (Pretty lame, huh?) The information for each block (the size and the starting address of the block) is stored in a newly defined structure called block_info_t which is defined as follows:

typedef struct _block_info_t {
    unsigned int block_size;
    void *block;
} block_info_t;

This block information is stored in an array in a containing structure called block_header_t:

typedef struct _heap_header_t {
    unsigned int heap_size;
    block_info_t blocks[N_SUPPORTED_BLOCKS];
    bool in_use_f[N_SUPPORTED_BLOCKS];
} block_header_t ;

This implementation dedicates the beginning of the heap to the metadata (which is initialized in hl_init), and the remaining portion of the heap to the 5 blocks allocated by the user.

In the beginning (see Figure 1) there is an array which the user has created (either by statically allocating it on its stack or by calling malloc). This is the area of memory which simulates the heap. If you look in heaplib.h, you will see the following definition:

#define MIN_HEAP_SIZE 1024

Let's step through the implementation of some of the functions. Figure 1 shows the state of memory before a call to hl_init with a heap of size 1024 beginning at address x100:

After the call to hl_init, the meta data has been initialized and the heap looks as shown below in Figure 2:

Figure 3 shows what a subsequent call to hl_alloc with a block size of 512 bytes would change the heap:

Figure 4 shows how a call to resize would change the heap:

These diagrams are not intended to show you the correct behavior of the functions you should implement. Instead they show you an accurate depiction of what is currently implemented so that you may fully understand the code you have been given. In fact, the code you have been given is woefully incomplete and incorrect. You should take some time to look at the implementation, run it through a few of the given tests, and write your own tests that expose the ways in which the code is broken or inadequate.

Known Correctness Bugs. In addition to being really unfriendly (allowing only five blocks to be allocated!), the lame solution in heaplame.c is incorrect in several respects:

no functions return pointers that are 8-byte aligned
heap is assumed to be 8-byte aligned
not all functions behave correctly when given null block pointers

.... and possibly more???

Although the implementation is not a working solution, it provides you important examples of heap management, casting, pointer arithmetic, and debugging.

Fixing all of the problems in this implementation would require a significantly different approach than the lame implementation. You are encouraged to spend some time thinking about how to debug this code (and to place your fixes in heaplesslame.c for sanity's sake, but it should not be viewed as the baseline for your own implementation, which should be more thoughtful and sophisticated. You will need to come up with a completely different approach to a dynamic memory allocator--one that is robust enough to accomplish the task.

Heaplame - the cool coding lessons ... (click to expand/shrink)

Casting. Take a look at the implementations of hl_init and hl_alloc. Notice that casting the heap to a heap_header_t * is a straightforward and very readable way to set and access the fields of the otherwise unspecified heap.
print_debug functions. Look at the function print_debug_entering_init. It only prints "Entering hl_init()" to the screen. Although this function may be useful to a programmer when implementing and debugging, it is not the kind of information that should be printed to the screen when a program includes and makes calls to your heaplib library. Novice programmers notoriously litter their code with print statements that are commented in and out as they code. This is not good form, especially since some straggling printf's almost always remain in the final version of the code. Print statements can be particularly devastating during a long run -- at best they make the code painfully slow; at worst they fill up the hard drive and make the computer unusable.

One solution to this problem is to create a variable such as debug_mode_f (_f often indicates a flag in C) and put the print statement inside an if statement that checks the variable:

if (print_debug_f) {
    print_heap(heap);
}

This solution has the nice property that (if tied to a command line option) the flag can be turned off and on each time you run the program. The problem with this approach is that even your "production-level" code will be littered with if statements that always evaluate to false.

The code you have been given solves the problem by placing the body of the print statement as the controlled text in a conditional group:

void print_debug_entering_init() {
#ifdef PRINT_DEBUG
    printf("Entering hl_init()\n");
#endif
}

The preprocessor (part of the compiler) will only insert the controlled text if the PRINT_DEBUG has been defined/set.
This Macro approach requires re-compiling in order to insert the print statements. If you didn't want to create a separate function, you could also simply wrap the printf directly:

#ifdef PRINT_DEBUG
  printf("the value of x is %d\n", x);
#endif

This may not be the best strategy for every scenario, but it is good for this assignment. Do not use any print statements inside your heaplib.c that are not wrapped in an #ifdef flag. Compiling and Running Your Code discussed how to turn the flag on and off at compile time.

Pointer arithmetic. To implement this assignment, you will need to understand pointer arithmetic. As a basic example, see the following snippet of code:

int array[4];
int *ptr = array + 1;

What do you think ptr points to? Not sure? Step through the code in the C Tutorial! When you add 1 to the pointer, the unit of addition is int, meaning that you're taking the base address of array, and saying "go one integer later". The 1 is actually 1 integer, or 4 bytes.

When you are manipulating pointers in this assignment, you may often want to add "raw bytes". For example, if you want to get the address that is 16 bytes into your heap, which you've already cast to be a heap_header_t *, you would need to write:

heap_header_t *header = (heap_header_t *)heap;
void *sixteen_later = ((char *)header)+16;

Because we suspect you might want to do addition like this a lot, we've provided you with a simple #define:

#define ADD_BYTES(base_addr, num_bytes) (((char *)(base_addr)) + (num_bytes))

which you can could then use as follows:

heap_header_t *header = (heap_header_t *)heap;
void *sixteen_later = ADD_BYTES(header, 16);

C is not the same everywhere! Do not assume that you know the size of all variable types in C. An int might be 4 bytes on one machine and 8 bytes on another. This is one reason why it is VERY important to use the VM or the Linux machines we have provided for this class for this assignment. If you never run your code on the machines we test them on, you may be in for a horrible, seg-faulty surprise. It is always a good idea to use sizeof() instead of assuming you know the size of any variable or type in your code. Alternatively, uintptr_t in <stdint.h> is guaranteed to contain the size of a pointer on the machine where the executable is running.

Another aspect of C is that the compiler will align your structures for you. How it performs the alignment varies not only by machine but also by operating system, so once again, do not make any assumptions. As an example, look at the definition of heap_header_t.

typedef struct _heap_header_t {
    unsigned int heap_size;
    block_info_t blocks[N_SUPPORTED_BLOCKS];
    bool in_use_f[N_SUPPORTED_BLOCKS];
} heap_header_t ;

How large is this structure? hl_alloc calculates the size of the header to be:

sizeof(unsigned int) /* heapsize */
+ sizeof(block_info_t) * N_SUPPORTED_BLOCKS /* block info */
+ sizeof(bool) * N_SUPPORTED_BLOCKS /* in use */

However, if you simply said sizeof(heap_header_t) you might get a different answer because the size of the structure is typically rounded to a size that is divisible by 4 (or sometimes 8). This is one of the reasons we require you to keep your block pointers 8-byte aligned. (This is also one of the reasons we did not put the in_use_f flags inside the block_info_t structure.) You should decide exactly how you want to pack your data in the heap. It is important to understand structure alignment so that as you make these decisions you understand how to implement them.

Using Macros. A macro is a <keyword, definition> pair. Any reference to the keyword will be replaced by the definition at compile time. You may be familiar with simple macros such as:

#define ALIGNMENT 8

which is a great way to rid your code of magic numbers. Macros can also be more complex, taking arguments. For example:

/* Useful shorthand: casts a pointer to a (char *) before adding */
#define ADD_BYTES(base_addr, num_bytes) (((char *)(base_addr)) + (num_bytes))

We recommend using macros for any complicated but simple tasks; your code will be much more readable, maintainable, and debug-able at no performance cost.

Ternary Operators. You will also notice that a ternary operator is used in the print_block_header function of heaplame. This is very useful as it allows you to have a conditional output without explicitly writing an if statement.

block->in_use_f ? "Yes" : "No"

The value before the ? is a boolean expression, and if true, evaluates to "Yes", otherwise it evaluates to "No".

Note that the ternary operator has very low precedence, so wrap your ternary operator usage in parentheses! i.e.,

(x ? 1 : 2)

instead of

x ? 1 : 2

Student implementations of dynamic memory allocators are notoriously ugly and impossible to debug. Yours should not be. First, because you will make your own life a living hell. Second, because we will deduct style points for anything egregious. We have given you sample code with all sorts of tricks in it because these are tricks that expert programmers know about and use regularly. Now that you have been shown these techniques, you can use them to your advantage.

Cost of a ridiculously long project writeup? 15 minutes. Shedding your status as a novice and being able to produce code that is readable, debug-able, and portable? Priceless.

Part 1: Malloc Test Suite

Although you will receive a grade for your tests, the true value of your test suite is the extent to which it helps you test your implementation in the next part.

Look in test_heaplib.c and tests.c to see how the test suite is structured. There is room for 24 tests. Tests 0-3 have been completed for you. Test 16 has been started for you but is not complete. You can run a test by typing make lame-print-debug (if you want the debug print statements) or make lame (if you don't) at the terminal and then ./test_heaplib n, where n is a number from 0 to 23.

Spec Tests (0-15)

"Spec tests" detect ways in which a heaplib implementation does not meet the given specifications. There are many ways in which heaplame is not up to spec. Your task is to write tests that detect these problems. Use the known problems in heaplame as a way to foresee and avoid possible problems that your code could experience when you implement your own library.

If it helps you to correct some of the errors in heaplame.c, you should do so in in the file heaplesslame.c, which you can compile and run by typing make ll-print-debug and then ./test_heaplib n.

Stress Tests (16-23)

"Stress tests" detect errors that might only occur after many calls to the heaplib library. This could be due to subtle bugs, corner cases, or just errors that only manifest under extreme conditions (like the heap filling up).

There are two classic problems that result from a buggy implementation of heaplib:

The first we will call a structural integrity problem. At some point a field in the heap thought to contain a pointer will no longer contain a valid address. Perhaps you have a next pointer in a linked list, but you accidentally overwrite it with the size of a block. Now you try to dereference the address 42. This will result in a segmentation fault. Bugs that corrupt the structural integrity of your heap lead to fatal errors at runtime. You can test for structural integrity problems by simply hammering the library with a bunch of calls to alloc, resize, and free. If the test finishes, it is a success. If a segmentation fault occurs, the test has failed.

The second problem we will call a data integrity problem. In this case, a buggy implementation will either give the user a bad block of memory (maybe a block that goes past the edge of the heap, for example, or a block with some overlapping region that has already been given out in a previous request to alloc that has not yet been freed) OR the library itself writes something into the block portion of the heap, which it should never do unless that block has been freed by the user. You can test for data integrity by requesting blocks from the library and then filling the block with data. After subsequent calls to the library, check and see whether the data is still intact.

TIPS:

Make sure you use legal, well-formed calls. Also, make sure that the errors are a result of the heap library, and not your test:

if you ask for n bytes, you should only write n bytes
you may not free something that was never allocated in the first place (that was already freed). if you do, the library might seg-fault and the library is not to blame.

It is recommended that you have one test that uses just alloc and free and another one that includes resize. This way you can test your both before and after you have implemented resize.
It is recommended that your stress tests have at least 100 calls to each function. Please use loops and arrays that hold pointers. The tests should not be hundreds of lines long!
It is recommended that your stress tests call for blocks of varying sizes. Structural and data corruption often manifests itself only when different blocks are requested for, not when the same chunk of memory is requested and freed repeatedly.
Our robustness tests (which we will use to grade your code) check for both structural and data integrity. However, how you write these tests is your choice.

You should maintain the current structure of the test suite:

A 1-line description of each test, ending in whether the test passes or fails when you run ./test_heaplib [test #]. This description should be put at the top of your tests.c file.
A comment before each test indicating which functions you are testing, what specification you are checking, and how an error would manifest.
Embed all print statements inside the PRINT_DEBUG macro.

You will submit tests.c to CMS which will trigger an automatic autograder run. You will receive an email with the result of your tests. We will run your spec tests on variations of heaplame that have isolated individual errors. We will run your stress tests on a suite of "broken mallocs". The more of them your stress tests flag as broken (while not flagging correct versions as broken), the better your grade.

What constitutes a good test? A good test will not flag a correct implementation as broken, but will flag a broken implementation as broken. If your tests violate the preconditions or the usage model of heaplib, they will be ignored when grades are assigned.

What constitutes flagging as broken? You can either return FAILURE from your test function, or you can have the code SEGFAULT. The second option may seem unintuitive at first, but it is helpful to realize that if an implementation of malloc is indeed broken, it will have bad behavior such as SEGFAULT.

IMPORTANT: Past students have asked for more test placeholders. This is why you have room to write 24 tests. However, you might not need all 24 test placeholders. (We do not have 24 tests or 24 broken mallocs...) Your tests will be graded based on the number of implementation errors they find, not on how many tests you have.

Part 2: Designing and Implementing Your Own Memory Allocation Library

Now it's time for you to implement your own dynamic memory allocation library. Your code must follow both the interface and the functional specifications which were outlined above. You are free to implement malloc as we suggest or use your own original ideas. As you consider your options, you may find these slides helpful (note: these slides will be updated for Prof Bracy's lecture on 11/29 after which they will be available from the class schedule page). You should under no circumstances look at anyone else's code (including code on the internet).

Utilization. When a heap is initialized, it is given a fixed memory area that it can use to fulfil allocation requests. (Note: the actual heap on a real system can grow as needed within the virtual address space of the process. The fixed-sized heap in this assignment is an artificial but simplifying restriction.)

Overhead: the first aspect of heap utilization relates to how much meta data your implementation requires. Meta-data is the information about the block, the size of each block, whether it is free, etc. Less meta data allows you to dedicate more of the heap to service alloc requests.
Management: the second aspect of heap utilization relates to the management of the heap itself. Careful implementations of alloc, resize, and free will keep your heap less fragmented and more able to fulfil future allocation and resize requests.
Allocation Strategies. Think about the pros and cons of some common allocation strategies. A first-fit approach always gives the user the first memory block in the region that is big enough to fulfil the request. This is the easiest to implement. A best-fit approach gives the user the smallest memory block that fulfils the request.
Fragmentation. After responding to many calls to hl_alloc and hl_release the heap could become chopped up into small interwoven blocks of allocated and freed memory. When your memory region becomes fragmented, the sum of the free blocks of memory might satisfy an hl_alloc request, but since they are not contiguous, the request will fail. Whether you implement a first-fit or best-fit allocation strategy is just one way in which you control the fragmentation of your memory region. Whether you support the coalescing of blocks (explained in the pdf) is another.

Throughput. Another important aspect of your design will be how quickly your library completes its tasks. Different approaches may improve throughput but either require meta-data that might increase overhead or fragment the heap. You will have to consider these tradeoffs when designing your approach knowing that both metrics will part of your grade.

You will have a lot of design choices to make that create a tradeoff between heap utilization and throughput.

The Leaderboard

Once it goes live, the leaderboard will be here.

Your team will be identified by the alias specified in your alias.txt file. The alias can consist of alphanumerics only plus _ (underscore) or - (hyphen) but no other symbols. It cannot start or end with a symbol, and must have no more than 32 characters. Should your alias not follow these rules, the 3410 TAs are more than happy to create a great name for you. Humor and despair are fine, but keep it clean and respectful.

We will run your code and test it for correctness. Code that passes our correctness tests will then be tested for overhead, heap maintenance, and throughput. At some point during the following week we will set up a leaderboard that shows how your code fares on these three metrics compared to other groups in the class. We hope you will be able to see from this exercise that design choices are full of trade-offs, often sacrificing one metric for another.

Grading

Each day the autograder gives you 3 runs for your tests and 5 runs for your implementation. Failure to compile counts as a run.

tests.c will be graded based on how many of our broken implementations of heaplame it correctly identifies as broken. Some of the errors will be caught with your specification tests, others with your stress tests. Tests that yield false positives (finding errors where there are none) will be discounted.

heaplib.c will be graded correctness first. Correct implementations of heaplib.c will then be evaluated on memory utilization and throughput. Note that you will not receive points for utilization or throughput if your implementation does not pass the correctness tests. (This is because utilization and throughput are meaningless in the context of an incorrect program. If you implemented hl_alloc by simply returning NULL your performance would be great, but what good would it do?)

A rough breakdown of your grade is as follows:

25%: Testing
55%: Correctness
10%: Utilization
10%: Speed

Acknowledgements

This assignment is the literal descendant of an assignment originally written by Robbert Van Renesse at Cornell. It is the spiritual descendant of the textbook "Computer Systems: A Programmer's Perspective" by Bryant and O'Hallaron, which your instructor thinks is a fantastic book.