Group Project 8 - Implementing a Dynamic Memory Allocator


CS3410 Fall 2016

Due: 4:30pm, Wednesday, December 7, 2016


Note: you will be automatically paried with your partner from Project 7.

In this assignment, you will implement a dynamic memory allocation library that supports the allocation, freeing, and resizing of blocks of memory. Your code must follow both the interface and the functional specifications which were outlined in Project 7. In all other ways, however, you should be creative: design your own data structures; put your problem-solving skills to work! You are free to implement existing/known approaches to solving this problem or come up with your own original ideas. As you consider your options, you should under no circumstances look at others' code (including code on the internet) or show your code to anyone outside of your group of two.

Assignment Navigation

Designing Your Own Memory Allocation Library

Now that you have completed your correctness tests, you are ready to begin the heart of this assignment. Think about the ways in which the lame implementation (in the lab) was deficient. This is your chance to come up with a better strategy to manage your heap.

Your implementation will be graded based on correctness first. Correct implmentations of heaplib.c will then be evaluated on both memory utilization and throughput. (Incorrect implementations will not be tested for utilization and throughput--and will forgo the points associated with these tests--because these metrics are meaningless in the context of an incorrect program. (Example: if you implemented hl_alloc by simply returning NULL your performance would be great, but what good would it do?)

Utilization. When a heap is initialized, it is given a fixed memory area that it can use to fulfill allocation requests. (Note: the actual heap on a real system can grow as needed within the virtual address space of the process. The fixed-sized heap in this assignment is an artificial but simplifying restriction.)

Throughput. Another important aspect of your design will be how quickly your library completes the jobs associated with allocating, freeing, and resizing memory. Different approaches may improve throughput but either require meta-data that might increase overhead or fragment the heap. You will have to consider these tradeoffs when designing your approach knowing that both metrics will part of your grade.

You will have a lot of design choices to make that create a tradeoff between heap utilization and throughput. Here are just a few:

Implicit vs. Explicit Lists. One design choice you will have is whether the list of free and allocated blocks that you maintain is an implicit list or an explicit list. In an implicit list, the size field is enough to help you find the next block. All blocks (both free and allocated) will be in the same list. In an explicit list, pointers determine which block comes next. Now you can have the blocks in any order. This strategy support multiple lists, doubly linked lists, ordered lists, and even trees.

Allocation Strategies. Think about the pros and cons of some common allocation strategies. A first-fit approach always gives the user the first memory block in the region that is big enough to fulfill the request. This provides the fastest response time, but is not necessarily the most space-efficient use of your memory region. A best-fit approach gives the user the smallest memory block that fulfills the request. The allocator will not be as quick because it might search the entire list to find this smallest block, but this is a more efficient use of space. You might be able to speed up this search by maintaining pointers to free memory blocks of certain sizes. You will have to decide whether these efficiency techniques are worth the space overhead that they require.

Fragmentation. After responding to many calls to hl_alloc and hl_release the heap could become chopped up into small interwoven blocks of allocated and freed memory. When your memory region becomes fragmented, the sum of the free blocks of memory might satisfy an hl_alloc request, but since they are not contiguous, the request will fail. Whether you implement a first-fit or best-fit allocation strategy is just one way in which you control the fragmentation of your memory region. You should also think about how your implementations of hl_release and hl_resize play a role in your ability to avoid fragmentation and satisfy future allocation requests.

Implementing Your Memory Allocation Library

With all of the information above and any research you care to do on your own (that does not involved looking at code), come up with a heap management strategy and implement it in heaplib.c. Remember that you are not allowed to change the interface to any of the library functions. Any changes you make to heaplib.h will be ignored when we grade your code. (CMS will not even accept heaplib.h as part of your submission.) Remember that correctness is your first priority.

As you implement your ideas, you should regularly run test_correctness to catch any correctness problems. Catching bugs as they appear is much easier than catching them all at once at the end! You may want to add more test cases to your ctests.c file. Remember to test not only the specs of the assignment but also any invariants of your specific approach. For example, if you implement a balanced tree, you might want to add a test in ctests.c that checks to see whether your tree remains balanced at all times.

Optimize Your Code

Do not begin this task until you have a correct, working version of your library.

Hopefully your dynamic memory allocation library was written with attention to speed. This task will ask you to profile your code and potentially improve your performance score by finding out exactly where your code spends most of its time. For this task, you will make a version of your executable that has been modified to support profiling. You will use a program called gprof which will repeatedly interrupt the execution of your executable and make a note of which function the executable is currently in. With enough samples, it will report to you the amount of time your executable is spending in each of its functions.

For your convenience, we have provided you with two new source files (test_throughput, ttests.c) and an updated Makefile on CMS.

Note: You will be asked to report on this task and how you improved the performance of your code in response to profiling it. The way to prove that your code got faster based on your profiling and code optimization efforts is to show before and after test_throughput results. So the process is as follows:

  1. Run test_throughput and keep a copy of the results.
  2. Make the profiling-enabled versions of your executable:
    make prof
    
  3. Now you can profile any of your executables:
    ./test_throughput 1
    
    This will run the executable and generate the file gmon.out. Now run gprof:
    gprof test_throughput
    
    to see where you code is spending its time. Note: if the ttest is not long enough, gprof will not be able to gather timing information. At the top of the output you will see "no accumulated time". Go make your ttest longer and try again. (Remember to rerun the un-optimized test_throughput (Step 1) if you ever change your test.
  4. Now that you see where your code is spending most of its time, try to optimize it in some way. Your optimization could be related to your heap management. It could also be code optimization that has more to do with C or your new-found knowledge of processors and performance bottlenecks. Remember to continue to compile with optimizations set to -O0. Turning on compiler optimizations does not count!
  5. Run test_throughput (not compiled for profiling!) again. You now have a "before" and "after" version to report on in the next step.

This is an exercise to show that you know how to both profile your code and measure the effects of your improvements. Full credit will be given to anyone who successfully follows these steps even if you are unable to make a remarkable leap in performance. We don't want you arbitrarily making your slow up front, and we know you can't always squeeze blood from a stone.

Upload a pdf to CMS that shows the before throughput results of your code, the output from profiling your code (just the percentages, please), and the after throughput results of your code. In 2-3 sentences, explain how you attempted to optimize your code based on the profiling and whether that was successful. You will not be penalized if you are not able to make your code faster, but you need to show a good faith effort that you looked into the problem and explored some options.

The Leaderboard

Once again, the autograder will run your code 5 times per day. We will run your code and test it for correctness. Code that passes our correctness tests will then be tested for overhead, maintainence, and performance. At some point during the following week we will set up a leaderboard that shows how your code fares on these three tests compared to other groups in the class. We don't actually expect one group to be at the top of all three lists. Instead, we hope you will be able to see from this exercise that design choices are full of tradeoffs, often sacrificing one metric for another.

What to submit

Due: 4:30pm, Wednesday, December 7

Complete and submit the following 7 files to CMS:

Grading

This project will be graded first on the correctness of your malloc implementation. Afterwards, you will receive points based on how well your implementation utilizes the given heap and the speed at which it is able to handle requests.

A rough breakdown of your grade is as follows:

Note that you will not receive points for utilization and speed if your implementation does not pass the correctness tests!


Acknowledgements

This assignment is the literal descendant of an assignment originally written by Robbert Van Renesse at Cornell. It is the spiritual descendant of the textbook "Computer Systems: A Programmer's Perspective" by Bryant and O'Hallaron, which your instructor thinks is a fantastic book.