Managing memory is a major part of programming in C. You have used malloc() and free() in the recent labs, and you have also built a very basic memory allocator. It is now time to build a more advanced one. In this assignment, you will implement a memory allocator that lets users malloc() and free() memory as needed. Your allocator will request large chunks of memory from the OS and efficiently manage all the bookkeeping and memory. The allocator you will implement is inspired by the DLMalloc allocator designed by Doug Lea, which in turn inspired the PTMalloc allocator that glibc uses today. Our allocator is a simplified version of DLMalloc, and you will notice many similarities between the two.
We hope that the last two labs have motivated the need for dynamic memory allocators.
Specifically, we have seen that while it is certainly possible to use the low-level mmap and munmap functions to manage areas of virtual memory, programmers need the convenience and efficiency of more fine-grained memory allocators. If we managed the memory from the OS ourselves, we could allow allocating and freeing variables in any order, and also reuse memory for other variables.
The last lab taught you how to build an implicit free list allocator for managing free blocks. In this assignment, we will first build a more efficient free list data structure called an explicit free list, and then perform a number of optimizations.
Explicit free list
With an implicit free list, block allocation time is linear in the total number of heap blocks, which is not suitable for a high-performance allocator. We can add next and previous pointers to each block's metadata so that we can iterate over only the unallocated blocks. The resulting linked-list data structure is called an explicit free list. Using a doubly linked explicit free list instead of an implicit list reduces the first-fit allocation time from linear in the total number of blocks to linear in the number of free blocks.
Dealing with memory fragmentation
Fragmentation occurs when otherwise unused memory is not available to satisfy allocation requests. This happens because when we split large blocks into smaller ones to fulfill user requests for memory, we end up with many small blocks. However, some of those blocks may be able to be merged back into a larger block. Addressing this issue requires checking whether the block we are trying to free is adjacent to another already-free block. If neighboring blocks are free, we can coalesce them into a single larger block.
Dealing with the edges of chunks
One detail we must consider is how to handle the edges of the chunks from the OS. If we simply start the first allocable block at the beginning of the memory chunk, then we may run into problems when trying to free the block later. This is because a block at the edge of the chunk is missing a neighbor. A simple solution to this is to insert a pair of fenceposts at either end of the chunk. The fencepost is a dummy header containing no allocable memory, but which serves as a neighbor to the first and last allocable blocks in the chunk.
Now we can look up the neighbors of those blocks and don’t have to worry about accidentally coalescing outside of the memory chunk allocated by the OS, because anytime one of the neighbors is a fencepost we cannot coalesce in that direction.
We will also perform the following optimizations as part of the assignment to improve the space and time complexity of our memory allocator.
Reducing the Metadata Footprint
- Naive solution: In our description of the explicit free list above, we assume the memory allocated to the user begins after all of the block’s metadata. We must maintain the metadata like size and allocation status because we need it in the block’s header when we free the object.
- Optimization 1: While we need to maintain the size and allocation status, we only use the free list pointers when the object is free. If the object has been allocated, it is no longer in a free list; thus, the memory used to store the pointers can be used for other purposes. By placing the next and previous pointers at the end of the metadata, we can save an additional 2 * sizeof(pointer) bytes and add that to the memory allocated to the user.
- Optimization 2: The allocated flag that records whether a block is allocated uses only one bit. Since sizes are rounded up to the next multiple of 8 bytes, the low three bits of the size are always zero. Instead of using a separate boolean to store the allocated flag, we can use one of those unused bits in the size field. That saves an additional 8 bytes.
Constant Time Coalesce
- Naive solution: We mentioned above that we could iterate over the free list to find blocks that are next to each other, but unfortunately, that makes the free operation O(n), where n is the number of blocks in the list.
- Optimized solution: The solution we will use is to add another data structure called Boundary Tags, which allows us to calculate the location of the right and left blocks in memory. To calculate the location of the block to the right, all we need to know is the size of the current block. To calculate the location of the block to the left, we must also maintain the size of the block to the left in each block’s metadata. Now we can find the neighboring blocks in O(1) time instead of O(n).
Multiple Free Lists
- Naive solution: So far, we have assumed a single free list containing all free blocks. To satisfy a request, we must iterate over the blocks until we find one that is large enough.
- Optimized solution: We can use multiple free lists. We create n free lists, one for each allocation size (8, 16, …, 8*(n-1), 8*n bytes). That way, when a user requests memory, we can jump directly to the list of blocks of the correct size instead of searching a general list. If that list is empty, the next non-empty list contains the block that best fits the allocation request. However, we only have n lists, so if the user requests 8*n bytes of memory or more, we fall back to the naive approach and scan the final list for a block that can satisfy the request. This optimization cannot guarantee O(1) allocation time for all allocations. Still, for any allocation under 8*n bytes, the allocation time is O(number of free lists) rather than O(length of the free list).
Getting additional chunks from the OS
The allocator may be unable to find a fit for the requested block. If the free blocks are already maximally coalesced, the allocator asks the kernel for additional heap memory by calling mmap. The allocator transforms the additional memory into one large free block, inserts the block into the free list, and then places the requested block in this new free block.
As we know already, when an application requests a block of k bytes, the allocator searches the free list for a free block that is large enough to hold the requested block.
The placement policy dictates the manner in which the allocator performs this search. There are three popular policies:
- First fit: Search the free list from the beginning and choose the first free block that fits.
- Next fit: Similar to first fit, but start each search where the previous one left off.
- Best fit: Examine every free block and choose the free block with the smallest size that fits.