This project involved developing a dynamic memory allocator as part of the malloc lab, which required implementing the malloc, free, realloc, and calloc functions. The primary goal was to create an allocator that is both correct and efficient while maximizing throughput and utilization.

Abstract

This project involved developing a dynamic memory allocator as part of the malloc lab, which required implementing the malloc, free, realloc, and calloc functions. The development process included converting to an explicit-list allocator, implementing a segregated free list, and exploring techniques to minimize block size. Throughout the lab, various strategies were tested, including different block management designs and memory allocation techniques.

Commands to Execute

-mdriver -C
-checkpoint grade command: make submit -checkpoint
-mdriver-emulate
-mdriver-dbg
-./mdriver-dbg -f ./traces/syn-array-short.rep

Roadmap

Improving Throughput
Convert to an explicit-list allocator.
Convert to a segregated free list allocator.
These changes should not significantly affect utilization.

  1. function to implement
    a. coalesece_block()

    b. write mm_checkheap()

    1. check header alignment √
    2. check block loop case √
    3. try to print out header and footer/ to ensure they are consistent √
    4. check coalescing √
    5. check for epilogue and prologue blocks
  2. find the footer
    line670 asize = round_up(size + dsize, dsize);
    dsize = 16 => footer is at the end of block

  3. explicit list

    1. design node struct: word_t* next, *prev; (use union to simplify code logics)
    2. implement insert_block(block_t *bp), remove_block(block_t *bp)
    3. set the minimum malloc size to 32 bytes (header, footer, next, prev).
    4. initialize with init_mm.
    5. call insert_block and remove_block during free and coalesce.
  4. segregated list

    1. since a 16-byte block requires footer adjustments, start by implementing the segregated list with sizes of 32 bytes and larger.
    2. modify only the insert_block function.
    3. for 16B size block, we can utilize the last 3 or 4 bit, addr | bit
      wrong: header and footer take 16B, if footer is resevered, no space for payload
    4. the choice of first-fit or best-fit can make a significant difference in utilization.

checkpoint: use first-fit and final use best-fit

  1. eliminating footer
    1. alloc: set next block 1 << 1
      free: unset next block 1 << 1 bit
    2. decompose write_block() into write_block_alloc() and write_block_free()
    3. renew checkheap()
    4. update fucntion
      alloc
      free
      find_fit

Q&A

  1. no space for footer?
    footer is only provided in spare blocks.
    should i add a footer manually? No.

Final minimum block design

the most difficult part of this lab is Decrease the minimum block size.

1
2
3
4
5
6
7
8
9
10
11
12
13
typedef struct mini_block {
union {
word_t header;
void *succ;
};

union {
word_t footer;
void *pred;
char payload[8];
};

} mini_block_t;

Other tricks to get full credits

  1. Since the allowed global memory is only 128 bytes, and 8 bytes are occupied by void *heap_start, 15 extra pointers are used to manage segregated lists. The block size for each list is somewhat tricky. Here is my final version of the block sizes for each list, where list_i stores blocks with sizes <= list_i and > list_i-1:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
#define SIZE_0 16
#define SIZE_1 32
#define SIZE_2 48
#define SIZE_3 80
#define SIZE_4 144
#define SIZE_5 240
#define SIZE_6 360
#define SIZE_7 528

#define SIZE_8 748
#define SIZE_9 1440
#define SIZE_10 2880
#define SIZE_11 5760
#define SIZE_12 11520
#define SIZE_13 23040
#define SIZE_14 46080
  1. Better fit algorithm is absolutely better.

Autograder output

Unused design

  1. Reduce Headers Below 8 Bytes:
    Set the header to be an int (4 bytes) for blocks where distance + block_size is smaller than max_int.
    The highest 16 bits are never used in a VM with 16x addressing, so we can utilize them

  2. Special Memory Regions for Small, Fixed-Size Blocks:
    Set up special regions of memory for small, fixed-size blocks.
    This should theoretically improve utility, but I found that it only improved by 0.1%. It’s possible I forgot to free them completely when the tiny_block_pool is emptied.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
/**
* the memory_pool version of block_t
* succ and prev are relative offset respect to header, must < 4096 - 24
*/
typedef struct _tiny_block {
union {
struct {
unsigned char lo;
unsigned char hi;
} tiny_header_chars;
short tiny_header;
};

union {
struct {
short succ;
short prev;
};
char payload[14];
};
} tiny_block_t;

/** @brief special designed block to store tiny data,
* for consistancy, follow the block design
*/
typedef struct _memory_pool_block {
/** @brief Header contains size fiexd size 4096
* + allocation flag take use of the third last bits
* [x, pre_alloc_bit, alloc_bit]: x=1 -> special memory pool
*/
word_t header;

void *next_pool;

/** @brief next free space: offset referred to this addr in this block,
* must < 4096 - 24 = 4072
*/
int next_offset;

/** @brief alignment padding
*/
char padding[2];

/** @brief higher 8 bits is the index, from 0 to 253
*/
union {
struct {
char lo;
char hi;
} tiny_header_chars;
short tiny_header;
};

// fixed size payload size = 14
char payload[0];

} memory_pool_block;

Lessons Learned

In this lab, i can try different types of design as i want. however, make a good design with as few bug as possible is not so easy. it takes me 3 or 4 days to finally settle down the final version of my mini_block. the last throwed design is struct {word_t header, word_t succ}, and i suppose to take use of an other link list to represent mini block. But there are several bugs and at that time it is hard to tell whether the bug lives in my link list or in my mini block. then i rollback to previous version and make my finally design. eventhough, i still meet some unforseen bugs as firstly a want to set length 16 in the header beside the last third bits. such behavior corrupt my succ pointer unexpectedly. after i throw length away, i finally pass without bugs. It is extremly essential to make a solid design before wrinting codes in such flexible tasks.

Before preceeding tasks, which will modify previous code or need to do some optimization, it is important to make my code consice and modole. if i modify my code repeatly, unfixed bugs accumulate and things become worse and worse when design become more and more complicated.

Unsure question

When make up a memory pool, what is the pre_alloc_state?
resolved, this block is strictly after a written block

Unfixed bug: when call extend_heap, we should write next block pre_alloc_state true
solution: call write_block_header() before write_block? line 1404