* fix wording about third_alloc() in "Heap Allocation"
* update wording in "Heap Allocation"
* fix incorrect variable name in "Heap Allocation"
* fix typos in "Heap Allocation"
One thing that might have been noticed so far is that for keep track of all those new information we are adding an overhead to our allocator. How big this overhead is depends on the size of the variables we use in the chunk headers (where we store the alloc size and status). Even if we keep things small by only using `uint8_t`, we have already added 2 bytes of overhead for every single allocation.
214
+
One thing that might have been noticed so far is that in order to keep track of all those new information, we are adding an overhead to our allocator. How big this overhead is depends on the size of the variables we use in the chunk headers (where we store the alloc size and status). Even if we keep things small by only using `uint8_t`, we have already added 2 bytes of overhead for every single allocation.
215
215
The implementation above is not complete yet, since we haven't implemented a mechanism to reuse freed locations, but before adding this last piece let's talk about freeing memory.
216
216
217
217
Now we know that given a pointer `ptr` (previously allocated from our heap, of course) `ptr - 1` is the status (and should be USED) and `ptr - 2` is the size.
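The pointer arithmetic above can be sketched as follows. This is only an illustration, assuming a two-byte header made of a `uint8_t` size followed by a `uint8_t` status; the values of `FREE` and `USED` and the names `chunk_size`, `chunk_status` and `simple_free` are hypothetical and not part of the original text.

```c
#include <stddef.h>
#include <stdint.h>

#define FREE 0  /* assumed encoding of the status byte */
#define USED 1

/* Hypothetical helper: the size byte lives two bytes before ptr. */
static uint8_t chunk_size(const void *ptr) {
    return *((const uint8_t *)ptr - 2);
}

/* Hypothetical helper: the status byte lives one byte before ptr. */
static uint8_t chunk_status(const void *ptr) {
    return *((const uint8_t *)ptr - 1);
}

/* Sketch of a free(): flips the status byte and leaves the size untouched,
   so the chunk can later be found again and reused. */
static void simple_free(void *ptr) {
    if (ptr == NULL) {
        return;
    }
    *((uint8_t *)ptr - 1) = FREE;
}
```

Note that this sketch does not validate `ptr`; a more defensive version could refuse to free a chunk whose status is not `USED`.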
@@ -373,9 +373,9 @@ So now our heap node will look like the following in memory:
373
373
|----|------|-------|------|-----|
374
374
| 6 | F/U | PREV | NEXT | X |
375
375
376
-
As mentioned earlier using the double linked list the check for merge-ability is more straightforward. For example to check if we can merge with the left node we just need to check the status of the node pointed by the prev field, if it is freer than they can be merged. To merge with the previous node would apply the logic below to `node->prev`:
376
+
As mentioned earlier using the double linked list the check for merge-ability is more straightforward. For example to check if we can merge with the left node we just need to check the status of the node pointed by the prev field, if it is free then they can be merged. To merge with the previous node, we would apply the logic below to `node->prev`:
377
377
378
-
* Update the `size`its, adding to it the size of cur_node
378
+
* Update the `size`field, adding to it the size of cur_node
379
379
* Update the `next` pointer to point to cur_node->next
380
380
381
381
Referring to the next node:
@@ -389,7 +389,7 @@ Of course merging with the right node is the opposite (update the size and the p
389
389
Below is a pseudocode example of how to merge left:
390
390
391
391
```c
392
-
Heap_Node *prev_node = cur_node->prev //cur_pointer is the node we want to check if can be merged
392
+
Heap_Node *prev_node = cur_pointer->prev //cur_pointer is the node we want to check if can be merged
393
393
if (prev_node != NULL && prev_node->status == FREE) {
394
394
// The prev node is free, and cur node is going to be freed so we can merge them
What we're describing here is the left node being "swallowed" by the right one, and growing in size. The memory that the left node owns and is responsible for is now part of the right oneTo make it easier to understand, consider the portion of a hypothetical heap in the picture below:
403
+
What we're describing here is the left node being "swallowed" by the right one, and growing in size. The memory that the left node owns and is responsible for is now part of the right one. To make it easier to understand, consider the portion of a hypothetical heap in the picture below:
404
404
405
405

406
406
@@ -434,8 +434,8 @@ Now `alloc()` is called again, like so:
434
434
alloc(10);
435
435
```
436
436
The allocator is going to look for the first node it can return that is at least 10 bytes. Using the example from above, this will be the first node. Everything looks fine, except that we've just returned 150 bytes for a 10 byte allocation (i.e. ~140 bytes of memory is wasted). There are a few ways to approach this problem:
437
-
- The first solution that comes to mind if to scan the entire heap each time and use the smallest (but still big enough for the requested size) node. This is better, but the downside (speed) should be obvious. This will also still not work as well as the second solution because in the above the example it would still return 150 bytes.
438
-
- What we're going to do is 'cut' the space that we need from the node, splitting it into 2 new nodes. The first node will be the request allocation size, and is used to fulfill that. The other will be kept as a free node, inserted into the linked list.
437
+
* The first solution that comes to mind is to scan the entire heap each time and use the smallest (but still big enough for the requested size) node. This is better, but the downside (speed) should be obvious. This will also still not work as well as the second solution because in the above example it would still return 150 bytes.
438
+
* What we're going to do is 'cut' the space that we need from the node, splitting it into 2 new nodes. The first node will be the request allocation size, and is used to fulfill that. The other will be kept as a free node, inserted into the linked list.
439
439
440
440
The workflow will be the following:
441
441
@@ -446,12 +446,12 @@ The workflow will be the following:
446
446
* One edge case to be aware of: if the node that was split was the last node of the heap, the `heap_tail` variable should be updated as well, if it is being used (this depends on design decisions).
447
447
448
448
449
-
After that the allocator can compute the address to return using `(uintptr_t)cur_node + sizeof(Heap_node)`, since we want to return the memory *after* the node, not the node itself (otherwise the program would put data there and overwrite what we've stored there!).
449
+
After that the allocator can compute the address to return using `(uintptr_t)cur_node + sizeof(Heap_node)`, since we want to return the memory *after* the node, not the node itself (otherwise the program would put data there and overwrite what we've stored!).
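As a small sketch of that address computation, assuming a `Heap_Node` layout that follows the memory diagram shown earlier (size, status, prev, next); the helper names `node_to_payload` and `payload_to_node` are made up for this example:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed node layout, mirroring the diagram: size, status, prev, next. */
typedef struct Heap_Node {
    size_t size;
    uint8_t status;
    struct Heap_Node *prev;
    struct Heap_Node *next;
} Heap_Node;

/* Address handed to the caller: the first byte after the node header. */
static void *node_to_payload(Heap_Node *cur_node) {
    return (void *)((uintptr_t)cur_node + sizeof(Heap_Node));
}

/* Inverse mapping, handy when freeing: step back from the payload to its header. */
static Heap_Node *payload_to_node(void *ptr) {
    return (Heap_Node *)((uintptr_t)ptr - sizeof(Heap_Node));
}
```

Keeping both directions in one place makes it harder for the allocation and free paths to disagree about where the header lives.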
450
450
451
451
Before wrapping up, there are a few things worth pointing out about implementing splitting:
452
452
453
-
* Remember that every node has some overhead, so when splitting we shouldn't have nodes smaller (or equal to) than`sizeof(Heap_Node)`, because otherwise they will never be allocated.
454
-
* It's a good idea to have a minimum size for the memory a chunk can contain, to avoid having a large number of nodes and for easy alignment later on. For example if the `minimum_allocatable_size` is 0x20 bytes, and we want to allocate 5 bytes, we will still receive a memory block of `0x20` bytes. The program may not know it was returned `0x20` bytes, but that is okay. What exactly value should be used for it is implementation specific, values of `0x10` and `0x20` are popular.
453
+
* Remember that every node has some overhead, so when splitting we shouldn't have nodes smaller than (or equal to) `sizeof(Heap_Node)`, because otherwise they will never be allocated.
454
+
* It's a good idea to have a minimum size for the memory a chunk can contain, to avoid having a large number of nodes and for easy alignment later on. For example if the `minimum_allocatable_size` is 0x20 bytes, and we want to allocate 5 bytes, we will still receive a memory block of `0x20` bytes. The program may not know it was returned `0x20` bytes, but that is okay. What exact value should be used for it is implementation specific, values of `0x10` and `0x20` are popular.
455
455
* Always remember that there is the memory footprint of `sizeof(Heap_Node)` bytes while computing sizes that involve multiple nodes. If we decide to include the overhead size in the node's size, remember to also subtract it when checking for suitable nodes.
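The points above can be put together in a rough sketch of a split routine. It rests on several assumptions that are not in the original text: a node's `size` counts only the payload (not the header), requests are rounded up to a multiple of `MINIMUM_ALLOCATABLE_SIZE` (taken here as `0x20`), and the names `round_request` and `split_node` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define FREE 0
#define USED 1
#define MINIMUM_ALLOCATABLE_SIZE 0x20  /* assumed value; must be a power of two here */

typedef struct Heap_Node {
    size_t size;              /* payload bytes only (assumption) */
    uint8_t status;
    struct Heap_Node *prev;
    struct Heap_Node *next;
} Heap_Node;

/* Round a request up to a multiple of the minimum size, never returning less than it. */
static size_t round_request(size_t size) {
    if (size < MINIMUM_ALLOCATABLE_SIZE) {
        return MINIMUM_ALLOCATABLE_SIZE;
    }
    return (size + MINIMUM_ALLOCATABLE_SIZE - 1) & ~(size_t)(MINIMUM_ALLOCATABLE_SIZE - 1);
}

/* Split cur_node so it keeps exactly `size` payload bytes and the rest becomes
   a new free node. Returns 1 on a split, 0 if the remainder would be too small. */
static int split_node(Heap_Node *cur_node, size_t size) {
    /* The remainder must fit a header plus at least the minimum payload. */
    if (cur_node->size < size + sizeof(Heap_Node) + MINIMUM_ALLOCATABLE_SIZE) {
        return 0;
    }
    Heap_Node *new_node = (Heap_Node *)((uintptr_t)cur_node + sizeof(Heap_Node) + size);
    new_node->size = cur_node->size - size - sizeof(Heap_Node);
    new_node->status = FREE;
    new_node->prev = cur_node;
    new_node->next = cur_node->next;
    if (cur_node->next != NULL) {
        cur_node->next->prev = new_node;
    }
    /* If new_node->next is NULL, the caller should also update heap_tail. */
    cur_node->next = new_node;
    cur_node->size = size;
    return 1;
}
```

When `split_node` returns 0 the caller simply hands out the whole node, accepting a little wasted space instead of creating a node that could never be allocated.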
456
456
457
457
And that's it!
@@ -476,7 +476,7 @@ void initialize_heap() {
476
476
}
477
477
```
478
478
479
-
Now the question is, how do we choose the starting address? This really is arbitrary. We can pick any address that we like, but there are a few constraints that we should follow:
479
+
Now the question is, how do we choose the starting address? This really is arbitrary. We can pick any address that we like, but there are a few constraints that we should follow:
480
480
481
481
* Some memory is used by the kernel, and we don't want to overwrite anything with our heap, so let's make sure that the area we are going to use is free.
482
482
* When paging is enabled, the kernel is often moved to one half of the memory space (usually referred to as HIGHER_HALF and LOWER_HALF), so when deciding the initial address we should place it in the correct half: if the kernel is placed in the HIGHER half and we are implementing the kernel heap, it should go in the HIGHER half, and if it is for the user space heap it will go in the LOWER half.
@@ -491,14 +491,14 @@ And that's it, that is how the heap is initialized with a single node. The first
491
491
492
492
One final part, which will be explained only briefly, is what happens when we reach the end of the heap. Imagine the following scenario: we have done a lot of allocations, most of the heap nodes are used, and the few usable nodes are small. The next allocation request will fail to find a suitable node because the requested size is bigger than any free node available. Now the allocator has searched through the heap, and reached the end without success. What happens next? Time to expand the heap by adding more memory to the end of it.
493
493
494
-
Here is where the virtual memory manager will join the game. Roughly what will is:
494
+
Here is where the virtual memory manager will join the game. Roughly what will happen is:
495
495
496
496
* The heap allocator will first check if we have reached the end of the address space available (unlikely).
497
497
* If not, it will ask the VMManager to map a number of pages (exact number depends on implementation) at the address starting from `heap_end + heap_end->size + sizeof(heap_node)`.
498
498
* If the mapping fails, the allocation will fail as well (i.e. out of memory/OOM; this is an issue to solve in its own right).
499
499
* If the mapping is successful, then we have just created a new node to be appended to the current end of the heap. Once this is done we can proceed with the split if needed.
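The expansion steps above might look roughly like the sketch below. Everything about the VMM interface is an assumption: `map_pages_fn` stands in for whatever call your virtual memory manager exposes, `HEAP_PAGE_SIZE` is an assumed page size, and `fake_map_pages` is only a stand-in so the sketch can be tried outside a kernel. The check for the end of the address space is left out for brevity.

```c
#include <stddef.h>
#include <stdint.h>

#define FREE 0
#define HEAP_PAGE_SIZE 0x1000  /* assumed page size */

typedef struct Heap_Node {
    size_t size;              /* payload bytes only (assumption) */
    uint8_t status;
    struct Heap_Node *prev;
    struct Heap_Node *next;
} Heap_Node;

/* Hypothetical VMM hook: maps `pages` pages starting at `addr`, returns 0 on success. */
typedef int (*map_pages_fn)(void *addr, size_t pages);

/* Stand-in mapper with a small page budget, for experimenting in user space only. */
static size_t fake_pages_available = 1;
static int fake_map_pages(void *addr, size_t pages) {
    (void)addr;
    if (pages > fake_pages_available) {
        return -1;
    }
    fake_pages_available -= pages;
    return 0;
}

/* Map more pages right after the current tail and append them as one free node.
   Returns the new tail, or NULL if the mapping failed (out of memory). */
static Heap_Node *expand_heap(Heap_Node *heap_tail, size_t pages, map_pages_fn map_pages) {
    void *new_start = (void *)((uintptr_t)heap_tail + sizeof(Heap_Node) + heap_tail->size);
    if (map_pages(new_start, pages) != 0) {
        return NULL;
    }
    Heap_Node *new_node = (Heap_Node *)new_start;
    new_node->size = pages * HEAP_PAGE_SIZE - sizeof(Heap_Node);
    new_node->status = FREE;
    new_node->prev = heap_tail;
    new_node->next = NULL;
    heap_tail->next = new_node;
    return new_node;
}
```

If the old tail happens to be free, merging it with the new node before retrying the allocation avoids leaving two adjacent free nodes at the end of the heap.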
500
500
501
-
And with that we're just written a fairly complete heap allocator.
501
+
And with that we've just written a fairly complete heap allocator.
502
502
503
503
A final note: in these examples we're not zeroing the memory returned by the heap, which languages like C++ may expect when the `new` and `delete` operators are used. This can lead to non-deterministic bugs where objects may be initialized with leftover values from previous allocations (if the memory has been used before), and suddenly default construction is not doing what is expected.
504
504
Doing a `memset()` on each block of memory returned does cost CPU time, so it's a trade-off: a decision to be made for your specific implementation.
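If zeroing is wanted, one possible shape is a thin wrapper applied to whatever the allocator returns. The name `zero_block` is hypothetical, and the sketch assumes a `memset()` is available (in a freestanding kernel you would normally supply your own).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: clear a freshly allocated block before handing it out.
   A NULL pointer (failed allocation) is passed through untouched. */
static void *zero_block(void *ptr, size_t size) {
    if (ptr != NULL) {
        memset(ptr, 0, size);
    }
    return ptr;
}
```

A call such as `zero_block(alloc(size), size)` keeps the fast, non-zeroing path available for callers that do not need cleared memory.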