CS360 Lecture notes -- Malloc Lecture #1

  • James S. Plank
  • Directory: /home/plank/cs360/notes/Malloc1
  • Lecture notes: http://web.eecs.utk.edu/~jplank/plank/classes/cs360/360/notes/Malloc1/lecture.html
  • Original Lecture Notes: 1996
  • Most recent modification: Thu Mar 8 10:29:50 EST 2018
  • The material Stephen Marz used when he taught this lecture in 2017.
    Caveat from 2001: Read the lecture before trying these programs. Unfortunately, the implementation of malloc() and free() on our lab machines will not match the description in these notes 100 percent. That is because these notes were originally written in the mid-1990's and malloc implementations seem to change every few years.

    I stick with this description because it is clear. However, it is not how things really work these days.


    Caveat from 2014: Also, sbrk() and brk() are no longer used to implement malloc(). Instead, the system call is mmap(). I don't think the complexities of mmap() are worth exploring in CS360. You may want to explore them on your own. The first important thing is for you to realize that malloc() is a buffered interface on top of a system call that allows the operating system to give you memory. In these lectures, that system call is sbrk(), although in reality it is mmap().

    If you compile this code to run it, you may get a pack of warnings about sbrk() being deprecated. Ignore them.

    The second important thing is that malloc() has to do bookkeeping somehow, and one way to do that is to use the bytes before the pointer that it returns. That is the topic of this lecture.


    This is a lecture about sbrk() and malloc().

    Last class we went over some general memory stuff -- we learned that the last address in the code segment is &etext, and the last address in the globals segment is &end. As the program runs, and memory is allocated from the heap using malloc(), the heap grows. To figure out the boundary of the heap, we must use brk() or sbrk(). Both are system calls, and you can read their man pages. We will only discuss sbrk() as it is the only call you will need.

    caddr_t sbrk(int incr);
    
    A caddr_t is a "c address pointer". It is the same as a (char *) or a (void *).

    This asks the operating system to give incr more bytes to the heap. It returns a pointer to the end of the heap as it was before sbrk() was called. Thus, the new end of the heap after an sbrk() call is at address

        sbrk(incr) + incr;
    
    If you call sbrk(0), then it returns the current end of the heap.

    Now, malloc() (and the related routines realloc() and calloc()) all call sbrk() to get the memory to allocate in the heap. They are the only library routines that call sbrk(). Thus, the only way that you can get memory in the heap is through malloc() or sbrk(). However, you should use malloc(), because it buffers memory and therefore makes far fewer system calls.


    Let's try it out. Look at fb2.c:

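    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>       /* declares sbrk() */
    #include <sys/types.h>
    
    int main()
    {
      int *i1, *i2;
    
      /* Pointers are cast to unsigned long and printed with %lx, because
         %x expects an unsigned int and would truncate a 64-bit address. */
      printf("sbrk(0) before malloc(4): 0x%lx\n", (unsigned long) sbrk(0));
    
      i1 = (int *) malloc(4);
      printf("sbrk(0) after `i1 = (int *) malloc(4)': 0x%lx\n", (unsigned long) sbrk(0));
    
      i2 = (int *) malloc(4);
      printf("sbrk(0) after `i2 = (int *) malloc(4)': 0x%lx\n", (unsigned long) sbrk(0));
    
      printf("i1 = 0x%lx, i2 = 0x%lx\n", (unsigned long) i1, (unsigned long) i2);
      return 0;
    }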
    

    This prints sbrk(0) before and after some malloc() calls. Here's the result of running it on a Linux box in 2014:

    UNIX> ./fb2
    sbrk(0) before malloc(4): 0x21ab0
    sbrk(0) after `i1 = (int *) malloc(4)': 0x23ab0
    sbrk(0) after `i2 = (int *) malloc(4)': 0x23ab0
    i1 = 0x21ac0, i2 = 0x21ad0
    UNIX>
    
    As you can see, the first malloc() call changed the return value of sbrk(0). The second one did not. You'll also note that the difference between the first two sbrk(0) calls is

    (0x23ab0 - 0x21ab0 = 0x2000 = 8K)

    This number will be different from system to system. On my Raspberry Pi in 2018, it was 33K. However all systems will share one feature: The number is a lot bigger than the four bytes that we asked for. The reason it does that is buffering. Since sbrk() is a system call, it is expensive. By calling it with a large number, malloc() can satisfy a lot of requests for smaller blocks of memory with just one system call to sbrk().

    To repeat -- malloc() calls sbrk() with a large number which it treats as a buffer. Then it doles out memory from this buffer. After i1 and i2 are allocated, there is still a whole bunch of memory -- from 0x21ad0 to 0x23ab0 -- that malloc() can use before calling sbrk() again. This is roughly 8160 bytes. Thus, in fb2a.c, when we do a malloc(8164) after allocating i1 and i2, we expect to see that sbrk() was called to get more memory, and indeed this is the case:

    UNIX> ./fb2a
    sbrk(0) before malloc(4): 0x21b68
    sbrk(0) after `i1 = (int *) malloc(4)': 0x23b68
    sbrk(0) after `i2 = (int *) malloc(4)': 0x23b68
    i1 = 0x21b78, i2 = 0x21b88, sbrk(0)-i2 = 8160
    sbrk(0) after `i3 = (int *) malloc(8164)': 0x25b68
    i3 = 0x21f78
    UNIX>
    
    Now, look at fb3.c. This calls malloc(4) 10 times and prints out the memory allocated:

    #include <stdio.h>
    #include <stdlib.h>
    
    int main()
    {
      int j, *buf;
    
      for (j = 0; j < 10; j++) {
        buf = (int *) malloc(4);
        /* Cast to unsigned long so that a 64-bit address is not truncated. */
        printf("malloc(4) returned 0x%lx\n", (unsigned long) buf);
      }
      return 0;
    }
    

    UNIX> ./fb3
    malloc(4) returned 0x219d0
    malloc(4) returned 0x219e0
    malloc(4) returned 0x219f0
    malloc(4) returned 0x21a00
    malloc(4) returned 0x21a10
    malloc(4) returned 0x21a20
    malloc(4) returned 0x21a30
    malloc(4) returned 0x21a40
    malloc(4) returned 0x21a50
    malloc(4) returned 0x21a60
    UNIX>
    
    You'll note that each return value from malloc() is 16 bytes greater than the previous one. You might expect it to be only 4 bytes greater since it is only allocating 4 bytes. What is happening is that malloc() allocates some extra bytes each time it is called so that it can do bookkeeping. These extra bytes help out when free() is called. These extra bytes are often allocated before the returned memory. You'll see why when we start to look at free().

    Look at fb4.c. It allocates a whole bunch of memory regions using malloc(), and then prints out their starting addresses, and the values that are located one and two integers before the starting addresses. Again, this is the kind of code which (for good reason) most programmers deem "unsafe". However, it's the only way to check out these things. As you can see, the integer located two integers before the return value from malloc() contains how many bytes were actually allocated. This is a little confusing, so let's look at the output of fb4 in detail. (Again, on different systems, malloc() works in different ways. I illustrate some examples in this note.)

    UNIX> ./fb4
    sbrk(0) = 0x61a0
    Allocated 4 bytes.  buf = 0x61a8, buf[-1] = 0, buf[-2] = 16, buf[0] = 1000
    Allocated 8 bytes.  buf = 0x61b8, buf[-1] = 0, buf[-2] = 16, buf[0] = 1001
    Allocated 12 bytes.  buf = 0x61c8, buf[-1] = 0, buf[-2] = 24, buf[0] = 1002
    Allocated 16 bytes.  buf = 0x61e0, buf[-1] = 0, buf[-2] = 24, buf[0] = 1003
    Allocated 20 bytes.  buf = 0x61f8, buf[-1] = 0, buf[-2] = 32, buf[0] = 1004
    Allocated 24 bytes.  buf = 0x6218, buf[-1] = 0, buf[-2] = 32, buf[0] = 1005
    Allocated 28 bytes.  buf = 0x6238, buf[-1] = 0, buf[-2] = 40, buf[0] = 1006
    Allocated 100 bytes.  buf = 0x6260, buf[-1] = 0, buf[-2] = 112, buf[0] = 1007
    sbrk(0) = 0x70f8
    UNIX>
    
    So, look at the heap after the first call to malloc(), once buf[0] has been set to i = 1000:
             |---------------|  
             |      ...      | 
             |               |      
             |      16       | 0x61a0
             |               | 0x61a4     
             |     1000      | 0x61a8  <--------- return value
             |               | 0x61ac
             |               | 0x61b0
             |               | 0x61b4
             |      ...      |      
             |               |      
             |               |      
             |               |      
             |---------------| 0x70f8 (sbrk(0));
    
    When malloc() is called a second time (buf = malloc(8)), malloc() returns 0x61b8. After buf[0] is set to i = 1001, the heap looks as follows:
             |---------------|  
             |      ...      | 
             |               |      
             |      16       | 0x61a0
             |               | 0x61a4     
             |     1000      | 0x61a8  
             |               | 0x61ac
             |      16       | 0x61b0
             |               | 0x61b4
             |     1001      | 0x61b8  <--------- return value
             |               | 0x61bc
             |               | 0x61c0
             |               | 0x61c4
             |      ...      |      
             |               |      
             |               |      
             |---------------| 0x70f8 (sbrk(0));
    
    And so on -- when the final sbrk(0) is called, the heap looks as follows:
             |---------------| 
             |      ...      |
             |               | 
             |      16       | 0x61a0
             |               | 0x61a4
             |     1000      | 0x61a8
             |               | 0x61ac
             |      16       | 0x61b0
             |               | 0x61b4
             |     1001      | 0x61b8 
             |               | 0x61bc
             |      24       | 0x61c0
             |               | 0x61c4
             |     1002      | 0x61c8 
             |               | 0x61cc
             |               | 0x61d0
             |               | 0x61d4
             |      24       | 0x61d8 
             |               | 0x61dc
             |     1003      | 0x61e0
             |               | 0x61e4
             |               | 0x61e8 
             |               | 0x61ec
             |      32       | 0x61f0
             |               | 0x61f4
             |     1004      | 0x61f8 
             |               | 0x61fc
             |               | 0x6200
             |               | 0x6204
             |               | 0x6208 
             |               | 0x620c
             |      32       | 0x6210
             |               | 0x6214
             |     1005      | 0x6218 
             |               | 0x621c
             |               | 0x6220
             |               | 0x6224
             |               | 0x6228 
             |               | 0x622c
             |      40       | 0x6230
             |               | 0x6234
             |     1006      | 0x6238 
             |               | 0x623c
             |               | 0x6240
             |               | 0x6244
             |               | 0x6248 
             |               | 0x624c
             |               | 0x6250
             |               | 0x6254
             |     112       | 0x6258 
             |               | 0x625c
             |     1007      | 0x6260
             |               | 0x6264
             |      ...      |
             |               |
             |               |
             |---------------| 0x70f8 (sbrk(0));
    
    Once again, malloc() calls sbrk() to get memory from the operating system into a buffer, and then it doles out the memory from that buffer as it is called. When it runs out of buffer space, it calls sbrk() to get more.

    Why do malloc(4) and malloc(8) allocate 16 bytes, and malloc(12) and malloc(16) allocate 24? Because malloc() pads out the memory allocated to multiples of 8 bytes. Thus malloc(4) and malloc(8) allocate 8 bytes for the user, plus an extra 8 bytes for bookkeeping. Malloc(12) and malloc(16) allocate 16 bytes for the user, plus an extra 8 bytes for bookkeeping for a total of 24 bytes. Malloc(100) allocates 104 bytes for the user, plus an extra 8 bytes for bookkeeping.

    Why does malloc() perform this padding? So that the addresses returned will be multiples of eight, and thus will be valid for pointers of any type. Suppose malloc() didn't do this, and instead could return any pointer. Then if you did the following:

      int *i;
    
      i = (int *) malloc(4);
      *i = 4;
    
    you might generate a bus error, because malloc() may return a value that is not a multiple of 4. As it is, malloc() returns multiples of eight, so that pointers to doubles and long integers will not cause bus errors.

    How does malloc() know where to dole memory from? It uses a global variable or two. For example, it may have two global variables defined as follows:

    char *malloc_begin = NULL;
    char *malloc_end = NULL;
    
    When malloc() is called, it first checks to see if malloc_begin equals NULL. If so, it calls sbrk() to get a buffer. It uses malloc_begin and malloc_end to denote the beginning and end of that buffer. As malloc() gets called further, it doles out memory from the beginning of the buffer, and updates malloc_begin accordingly. If there is not enough room in the buffer, then sbrk() is called to get more buffer space, and malloc_end is incremented to denote the enlarged buffer.

    Now, this describes how to write a simple malloc() with no free() calls. When free() gets called, you should have malloc() be able to reuse that memory. This means that you have to do something more sophisticated with malloc(). We'll talk about it in Malloc lecture #2. Think about it in the meantime.