Introduction
ELF (Executable and Linkable Format) is the standard file format for object files, executables and shared libraries on Unix-like systems. This post gives a basic walk-through of the ELF file format and briefly explains how dynamic linking works. First, I present the structure of an object file and the basic data structures needed to parse an ELF file. Next, I present the structure of an executable file and show how dynamic linking works on a Unix-like system.
We will use the following sample code to illustrate the details of the ELF format.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    char *buf1, *buf2, *buf3;

    /* Bare return kept as in the original build: it compiles to the
       "jmp b9" (straight to leave) seen in the disassembly below. */
    if (argc != 4)
        return;

    buf1 = malloc(256);
    strcpy(buf1, argv[1]);
    buf2 = malloc(strtoul(argv[2], NULL, 16));
    buf3 = malloc(256);
    strcpy(buf3, argv[3]);

    free(buf3);
    free(buf2);
    free(buf1);
    return 0;
}
Object File
We can use the -c option of gcc to compile the sample code into an object file, and then use the hex command to dump the contents of the object file in hexadecimal (any hex dumper, such as xxd or hexdump -C, shows the same bytes).
$ gcc -c test1.c
$ hex test1.o
0x00000000: 7f 45 4c 46 01 01 01 00 - 00 00 00 00 00 00 00 00 ?ELFAAA@@@@@@@@@
0x00000010: 01 00 03 00 01 00 00 00 - 00 00 00 00 00 00 00 00 A@C@A@@@@@@@@@@@
0x00000020: 60 01 00 00 00 00 00 00 - 34 00 00 00 00 00 28 00 `A@@@@@@4@@@@@(@
0x00000030: 0a 00 07 00 55 89 e5 83 - e4 f0 83 ec 20 83 7d 08 J@G@U...... .}H
0x00000040: 04 74 05 e9 a5 00 00 00 - c7 04 24 00 01 00 00 e8 DtE@@@.D$@A@@.
0x00000050: fc ff ff ff 89 44 24 1c - 8b 45 0c 83 c0 04 8b 00 .....D$\.EL..D.@
0x00000060: 89 44 24 04 8b 44 24 1c - 89 04 24 e8 fc ff ff ff .D$D.D$\.D$....
0x00000070: 8b 45 0c 83 c0 08 8b 00 - c7 44 24 08 10 00 00 00 .EL..H.@.D$HP@@@
0x00000080: c7 44 24 04 00 00 00 00 - 89 04 24 e8 fc ff ff ff .D$D@@@@.D$....
0x00000090: 89 04 24 e8 fc ff ff ff - 89 44 24 18 c7 04 24 00 .D$.....D$X.D$@
0x000000a0: 01 00 00 e8 fc ff ff ff - 89 44 24 14 8b 45 0c 83 A@@.....D$T.EL.
0x000000b0: c0 0c 8b 00 89 44 24 04 - 8b 44 24 14 89 04 24 e8 .L.@.D$D.D$T.D$.
0x000000c0: fc ff ff ff 8b 44 24 14 - 89 04 24 e8 fc ff ff ff .....D$T.D$....
0x000000d0: 8b 44 24 18 89 04 24 e8 - fc ff ff ff 8b 44 24 1c .D$X.D$.....D$\
0x000000e0: 89 04 24 e8 fc ff ff ff - b8 00 00 00 00 c9 c3 00 .D$.....@@@@.@
0x000000f0: 00 47 43 43 3a 20 28 55 - 62 75 6e 74 75 20 34 2e @GCC: (Ubuntu 4.
0x00000100: 34 2e 33 2d 34 75 62 75 - 6e 74 75 35 2e 31 29 20 4.3-4ubuntu5.1) 
0x00000110: 34 2e 34 2e 33 00 00 2e - 73 79 6d 74 61 62 00 2e 4.4.3@@.symtab@.
0x00000120: 73 74 72 74 61 62 00 2e - 73 68 73 74 72 74 61 62 strtab@.shstrtab
0x00000130: 00 2e 72 65 6c 2e 74 65 - 78 74 00 2e 64 61 74 61 @.rel.text@.data
0x00000140: 00 2e 62 73 73 00 2e 63 - 6f 6d 6d 65 6e 74 00 2e @.bss@.comment@.
0x00000150: 6e 6f 74 65 2e 47 4e 55 - 2d 73 74 61 63 6b 00 00 note.GNU-stack@@
0x00000160: 00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00 @@@@@@@@@@@@@@@@
0x00000170: 00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00 @@@@@@@@@@@@@@@@
0x00000180: 00 00 00 00 00 00 00 00 - 1f 00 00 00 01 00 00 00 @@@@@@@@_@@@A@@@
0x00000190: 06 00 00 00 00 00 00 00 - 34 00 00 00 bb 00 00 00 F@@@@@@@4@@@.@@@
0x000001a0: 00 00 00 00 00 00 00 00 - 04 00 00 00 00 00 00 00 @@@@@@@@D@@@@@@@
0x000001b0: 1b 00 00 00 09 00 00 00 - 00 00 00 00 00 00 00 00 [@@@I@@@@@@@@@@@
0x000001c0: dc 03 00 00 48 00 00 00 - 08 00 00 00 01 00 00 00 .C@@H@@@H@@@A@@@
0x000001d0: 04 00 00 00 08 00 00 00 - 25 00 00 00 01 00 00 00 D@@@H@@@%@@@A@@@
0x000001e0: 03 00 00 00 00 00 00 00 - f0 00 00 00 00 00 00 00 C@@@@@@@.@@@@@@@
0x000001f0: 00 00 00 00 00 00 00 00 - 04 00 00 00 00 00 00 00 @@@@@@@@D@@@@@@@
0x00000200: 2b 00 00 00 08 00 00 00 - 03 00 00 00 00 00 00 00 +@@@H@@@C@@@@@@@
0x00000210: f0 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00 .@@@@@@@@@@@@@@@
0x00000220: 04 00 00 00 00 00 00 00 - 30 00 00 00 01 00 00 00 D@@@@@@@0@@@A@@@
0x00000230: 30 00 00 00 00 00 00 00 - f0 00 00 00 26 00 00 00 0@@@@@@@.@@@&@@@
0x00000240: 00 00 00 00 00 00 00 00 - 01 00 00 00 01 00 00 00 @@@@@@@@A@@@A@@@
0x00000250: 39 00 00 00 01 00 00 00 - 00 00 00 00 00 00 00 00 9@@@A@@@@@@@@@@@
0x00000260: 16 01 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00 VA@@@@@@@@@@@@@@
0x00000270: 01 00 00 00 00 00 00 00 - 11 00 00 00 03 00 00 00 A@@@@@@@Q@@@C@@@
0x00000280: 00 00 00 00 00 00 00 00 - 16 01 00 00 49 00 00 00 @@@@@@@@VA@@I@@@
0x00000290: 00 00 00 00 00 00 00 00 - 01 00 00 00 00 00 00 00 @@@@@@@@A@@@@@@@
0x000002a0: 01 00 00 00 02 00 00 00 - 00 00 00 00 00 00 00 00 A@@@B@@@@@@@@@@@
0x000002b0: f0 02 00 00 c0 00 00 00 - 09 00 00 00 07 00 00 00 .B@@.@@@I@@@G@@@
0x000002c0: 04 00 00 00 10 00 00 00 - 09 00 00 00 03 00 00 00 D@@@P@@@I@@@C@@@
0x000002d0: 00 00 00 00 00 00 00 00 - b0 03 00 00 29 00 00 00 @@@@@@@@.C@@)@@@
0x000002e0: 00 00 00 00 00 00 00 00 - 01 00 00 00 00 00 00 00 @@@@@@@@A@@@@@@@
0x000002f0: 00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00 @@@@@@@@@@@@@@@@
0x00000300: 01 00 00 00 00 00 00 00 - 00 00 00 00 04 00 f1 ff A@@@@@@@@@@@D@..
0x00000310: 00 00 00 00 00 00 00 00 - 00 00 00 00 03 00 01 00 @@@@@@@@@@@@C@A@
0x00000320: 00 00 00 00 00 00 00 00 - 00 00 00 00 03 00 03 00 @@@@@@@@@@@@C@C@
0x00000330: 00 00 00 00 00 00 00 00 - 00 00 00 00 03 00 04 00 @@@@@@@@@@@@C@D@
0x00000340: 00 00 00 00 00 00 00 00 - 00 00 00 00 03 00 06 00 @@@@@@@@@@@@C@F@
0x00000350: 00 00 00 00 00 00 00 00 - 00 00 00 00 03 00 05 00 @@@@@@@@@@@@C@E@
0x00000360: 09 00 00 00 00 00 00 00 - bb 00 00 00 12 00 01 00 I@@@@@@@.@@@R@A@
0x00000370: 0e 00 00 00 00 00 00 00 - 00 00 00 00 10 00 00 00 N@@@@@@@@@@@P@@@
0x00000380: 15 00 00 00 00 00 00 00 - 00 00 00 00 10 00 00 00 U@@@@@@@@@@@P@@@
0x00000390: 1c 00 00 00 00 00 00 00 - 00 00 00 00 10 00 00 00 \@@@@@@@@@@@P@@@
0x000003a0: 24 00 00 00 00 00 00 00 - 00 00 00 00 10 00 00 00 $@@@@@@@@@@@P@@@
0x000003b0: 00 74 65 73 74 31 2e 63 - 00 6d 61 69 6e 00 6d 61 @test1.c@main@ma
0x000003c0: 6c 6c 6f 63 00 73 74 72 - 63 70 79 00 73 74 72 74 lloc@strcpy@strt
0x000003d0: 6f 75 6c 00 66 72 65 65 - 00 00 00 00 1c 00 00 00 oul@free@@@@\@@@
0x000003e0: 02 08 00 00 38 00 00 00 - 02 09 00 00 58 00 00 00 BH@@8@@@BI@@X@@@
0x000003f0: 02 0a 00 00 60 00 00 00 - 02 08 00 00 70 00 00 00 BJ@@`@@@BH@@p@@@
0x00000400: 02 08 00 00 8c 00 00 00 - 02 09 00 00 98 00 00 00 BH@@.@@@BI@@.@@@
0x00000410: 02 0b 00 00 a4 00 00 00 - 02 0b 00 00 b0 00 00 00 BK@@.@@@BK@@.@@@
0x00000420: 02 0b 00 00 BK@@
To understand the meaning of the data displayed above, we use the readelf command to display the structure of the object file and the objdump command to get the assembly code of our sample program.
$ objdump -d test1.o

test1.o:     file format elf32-i386

Disassembly of section .text:

00000000 <main>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   83 e4 f0                and    $0xfffffff0,%esp
   6:   83 ec 20                sub    $0x20,%esp
   9:   83 7d 08 04             cmpl   $0x4,0x8(%ebp)
   d:   74 05                   je     14 <main+0x14>
   f:   e9 a5 00 00 00          jmp    b9 <main+0xb9>
  14:   c7 04 24 00 01 00 00    movl   $0x100,(%esp)
  1b:   e8 fc ff ff ff          call   1c <main+0x1c>
  20:   89 44 24 1c             mov    %eax,0x1c(%esp)
  24:   8b 45 0c                mov    0xc(%ebp),%eax
  27:   83 c0 04                add    $0x4,%eax
  2a:   8b 00                   mov    (%eax),%eax
  2c:   89 44 24 04             mov    %eax,0x4(%esp)
  30:   8b 44 24 1c             mov    0x1c(%esp),%eax
  34:   89 04 24                mov    %eax,(%esp)
  37:   e8 fc ff ff ff          call   38 <main+0x38>
  3c:   8b 45 0c                mov    0xc(%ebp),%eax
  3f:   83 c0 08                add    $0x8,%eax
  42:   8b 00                   mov    (%eax),%eax
  44:   c7 44 24 08 10 00 00    movl   $0x10,0x8(%esp)
  4b:   00
  4c:   c7 44 24 04 00 00 00    movl   $0x0,0x4(%esp)
  53:   00
  54:   89 04 24                mov    %eax,(%esp)
  57:   e8 fc ff ff ff          call   58 <main+0x58>
  5c:   89 04 24                mov    %eax,(%esp)
  5f:   e8 fc ff ff ff          call   60 <main+0x60>
  64:   89 44 24 18             mov    %eax,0x18(%esp)
  68:   c7 04 24 00 01 00 00    movl   $0x100,(%esp)
  6f:   e8 fc ff ff ff          call   70 <main+0x70>
  74:   89 44 24 14             mov    %eax,0x14(%esp)
  78:   8b 45 0c                mov    0xc(%ebp),%eax
  7b:   83 c0 0c                add    $0xc,%eax
  7e:   8b 00                   mov    (%eax),%eax
  80:   89 44 24 04             mov    %eax,0x4(%esp)
  84:   8b 44 24 14             mov    0x14(%esp),%eax
  88:   89 04 24                mov    %eax,(%esp)
  8b:   e8 fc ff ff ff          call   8c <main+0x8c>
  90:   8b 44 24 14             mov    0x14(%esp),%eax
  94:   89 04 24                mov    %eax,(%esp)
  97:   e8 fc ff ff ff          call   98 <main+0x98>
  9c:   8b 44 24 18             mov    0x18(%esp),%eax
  a0:   89 04 24                mov    %eax,(%esp)
  a3:   e8 fc ff ff ff          call   a4 <main+0xa4>
  a8:   8b 44 24 1c             mov    0x1c(%esp),%eax
  ac:   89 04 24                mov    %eax,(%esp)
  af:   e8 fc ff ff ff          call   b0 <main+0xb0>
  b4:   b8 00 00 00 00          mov    $0x0,%eax
  b9:   c9                      leave
  ba:   c3                      ret
Given the information above, let me explain how the ELF format is parsed, based on the data structures in the source code [6].
0x00-0x33: ELF file header
The ELF file header occupies the first 52 bytes of the object file. Its data structure is shown below:
typedef struct {
    unsigned char e_ident[EI_NIDENT]; /* File identification. */
    Elf32_Half    e_type;             /* File type. */
    Elf32_Half    e_machine;          /* Machine architecture. */
    Elf32_Word    e_version;          /* ELF format version. */
    Elf32_Addr    e_entry;            /* Entry point. */
    Elf32_Off     e_phoff;            /* Program header file offset. */
    Elf32_Off     e_shoff;            /* Section header file offset. */
    Elf32_Word    e_flags;            /* Architecture-specific flags. */
    Elf32_Half    e_ehsize;           /* Size of ELF header in bytes. */
    Elf32_Half    e_phentsize;        /* Size of program header entry. */
    Elf32_Half    e_phnum;            /* Number of program header entries. */
    Elf32_Half    e_shentsize;        /* Size of section header entry. */
    Elf32_Half    e_shnum;            /* Number of section header entries. */
    Elf32_Half    e_shstrndx;         /* Section name strings section. */
} Elf32_Ehdr;
From the hex dump of the object file, the following information can be retrieved:
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              REL (Relocatable file)
  Machine:                           Intel 80386
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          0 (bytes into file)
  Start of section headers:          352 (bytes into file)
  Flags:                             0x0
  Size of this header:               52 (bytes)
  Size of program headers:           0 (bytes)
  Number of program headers:         0
  Size of section headers:           40 (bytes)
  Number of section headers:         10
  Section header string table index: 7
The section header table starts at offset 0x160 (352). Before reaching it, I need to explain the data between 0x34 and 0x160.
0x34-0xef: Text Data
The data following the ELF header is the machine code of the function. This becomes clear when the hex dump of the object file is compared with the output of the objdump command.
0xf0-0x115: Comment Section
The compiler usually fills this section with information about itself; here it holds the GCC version string.
0x116-0x15f: Section Header String Table
This section stores the section names as NUL-terminated strings; each section header entry refers to its name by an offset into this table. How a section name is resolved is discussed next.
0x160-0x2ef: Section Header Table
The section header table contains the necessary information about each section in the ELF file. The data structure of an entry is shown below:
typedef struct {
    Elf32_Word sh_name;      /* Section name (index into the section header string table). */
    Elf32_Word sh_type;      /* Section type. */
    Elf32_Word sh_flags;     /* Section flags. */
    Elf32_Addr sh_addr;      /* Address in memory image. */
    Elf32_Off  sh_offset;    /* Offset in file. */
    Elf32_Word sh_size;      /* Size in bytes. */
    Elf32_Word sh_link;      /* Index of a related section. */
    Elf32_Word sh_info;      /* Depends on section type. */
    Elf32_Word sh_addralign; /* Alignment in bytes. */
    Elf32_Word sh_entsize;   /* Size of each entry in section. */
} Elf32_Shdr;
With readelf -S we can view the information in the section header table, as shown below.
Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        00000000 000034 0000bb 00  AX  0   0  4
  [ 2] .rel.text         REL             00000000 0003dc 000048 08      8   1  4
  [ 3] .data             PROGBITS        00000000 0000f0 000000 00  WA  0   0  4
  [ 4] .bss              NOBITS          00000000 0000f0 000000 00  WA  0   0  4
  [ 5] .comment          PROGBITS        00000000 0000f0 000026 01  MS  0   0  1
  [ 6] .note.GNU-stack   PROGBITS        00000000 000116 000000 00      0   0  1
  [ 7] .shstrtab         STRTAB          00000000 000116 000049 00      0   0  1
  [ 8] .symtab           SYMTAB          00000000 0002f0 0000c0 10      9   7  4
  [ 9] .strtab           STRTAB          00000000 0003b0 000029 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings)
  I (info), L (link order), G (group), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)
The size of a single section header entry is 0x28 bytes. Let’s use the following part of the dump, the entry starting at offset 0x1b0, to show how a section entry is parsed.
0x000001b0: 1b 00 00 00 09 00 00 00 - 00 00 00 00 00 00 00 00 [@@@I@@@@@@@@@@@
0x000001c0: dc 03 00 00 48 00 00 00 - 08 00 00 00 01 00 00 00 .C@@H@@@H@@@A@@@
0x000001d0: 04 00 00 00 08 00 00 00
The offset of the section name in the section header string table is 0x1b, so the name of the current section is the string at file offset 0x116 + 0x1b = 0x131, which is “.rel.text”.
The type of this section is 0x9. According to the macros defined in elf.h, this is SHT_REL, which means the section holds relocation entries without addends.
/* Legal values for sh_type (section type). */
#define SHT_NULL          0   /* Section header table entry unused */
#define SHT_PROGBITS      1   /* Program data */
#define SHT_SYMTAB        2   /* Symbol table */
#define SHT_STRTAB        3   /* String table */
#define SHT_RELA          4   /* Relocation entries with addends */
#define SHT_HASH          5   /* Symbol hash table */
#define SHT_DYNAMIC       6   /* Dynamic linking information */
#define SHT_NOTE          7   /* Notes */
#define SHT_NOBITS        8   /* Program space with no data (bss) */
#define SHT_REL           9   /* Relocation entries, no addends */
#define SHT_SHLIB         10  /* Reserved */
#define SHT_DYNSYM        11  /* Dynamic linker symbol table */
#define SHT_INIT_ARRAY    14  /* Array of constructors */
#define SHT_FINI_ARRAY    15  /* Array of destructors */
#define SHT_PREINIT_ARRAY 16  /* Array of pre-constructors */
#define SHT_GROUP         17  /* Section group */
#define SHT_SYMTAB_SHNDX  18  /* Extended section indices */
#define SHT_NUM           19  /* Number of defined types. */
The other fields of a section header entry are decoded in the same way.
0x2f0-0x3af: Symbol Table and 0x3b0-0x3db: Symbol String Table
As the name implies, the symbol table stores the symbol information of the ELF file. The symbol string table stores the names of the entries in the symbol table.
Symbol table '.symtab' contains 12 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000000     0 FILE    LOCAL  DEFAULT  ABS test1.c
     2: 00000000     0 SECTION LOCAL  DEFAULT    1
     3: 00000000     0 SECTION LOCAL  DEFAULT    3
     4: 00000000     0 SECTION LOCAL  DEFAULT    4
     5: 00000000     0 SECTION LOCAL  DEFAULT    6
     6: 00000000     0 SECTION LOCAL  DEFAULT    5
     7: 00000000   187 FUNC    GLOBAL DEFAULT    1 main
     8: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND malloc
     9: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND strcpy
    10: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND strtoul
    11: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND free
A symbol table entry is defined as follows:
typedef struct {
    Elf32_Word    st_name;  /* String table index of name. */
    Elf32_Addr    st_value; /* Symbol value. */
    Elf32_Word    st_size;  /* Size of associated object. */
    unsigned char st_info;  /* Type and binding information. */
    unsigned char st_other; /* Reserved (not used). */
    Elf32_Half    st_shndx; /* Section index of symbol. */
} Elf32_Sym;

#define ELF32_ST_BIND(val)        (((unsigned char) (val)) >> 4)
#define ELF32_ST_TYPE(val)        ((val) & 0xf)
#define ELF32_ST_INFO(bind, type) (((bind) << 4) + ((type) & 0xf))
0x3dc-0x423: Relocation Table
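At present, the information contained in the relocation table is not important. Each entry records an offset in .text that the linker has to patch, together with the symbol to patch it with. These offsets are exactly the positions of the fc ff ff ff placeholders that follow each call opcode (e8) in the disassembly above. Once the file is linked, those placeholders are replaced with the displacement to the target function.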
Relocation section '.rel.text' at offset 0x3dc contains 9 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
0000001c  00000802 R_386_PC32        00000000   malloc
00000038  00000902 R_386_PC32        00000000   strcpy
00000058  00000a02 R_386_PC32        00000000   strtoul
00000060  00000802 R_386_PC32        00000000   malloc
00000070  00000802 R_386_PC32        00000000   malloc
0000008c  00000902 R_386_PC32        00000000   strcpy
00000098  00000b02 R_386_PC32        00000000   free
000000a4  00000b02 R_386_PC32        00000000   free
000000b0  00000b02 R_386_PC32        00000000   free
Executable File
An executable file contains much more information than an object file. Because of the length limit of this post, I will not display the whole hex dump. Instead, I will discuss parts of it section by section.
ELF Header
As before, we first display the ELF header information of the executable file:
hex dump info:
0x00000000: 7f 45 4c 46 01 01 01 00 - 00 00 00 00 00 00 00 00 ?ELFAAA@@@@@@@@@
0x00000010: 02 00 03 00 01 00 00 00 - d0 83 04 08 34 00 00 00 B@C@A@@@..DH4@@@
0x00000020: 34 11 00 00 00 00 00 00 - 34 00 20 00 08 00 28 00 4Q@@@@@@4@ @H@(@
0x00000030: 1e 00 1b 00

ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Intel 80386
  Version:                           0x1
  Entry point address:               0x80483d0
  Start of program headers:          52 (bytes into file)
  Start of section headers:          4404 (bytes into file)
  Flags:                             0x0
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         8
  Size of section headers:           40 (bytes)
  Number of section headers:         30
  Section header string table index: 27
Section Header
Then we can parse the information in the section header table as discussed above.
Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .interp           PROGBITS        08048134 000134 000013 00   A  0   0  1
  [ 2] .note.ABI-tag     NOTE            08048148 000148 000020 00   A  0   0  4
  [ 3] .note.gnu.build-i NOTE            08048168 000168 000024 00   A  0   0  4
  [ 4] .hash             HASH            0804818c 00018c 000034 04   A  6   0  4
  [ 5] .gnu.hash         GNU_HASH        080481c0 0001c0 000020 04   A  6   0  4
  [ 6] .dynsym           DYNSYM          080481e0 0001e0 000080 10   A  7   1  4
  [ 7] .dynstr           STRTAB          08048260 000260 000060 00   A  0   0  1
  [ 8] .gnu.version      VERSYM          080482c0 0002c0 000010 02   A  6   0  2
  [ 9] .gnu.version_r    VERNEED         080482d0 0002d0 000020 00   A  7   1  4
  [10] .rel.dyn          REL             080482f0 0002f0 000008 08   A  6   0  4
  [11] .rel.plt          REL             080482f8 0002f8 000030 08   A  6  13  4
  [12] .init             PROGBITS        08048328 000328 000030 00  AX  0   0  4
  [13] .plt              PROGBITS        08048358 000358 000070 04  AX  0   0  4
  [14] .text             PROGBITS        080483d0 0003d0 00020c 00  AX  0   0 16
  [15] .fini             PROGBITS        080485dc 0005dc 00001c 00  AX  0   0  4
  [16] .rodata           PROGBITS        080485f8 0005f8 000008 00   A  0   0  4
  [17] .eh_frame         PROGBITS        08048600 000600 000004 00   A  0   0  4
  [18] .ctors            PROGBITS        08049f0c 000f0c 000008 00  WA  0   0  4
  [19] .dtors            PROGBITS        08049f14 000f14 000008 00  WA  0   0  4
  [20] .jcr              PROGBITS        08049f1c 000f1c 000004 00  WA  0   0  4
  [21] .dynamic          DYNAMIC         08049f20 000f20 0000d0 08  WA  7   0  4
  [22] .got              PROGBITS        08049ff0 000ff0 000004 04  WA  0   0  4
  [23] .got.plt          PROGBITS        08049ff4 000ff4 000024 04  WA  0   0  4
  [24] .data             PROGBITS        0804a018 001018 000008 00  WA  0   0  4
  [25] .bss              NOBITS          0804a020 001020 000008 00  WA  0   0  4
  [26] .comment          PROGBITS        00000000 001020 000025 01  MS  0   0  1
  [27] .shstrtab         STRTAB          00000000 001045 0000ee 00      0   0  1
  [28] .symtab           SYMTAB          00000000 0015e4 000440 10     29  45  4
  [29] .strtab           STRTAB          00000000 001a24 000232 00      0   0  1
Program Segment
The program header table describes the segments of the executable, that is, how groups of sections are mapped into memory. Each entry has the following structure:
typedef struct {
    Elf32_Word p_type;   /* Segment type */
    Elf32_Off  p_offset; /* Segment file offset */
    Elf32_Addr p_vaddr;  /* Segment virtual address */
    Elf32_Addr p_paddr;  /* Segment physical address */
    Elf32_Word p_filesz; /* Segment size in file */
    Elf32_Word p_memsz;  /* Segment size in memory */
    Elf32_Word p_flags;  /* Segment flags */
    Elf32_Word p_align;  /* Segment alignment */
} Elf32_Phdr;
Here, please pay attention to the permission flags of each segment (the Flg column in the output below):
R: Readable
W: Writable
E: Executable
Thus we can use this mapping to check the access permissions of a given memory address.
0x00000034: 06 00 00 00 - 34 00 00 00 34 80 04 08 ^@[@F@@@4@@@4.DH
0x00000040: 34 80 04 08 00 01 00 00 - 00 01 00 00 05 00 00 00 4.DH@A@@@A@@E@@@
0x00000050: 04 00 00 00 03 00 00 00 - 34 01 00 00 34 81 04 08 D@@@C@@@4A@@4.DH
0x00000060: 34 81 04 08 13 00 00 00 - 13 00 00 00 04 00 00 00 4.DHS@@@S@@@D@@@
0x00000070: 01 00 00 00 01 00 00 00 - 00 00 00 00 00 80 04 08 A@@@A@@@@@@@@.DH
0x00000080: 00 80 04 08 04 06 00 00 - 04 06 00 00 05 00 00 00 @.DHDF@@DF@@E@@@
0x00000090: 00 10 00 00 01 00 00 00 - 0c 0f 00 00 0c 9f 04 08 @P@@A@@@LO@@L.DH
0x000000a0: 0c 9f 04 08 14 01 00 00 - 1c 01 00 00 06 00 00 00 L.DHTA@@\A@@F@@@
0x000000b0: 00 10 00 00 02 00 00 00 - 20 0f 00 00 20 9f 04 08 @P@@B@@@ O@@ .DH
0x000000c0: 20 9f 04 08 d0 00 00 00 - d0 00 00 00 06 00 00 00  .DH.@@@.@@@F@@@
0x000000d0: 04 00 00 00 04 00 00 00 - 48 01 00 00 48 81 04 08 D@@@D@@@HA@@H.DH
0x000000e0: 48 81 04 08 44 00 00 00 - 44 00 00 00 04 00 00 00 H.DHD@@@D@@@D@@@
0x000000f0: 04 00 00 00 51 e5 74 64 - 00 00 00 00 00 00 00 00 D@@@Q.td@@@@@@@@
0x00000100: 00 00 00 00 00 00 00 00 - 00 00 00 00 06 00 00 00 @@@@@@@@@@@@F@@@
0x00000110: 04 00 00 00 52 e5 74 64 - 0c 0f 00 00 0c 9f 04 08 D@@@R.tdLO@@L.DH
0x00000120: 0c 9f 04 08 f4 00 00 00 - f4 00 00 00 04 00 00 00 L.DH.@@@.@@@D@@@
0x00000130: 01 00 00 00 A@@@

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  PHDR           0x000034 0x08048034 0x08048034 0x00100 0x00100 R E 0x4
  INTERP         0x000134 0x08048134 0x08048134 0x00013 0x00013 R   0x1
  LOAD           0x000000 0x08048000 0x08048000 0x00604 0x00604 R E 0x1000
  LOAD           0x000f0c 0x08049f0c 0x08049f0c 0x00114 0x0011c RW  0x1000
  DYNAMIC        0x000f20 0x08049f20 0x08049f20 0x000d0 0x000d0 RW  0x4
  NOTE           0x000148 0x08048148 0x08048148 0x00044 0x00044 R   0x4
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4
  GNU_RELRO      0x000f0c 0x08049f0c 0x08049f0c 0x000f4 0x000f4 R   0x1

 Section to Segment mapping:
  Segment Sections...
   00
   01     .interp
   02     .interp .note.ABI-tag .note.gnu.build-id .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame
   03     .ctors .dtors .jcr .dynamic .got .got.plt .data .bss
   04     .dynamic
   05     .note.ABI-tag .note.gnu.build-id
   06
   07     .ctors .dtors .jcr .dynamic .got
Relocation Table and Dynamic Link
Here we finally reach the key part of this post. We will discuss .rel.plt, .plt and .got.plt, and how they are used in exploit development.
0x80482f8-0x8048327: .rel.plt section
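The .rel.plt table contains one relocation entry for each function that the dynamic linker has to resolve. The offset field of each entry is the address of the .got.plt slot (0x804a000-0x804a017) that will receive the address of the function.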
0x000002f0: 00 a0 04 08 07 01 00 00 ..DHFA@@@.DHGA@@
0x00000300: 04 a0 04 08 07 02 00 00 - 08 a0 04 08 07 03 00 00 D.DHGB@@H.DHGC@@
0x00000310: 0c a0 04 08 07 04 00 00 - 10 a0 04 08 07 05 00 00 L.DHGD@@P.DHGE@@
0x00000320: 14 a0 04 08 07 06 00 00 T.DHGF@@U..S..D.

Relocation section '.rel.plt' at offset 0x2f8 contains 6 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
0804a000  00000107 R_386_JUMP_SLOT   00000000   __gmon_start__
0804a004  00000207 R_386_JUMP_SLOT   00000000   __libc_start_main
0804a008  00000307 R_386_JUMP_SLOT   00000000   free
0804a00c  00000407 R_386_JUMP_SLOT   00000000   strtoul
0804a010  00000507 R_386_JUMP_SLOT   00000000   strcpy
0804a014  00000607 R_386_JUMP_SLOT   00000000   malloc
0x8048358-0x80483c7: .plt section
The .plt section contains the stub code that is used for dynamic linking. At the assembly level, it looks as follows:
08048358 <__gmon_start__@plt-0x10>:
 8048358:   ff 35 f8 9f 04 08       pushl  0x8049ff8
 804835e:   ff 25 fc 9f 04 08       jmp    *0x8049ffc
 8048364:   00 00                   add    %al,(%eax)

08048368 <__gmon_start__@plt>:
 8048368:   ff 25 00 a0 04 08       jmp    *0x804a000
 804836e:   68 00 00 00 00          push   $0x0
 8048373:   e9 e0 ff ff ff          jmp    8048358 <_init+0x30>

08048378 <__libc_start_main@plt>:
 8048378:   ff 25 04 a0 04 08       jmp    *0x804a004
 804837e:   68 08 00 00 00          push   $0x8
 8048383:   e9 d0 ff ff ff          jmp    8048358 <_init+0x30>

08048388 <free@plt>:
 8048388:   ff 25 08 a0 04 08       jmp    *0x804a008
 804838e:   68 10 00 00 00          push   $0x10
 8048393:   e9 c0 ff ff ff          jmp    8048358 <_init+0x30>

08048398 <strtoul@plt>:
 8048398:   ff 25 0c a0 04 08       jmp    *0x804a00c
 804839e:   68 18 00 00 00          push   $0x18
 80483a3:   e9 b0 ff ff ff          jmp    8048358 <_init+0x30>

080483a8 <strcpy@plt>:
 80483a8:   ff 25 10 a0 04 08       jmp    *0x804a010
 80483ae:   68 20 00 00 00          push   $0x20
 80483b3:   e9 a0 ff ff ff          jmp    8048358 <_init+0x30>

080483b8 <malloc@plt>:
 80483b8:   ff 25 14 a0 04 08       jmp    *0x804a014
 80483be:   68 28 00 00 00          push   $0x28
 80483c3:   e9 90 ff ff ff          jmp    8048358 <_init+0x30>
0x8049ff4-0x804a017: .got.plt section
To understand how the values in .got.plt are used by the target program, let’s observe how they change during execution.
//At the beginning of the program:
(gdb) x/8x 0x804a000
0x804a000: 0x0804836e 0x00144b10 0x0804838e 0x0804839e
0x804a010: 0x080483ae 0x080483be 0x00000000 0x00000000

//After malloc is called for the first time:
//the value at 0x804a014 is changed to the address of malloc in libc
(gdb) x/8x 0x804a000
0x804a000: 0x0804836e 0x00144b10 0x0804838e 0x0804839e
0x804a010: 0x080483ae 0x001a0ae0 0x00000000 0x00000000

//At the end of the program:
//the values at 0x804a008, 0x804a00c and 0x804a010 have all changed as well.
(gdb) x/8x 0x804a000
0x804a000: 0x0804836e 0x00144b10 0x001a0a00 0x0015e310
0x804a010: 0x001a3e00 0x001a0ae0 0x00000000 0x00000000
Lazy Binding
The process above is called lazy binding: the address of a target function is resolved at run time, on its first call, instead of at load time, which improves startup performance. To explain the full process of symbol resolution, we take malloc as an example below.
//when malloc() is called for the first time:
 804849f:   e8 14 ff ff ff          call   80483b8 <malloc@plt>

//the control flow diverts to 0x80483b8, which is located in .plt
 80483b8:   ff 25 14 a0 04 08       jmp    *0x804a014
 80483be:   68 28 00 00 00          push   $0x28
 80483c3:   e9 90 ff ff ff          jmp    8048358 <_init+0x30>

//at this point it jumps to *0x804a014, which is located in .got.plt
0x804a000: 0x0804836e 0x00144b10 0x0804838e 0x0804839e
0x804a010: 0x080483ae 0x080483be 0x00000000 0x00000000

//the data at 0x804a014 is 0x80483be, thus the control flow jumps to 0x80483be
//0x28 is the byte offset of malloc's entry in .rel.plt (entry 5, 8 bytes per entry)
//then it goes into the routine that resolves the function symbol
(gdb) x/2i 0x8048358
0x8048358: pushl 0x8049ff8
0x804835e: jmp *0x8049ffc
(gdb) x/4wx 0x8049ffc
0x8049ffc: 0x00123270 0x0804836e 0x00144b10 0x001a0a00
(gdb) x/11i 0x123270
0x123270 <_dl_runtime_resolve>:    push %eax
0x123271 <_dl_runtime_resolve+1>:  push %ecx
0x123272 <_dl_runtime_resolve+2>:  push %edx
0x123273 <_dl_runtime_resolve+3>:  mov 0x10(%esp),%edx
0x123277 <_dl_runtime_resolve+7>:  mov 0xc(%esp),%eax
0x12327b <_dl_runtime_resolve+11>: call 0x11d5a0 <_dl_fixup>
0x123280 <_dl_runtime_resolve+16>: pop %edx
0x123281 <_dl_runtime_resolve+17>: mov (%esp),%ecx
0x123284 <_dl_runtime_resolve+20>: mov %eax,(%esp)
0x123287 <_dl_runtime_resolve+23>: mov 0x4(%esp),%eax
0x12328b <_dl_runtime_resolve+27>: ret $0xc
Finally, _dl_runtime_resolve calls _dl_fixup to resolve the symbol and to update the value stored at 0x804a014, and then transfers control to the resolved function. Later calls to malloc@plt jump straight to malloc.
GOT Table Hijacking
Given all the information above, we can now discuss how lazy binding is utilised in exploitation.
Recall the information given in the program header table. The .got.plt section is located in a segment that is readable and writable. So if an attacker can corrupt the data in .got.plt, the attacker can hijack the control flow the next time the program calls the corresponding function through the PLT.
To prevent an attacker from hijacking control flow by corrupting data in .got.plt, we can build the executable with the option “-Wl,-z,relro,-z,now”. This makes the dynamic linker resolve all functions at load time, after which the GOT, including .got.plt, is remapped as read-only.
Conclusion
It took a lot of time and effort to explain the details of the ELF format in this post. For a tutorial on exploitation challenges, the last part of this post would actually be enough, but I still wanted to give more details on how an ELF file is parsed. Based on this information, we may discuss more exploitation techniques in the future 🙂
Reference
[1] http://www.iecc.com/linker/
[2] http://www.xfocus.net/articles/200201/337.html
[3] http://flint.cs.yale.edu/cs422/doc/ELF_Format.pdf
[4] https://polimicg.org/pulp/git/pulp-public/eld.git/tree/
[5] http://blog.fpmurphy.com/2008/06/position-independent-executables.html
[6] http://repo.or.cz/w/glibc.git/blob/HEAD:/include/link.h