Introduction
This paper observes that, in addition to runtime status, the non-code descriptions in CVE reports and Linux git logs can help a fuzzer avoid unnecessary runs, which saves substantial time during fuzzing. In particular, the approach uses semantics-based techniques (e.g., NLP) to automatically analyse these descriptions and extract the information needed to guide the fuzzer.
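To make the idea of extracting fuzzer-relevant facts from a description concrete, here is a minimal sketch. It is not the paper's method: it swaps the NLP parse-tree analysis for simple regular expressions, and the `CveInfo` container, the `KNOWN_SYSCALLS` vocabulary, the `extract_cve_info` helper, and the example description are all hypothetical names and data introduced only for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class CveInfo:
    """Hypothetical container for fields pulled from a description."""
    functions: List[str] = field(default_factory=list)
    versions: List[str] = field(default_factory=list)
    syscalls: List[str] = field(default_factory=list)


# Assumed, deliberately tiny vocabulary of system call names.
KNOWN_SYSCALLS = {"keyctl", "sendmsg", "setsockopt", "ioctl", "mount"}


def extract_cve_info(description: str) -> CveInfo:
    """Regex stand-in for semantic extraction (not a parse-tree analysis)."""
    # Identifiers that directly precede the word "function".
    functions = re.findall(
        r"\b([A-Za-z_][A-Za-z0-9_]*) function\b", description
    )
    # Version numbers introduced by "before" or "through".
    versions = re.findall(
        r"\b(?:before|through)\s+(\d+(?:\.\d+)+)", description
    )
    # Any token that appears in the assumed system call vocabulary.
    tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", description)
    syscalls = sorted({t for t in tokens if t in KNOWN_SYSCALLS})
    return CveInfo(functions, versions, syscalls)


# Made-up description written in the usual CVE style (not a real CVE entry).
EXAMPLE = (
    "The foo_join function in fs/foo.c in the Linux kernel before 4.4.1 "
    "mishandles object references, which allows local users to cause a "
    "denial of service via crafted keyctl commands."
)

info = extract_cve_info(EXAMPLE)
print(info)
```

A pattern-based extractor like this breaks as soon as the wording varies, which is one motivation for analysing the sentence structure with a parse tree instead.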
Semantic Information Retrieving
Generating parse tree
Retrieving affected version
Retrieving vulnerability type
Retrieving vulnerability function
Retrieving critical variables
Retrieving system call
Fuzzing Strategy
Coarse-grained Mutation
Fine-grained Mutation
Consider a vulnerable function f ∈ VUL with an entry point e, and suppose the patched code of f lies in the set of basic blocks PATCH = {p1, p2, …, pn}. Let KCOVB(s) be the set of basic blocks covered by a system call sequence s. For every b ∈ KCOVB(s), we define its priority prio0 with respect to a patched block p ∈ PATCH as:
Performance Evaluation
CVE | SemFuzz | Syzkaller | Found by | POC |
---|---|---|---|---|
CVE-2015-0275 | 3.31h | >48h | Xiong Zhou | X |
CVE-2015-1333 | 8.17h | 37.26h | Tyler Hicks | X |
CVE-2015-5706 | 0.10h | >48h | Al Viro | X |
CVE-2015-6937 | 11.64h | >48h | Vendor | X |
CVE-2015-7872 | 11.32h | 27.61h | Syzkaller | POC |
CVE-2015-7990 | 4.72h | 21.54h | Sasha | X |
CVE-2016-0728 | 6.97h | 42.81h | PPT | POC |
CVE-2016-10147 | 31.96h | >48h | Vladis Dronov | POC |
CVE-2016-3134 | 29.35h | >48h | Hawkes | POC |
CVE-2016-3841 | 9.44h | >48h | Syzkaller | POC |
CVE-2016-4482 | 0.04h | >48h | Kangjie Lu | X |
CVE-2016-4794 | 5.51h | 26.84h | Syzkaller | POC |
CVE-2016-6213 | 16.53h | >48h | CAI Qian | X |
CVE-2016-8646 | 38.29h | >48h | Igor Redko | X |
CVE-2016-9793 | 17.05h | >48h | Syzkaller | POC |
CVE-2016-9794 | 23.16h | >48h | Baozeng Ding | X |
CVE-2017-6074 | 10.91h | 39.12h | Syzkaller | POC |
CVE-2017-6347 | 7.76h | 41.83h | Syzkaller | POC |