Generating Precise Error Specifications for C: A Zero Shot Learning Approach
In C programs, error specifications, which specify the value range that each function returns to indicate failures, are widely used to check and propagate errors for the sake of reliability and security. Various kinds of C analyzers employ error specifications for different purposes, e.g., to detect error handling bugs, yet a general approach for generating precise specifications is still missing. This limits the applicability of those tools.
In this paper, we solve this problem by developing a machine learning-based approach named MLPEx. It generates error specifications by analyzing only the source code, and is thus general. We propose a novel machine learning paradigm based on transfer learning, enabling MLPEx to require only one-time minimal data labeling from us (as the tool developers) and zero manual labeling efforts from users. To improve the accuracy of generated error specifications, MLPEx extracts and exploits project-specific information. We evaluate MLPEx on 10 projects, including 6 libraries and 4 applications. An investigation of 3,443 functions and 17,750 paths reveals that MLPEx generates error specifications with a precision of 91% and a recall of 94%, significantly higher than those of state-of-the-art approaches. To further demonstrate the usefulness of the generated error specifications, we use them to detect 57 bugs in 5 tested projects.
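To make the notion of an error specification concrete, the following is a minimal illustrative sketch in C. It is not taken from the paper or from MLPEx; the functions `copy_name`, `find_suffix`, and `store_suffix` are hypothetical, and the specifications in the comments are written by hand to show the kind of value range such a specification describes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical function: returns the number of bytes copied (>= 0) on
 * success and -1 on failure.
 * Hand-written error specification: return value < 0. */
int copy_name(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL)
        return -1;                  /* error path: invalid argument */
    size_t len = strlen(src);
    if (len + 1 > dst_size)
        return -1;                  /* error path: buffer too small */
    memcpy(dst, src, len + 1);      /* copies the terminating '\0' too */
    return (int)len;
}

/* Hypothetical function: returns a pointer to the last '.' in path,
 * or NULL if there is none.
 * Hand-written error specification: return value == NULL. */
const char *find_suffix(const char *path)
{
    if (path == NULL)
        return NULL;
    return strrchr(path, '.');
}

/* A caller that checks each callee against its error specification and
 * propagates the failure. Returns 0 on success.
 * Hand-written error specification: return value < 0. */
int store_suffix(char *dst, size_t dst_size, const char *path)
{
    const char *suffix = find_suffix(path);
    if (suffix == NULL)             /* matches find_suffix's spec */
        return -1;
    if (copy_name(dst, dst_size, suffix) < 0)   /* matches copy_name's spec */
        return -1;
    return 0;
}
```

An analyzer that knows these specifications could flag a caller that uses the result of `find_suffix` without comparing it against `NULL`, or that ignores a negative return from `copy_name`. That is the kind of error handling bug the abstract refers to.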
Wed 23 Oct
14:00 - 15:30
14:00 (22m, Talk) Duet: An Expressive Higher-Order Language and Linear Type System for Statically Enforcing Differential Privacy [OOPSLA]. Joseph P. Near (University of Vermont), David Darais (University of Vermont), Chike Abuah (University of Vermont), Tim Stevens (University of Vermont), Pranav Gaddamadugu (University of California, Berkeley), Lun Wang (University of California, Berkeley), Neel Somani (University of California, Berkeley), Mu Zhang (University of Utah), Nikhil Sharma (University of California, Berkeley), Alex Shan (University of California, Berkeley), Dawn Song (University of California, Berkeley)
14:22 (22m, Talk) Improving Bug Detection via Context-Based Code Representation Learning and Attention-Based Neural Networks [OOPSLA]. Yi Li (New Jersey Institute of Technology, USA), Shaohua Wang (New Jersey Institute of Technology, USA), Tien N. Nguyen (University of Texas at Dallas), Son Nguyen (The University of Texas at Dallas)
14:45 (22m, Talk) Probabilistic Verification of Fairness Properties via Concentration [OOPSLA]. Osbert Bastani (University of Pennsylvania), Xin Zhang (Massachusetts Institute of Technology), Armando Solar-Lezama (Massachusetts Institute of Technology)
15:07 (22m, Talk) Generating Precise Error Specifications for C: A Zero Shot Learning Approach [OOPSLA]. Baijun Wu (University of Louisiana at Lafayette), John Peter Campora (University of Louisiana at Lafayette), He Yi (University of Louisiana at Lafayette), Alexander Schlecht (University of Louisiana at Lafayette), Sheng Chen (University of Louisiana at Lafayette)