Motivated by existence of software projects that undergo regular refactorings and modifications and yet need to ensure semantic stability of some of their core parts, we propose a highly-scalable approach for automatically checking semantic equivalence of different versions of large-scale, real-world C projects, with a particular (though not exclusive) focus on the Linux kernel. The proposed method uses a novel combination of pattern matching with light-weight static analysis and control-flow transformations. The method checks preservation of the semantics of functions forming the API of the project being analysed as well as of the semantics of its global variables, which typically hold various control parameters. For the latter, a specialised slicing procedure is proposed to slice out code influenced by these variables and concentrate the analysis on that code only. Although the method cannot prove equivalence on heavily refactored code, it can compare thousands of functions in the order of minutes while producing a low number of false non-equality verdicts as our experiments show. The method has been implemented over the LLVM infrastructure in a tool called DiffKemp . Our results show that DiffKemp , unlike other existing tools, gives practically useful results even on projects of the size of the Linux kernel.
Building similarity graph...
Analyzing shared references across papers
Loading...
Viktor Malík
Tomáš Vojnar
František Nečas
ACM Transactions on Software Engineering and Methodology
Masaryk University
Brno University of Technology
Red Hat (United States)
Building similarity graph...
Analyzing shared references across papers
Loading...
Malík et al. (Mon,) studied this question.
www.synapsesocial.com/papers/69ba428e4e9516ffd37a2f5c — DOI: https://doi.org/10.1145/3801958