March 18, 2026

Automatically Checking Semantic Equivalence between Versions of Large-Scale C Projects

Key Points

Goal: develop a scalable method for checking semantic equivalence between versions of large C projects, especially the Linux kernel.
The method combines pattern matching with lightweight static analysis and control-flow transformations.
It checks that the semantics of the project's API functions and of its global variables are preserved.
A specialised slicing procedure restricts the analysis to the code influenced by those global variables.
The method is implemented over the LLVM infrastructure in a tool called DiffKemp.
It compares thousands of functions in the order of minutes with a low number of false non-equality verdicts, although it cannot prove equivalence on heavily refactored code.
Unlike other existing tools, DiffKemp gives practically useful results even on projects the size of the Linux kernel.

Abstract

Motivated by the existence of software projects that undergo regular refactorings and modifications and yet need to ensure semantic stability of some of their core parts, we propose a highly scalable approach for automatically checking semantic equivalence of different versions of large-scale, real-world C projects, with a particular (though not exclusive) focus on the Linux kernel. The proposed method uses a novel combination of pattern matching with lightweight static analysis and control-flow transformations. The method checks preservation of the semantics of functions forming the API of the project being analysed as well as of the semantics of its global variables, which typically hold various control parameters. For the latter, a specialised slicing procedure is proposed to slice out code influenced by these variables and concentrate the analysis on that code only. Although the method cannot prove equivalence on heavily refactored code, it can compare thousands of functions in the order of minutes while producing a low number of false non-equality verdicts, as our experiments show. The method has been implemented over the LLVM infrastructure in a tool called DiffKemp. Our results show that DiffKemp, unlike other existing tools, gives practically useful results even on projects of the size of the Linux kernel.
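
The slicing idea in the abstract can be illustrated with a small sketch. This is not DiffKemp's implementation, which is built over LLVM. It is a toy forward slice over a hand-written dependency graph, and the function name `forward_slice`, the statement labels, and the global `max_retries` are all hypothetical. The sketch only shows how an analysis can be narrowed to the statements influenced by one global variable.

```python
from collections import deque


def forward_slice(deps, seeds):
    """Return every statement transitively influenced by the seed statements.

    `deps` maps a statement id to the statement ids that directly depend on
    it (through data flow or control flow). This is a toy model, assumed
    here only for illustration.
    """
    influenced = set(seeds)
    worklist = deque(seeds)
    while worklist:
        stmt = worklist.popleft()
        for succ in deps.get(stmt, ()):
            if succ not in influenced:
                influenced.add(succ)
                worklist.append(succ)
    return influenced


# Hypothetical function body, one label per statement:
#   s1: limit = max_retries    (reads the global `max_retries`)
#   s2: log("start")           (independent of the global)
#   s3: if (count > limit)     (uses `limit`)
#   s4:     return -1          (controlled by s3)
#   s5: counter++              (independent of the global)
deps = {
    "s1": ["s3"],
    "s3": ["s4"],
}

slice_for_global = forward_slice(deps, ["s1"])
print(sorted(slice_for_global))  # ['s1', 's3', 's4']
```

In this toy model, only `s1`, `s3`, and `s4` would need to be compared between two versions of the function when the question is whether the meaning of `max_retries` has changed; `s2` and `s5` fall outside the slice.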

Authors

Viktor Malík

Tomáš Vojnar

František Nečas

Journals

ACM Transactions on Software Engineering and Methodology

Institutions

Masaryk University

Brno University of Technology

Red Hat (United States)