The Linux kernel is a highly complex software system that offers over 14,000 configuration options. As exhaustive testing of the Linux kernel is infeasible, automated integration testing systems, such as Intel’s Linux Kernel Performance (LKP) system, utilize samples (i.e., sets of configurations) for testing. In this thesis, we develop tooling to acquire the first comprehensive LKP build test dataset, comprising over 180,000 test results and configurations from a seven-year timeframe. Additionally, we automatically reproduce over 15,000 LKP tests and compile over 4,000 kernel configurations. We find that 90% of tests are irreproducible due to an inconsistent build process and inaccessible data. Furthermore, we observe that different configuration types used by LKP vary in required testing effort and their effectiveness in finding defects: 45% of tested configurations find 91% of defects. Based on our results, we offer recommendations to improve LKP and similar testing systems.
Christopher Rau studied this question.