Understanding the temporal order of driver gene mutations is essential for modeling cancer progression and improving diagnostics. In colorectal cancer, tumor development is a multistep evolutionary process, yet the order in which mutations arise varies across samples. This thesis proposes a graph-based approach for identifying mutation order relationships from binary mutation data. Directed edges are established based on statistical support calculated across 200 bootstrap datasets. Cycles are then removed using several support-based strategies to obtain a directed acyclic graph (DAG), whose longest path is taken as the most likely mutation sequence. An alternative correlation-based graph is generated for comparison. The inferred mutation orders are evaluated against those from other approaches: mutation frequency, an established order-score method, the densest subgraph, correlation-based orders, and generative AI (ChatGPT). Results show that the bootstrap-based DAG method captures meaningful mutation relationships while accommodating diverse tumor evolutionary patterns, providing a robust framework for studying colorectal cancer progression.
Ekene Okeke studied this question.
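The pipeline summarized above (bootstrap edge support, cycle removal, longest path) can be illustrated with a minimal Python sketch. It is not the thesis implementation: the abstract does not state how edge direction is decided or which cycle-removal strategies are used, so the direction rule (gene a precedes gene b when a is more frequent and most b-mutated samples also carry a), the thresholds `min_cond` and `min_support`, the greedy cycle removal, all function names, and the toy data are assumptions made for illustration only.

```python
import random
from itertools import permutations


def edge_support(samples, genes, n_boot=200, min_cond=0.8, seed=0):
    """Fraction of bootstrap datasets in which the edge a -> b is called."""
    rng = random.Random(seed)
    counts = {pair: 0 for pair in permutations(genes, 2)}
    for _ in range(n_boot):
        boot = rng.choices(samples, k=len(samples))  # resample with replacement
        freq = {g: sum(s[g] for s in boot) for g in genes}
        for a, b in counts:
            both = sum(s[a] * s[b] for s in boot)
            # Assumed rule (not from the thesis): a precedes b if a is more
            # frequent and most b-mutated samples also carry a.
            if freq[a] > freq[b] > 0 and both / freq[b] >= min_cond:
                counts[(a, b)] += 1
    return {pair: c / n_boot for pair, c in counts.items()}


def reachable(adj, src, dst):
    """Depth-first search: is dst reachable from src in the current graph?"""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(adj.get(node, ()))
    return False


def build_dag(support, min_support=0.5):
    """Assumed strategy: add edges by descending support, skipping any edge
    that would close a cycle."""
    adj = {}
    for (a, b), s in sorted(support.items(), key=lambda kv: -kv[1]):
        if s >= min_support and not reachable(adj, b, a):
            adj.setdefault(a, []).append(b)
    return adj


def longest_path(adj, genes):
    """Longest path in a DAG via memoized recursion over successors."""
    memo = {}

    def best_from(node):
        if node not in memo:
            tails = [best_from(nxt) for nxt in adj.get(node, [])]
            memo[node] = [node] + max(tails, key=len, default=[])
        return memo[node]

    return max((best_from(g) for g in genes), key=len)


# Hypothetical toy data: rows are samples, 1 means the gene is mutated.
genes = ["APC", "KRAS", "TP53"]
samples = (
    [{"APC": 1, "KRAS": 1, "TP53": 1}] * 3
    + [{"APC": 1, "KRAS": 1, "TP53": 0}] * 3
    + [{"APC": 1, "KRAS": 0, "TP53": 0}] * 4
    + [{"APC": 0, "KRAS": 0, "TP53": 0}] * 2
)

support = edge_support(samples, genes)
dag = build_dag(support)
order = longest_path(dag, genes)
print(order)
```

Because the toy mutations are strictly nested (every TP53-mutated sample also carries KRAS, and every KRAS-mutated sample also carries APC), reverse edges never gain support and the longest path recovers APC, KRAS, TP53. Real data would be noisier, which is where the choice of support threshold and cycle-removal strategy starts to matter.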