This paper presents a comprehensive, multi-metric evaluation of four sorting algorithmsrecursive merge sort, hybrid mergeinsertion sort with fixed and dynamic thresholds, and Pythons built-in Timsortas well as two binary search tree implementationsclassic unbalanced BST and self-balancing AVL treein a pure Python environment. Benchmarking spans four input distributions (random, sorted, reverse-sorted, and 25% perturbed) and four scales from103to106elements, jointly measuring wall-clock latency, memory fragmentation, and CPU/DRAM energy. The fixed-threshold hybrid (K=22) consistently outperforms pure merge by 1218%, while a dynamicnthreshold tracks within 510% without manual tuning. Instrumenting the AVL tree to count rotations yields an empirical slope of2.13n, and confirms stableO(logn)behavior even when worst-case ordered inputs drive an unbalanced BST to heightO(n)and quadratic total cost. Memory fragmentation drops by about 30% in the hybrid methods compared to40%for merge, with energy savings of 1520%; Timsort attains the lowest fragmentation and energy footprint among the methods evaluated. An automated Python-based benchmarking framework provides a reproducible blueprint for high-level algorithm analysis under realistic workloads.
Zihan Yang (Tue,) studied this question.