Precise 3D delineation of abdominal organs and tumor lesions from Computed Tomography (CT) imaging is a cornerstone of modern computer-assisted diagnostic workflows and substantially reduces the manual annotation burden on radiologists. Although Convolutional Neural Networks (CNNs) excel at capturing local textural detail, their limited receptive fields hinder the modeling of long-range anatomical dependencies, which are crucial for understanding complex organ relationships. Transformer-based architectures offer stronger global context modeling, but their quadratic computational complexity, O(N^2) in the number of tokens N, is a substantial barrier to processing high-resolution volumetric data and often demands expensive hardware. Recently, Structured State Space Models (SSMs), exemplified by Mamba, have emerged as alternatives with linear complexity, O(N). Nevertheless, current hybrid frameworks typically employ simplistic fusion mechanisms, resulting in a phenomenon we identify as semantic misalignment, in which background noise from the global branch obscures the faint signals of small lesions. To overcome this bottleneck, we introduce Micro-Mamba, an architecture designed to combine local and global features efficiently without incurring prohibitive computational cost. The core of our approach is the Dual-Branch Gating mechanism positioned at the network bottleneck. This module uses global context derived from the Mamba branch to generate dynamic gating signals, which adaptively modulate the high-frequency features extracted by the parallel CNN branch. Acting as a semantic filter, it suppresses irrelevant background noise while accentuating target boundaries. Extensive evaluations on the public MSD Liver (Task03) and Pancreas (Task07) datasets validate the effectiveness of our method. 
Micro-Mamba achieves state-of-the-art (SOTA) results on liver segmentation, with a Mean Dice of 82.88% and robust boundary delineation (HD95 6.12 mm), surpassing existing baselines. Crucially, stress tests against the latest 2025 Mamba models (e.g., RMA-Mamba) under resource-constrained settings show that Micro-Mamba achieves substantially better boundary precision (Mean HD95 22.4 mm vs. 56.4 mm) and markedly higher stability (Tumor Dice Std 0.90 vs. 24.04). Moreover, it exhibits strong zero-shot generalization on the demanding pancreas task (Mean Dice 67.1%). Owing to the linear complexity of Mamba blocks, our model is fully trainable on a single consumer-grade GPU (NVIDIA RTX 4090), which eases deployment in clinical settings.
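To make the gating idea concrete, the following is a minimal NumPy sketch of how global context might modulate local features. It is not the Micro-Mamba implementation: the abstract does not specify the form of the gate, so the channel-wise sigmoid gate, the global average pooling step, the function name `dual_branch_gate`, and the parameters `weight` and `bias` are all illustrative assumptions.

```python
import numpy as np


def sigmoid(x):
    """Logistic function mapping real values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def dual_branch_gate(local_feat, global_feat, weight, bias):
    """Hypothetical gate: global context rescales local (CNN) features.

    local_feat, global_feat: arrays of shape (C, D, H, W)
    weight: (C, C) projection, bias: (C,)  -- assumed learnable parameters
    """
    # Assumption: summarize the global (Mamba) branch per channel
    # with global average pooling over the spatial axes.
    context = global_feat.mean(axis=(1, 2, 3))      # shape (C,)
    # Assumption: a linear projection followed by a sigmoid yields
    # one gating value in (0, 1) per channel.
    gate = sigmoid(weight @ context + bias)         # shape (C,)
    # Broadcast the gate over the volume to modulate the local features.
    gated = local_feat * gate[:, None, None, None]
    return gated, gate


# Toy example with random features standing in for the two branches.
rng = np.random.default_rng(0)
C, D, H, W = 4, 2, 3, 3
local_feat = rng.standard_normal((C, D, H, W))
global_feat = rng.standard_normal((C, D, H, W))
weight = rng.standard_normal((C, C)) * 0.1
bias = np.zeros(C)

gated, gate = dual_branch_gate(local_feat, global_feat, weight, bias)
print(gated.shape, gate.shape)
```

Because each gate value lies strictly between 0 and 1, this particular form can only attenuate a channel, never amplify it. A spatial gate, or one with a residual connection, would behave differently, and the abstract alone does not say which variant the authors use.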
Feng et al. (Sat,) studied this question.