Traditional deep learning-based models have achieved promising results in medical image segmentation. However, their performance degrades severely on unseen domains, because variations in imaging protocols, acquisition devices, and patient populations across medical centers cause significant distribution shifts. With the emergence of the Segment Anything Model (SAM), a single model now exhibits significantly improved generalization and adaptability to various image types. Nevertheless, although SAM has learned structural representations from large-scale natural images, it lacks the fine-grained structural knowledge specific to medical imaging, knowledge that remains relatively invariant across imaging domains. In addition, its structural enhancement is vulnerable to unreliable prompts, and patch-wise inference disrupts structural continuity, leading to suboptimal performance in capturing anatomical details. To address these issues, we propose a novel Medical Fine-grained Segment Anything Model (termed MedFineSAM), which integrates three key modules: shared fine-grained structural enhancement, which uses a structural dictionary to extract and selectively enhance the fine-grained structural features shared between prompts and image embeddings; a prompt gating mechanism, which estimates prompt confidence and dynamically adjusts prompt weights to avoid erroneous enhancement; and structural continuity diffusion in the frequency domain (SCFD), which performs frequency-domain smoothing during decoding to alleviate the structural discontinuity caused by patch aggregation. Experiments on fundus and prostate MRI benchmarks demonstrate superior generalization performance, offering new insights into leveraging SAM for single-source domain generalization in medical image segmentation.
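As a rough illustration of what frequency-domain smoothing of a decoded map can look like, the sketch below applies a Gaussian low-pass filter to a 2-D logit map whose two halves were predicted as separate patches. It is not the SCFD module: the abstract does not specify the filter, so the Gaussian transfer function, the function name `gaussian_lowpass_smooth`, and the parameter `sigma` are all assumptions made for illustration.

```python
import numpy as np


def gaussian_lowpass_smooth(logits: np.ndarray, sigma: float = 0.05) -> np.ndarray:
    """Attenuate high spatial frequencies of a 2-D map.

    Hypothetical helper, not the paper's SCFD module.
    `sigma` is the filter width in cycles per pixel (assumed parameter).
    """
    h, w = logits.shape
    # Frequency coordinates in cycles per pixel, laid out to match fft2.
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    # Gaussian transfer function: 1 at zero frequency, decaying with |f|.
    transfer = np.exp(-(fx ** 2 + fy ** 2) / (2.0 * sigma ** 2))
    spectrum = np.fft.fft2(logits)
    # The imaginary part is numerical noise for real input and a symmetric filter.
    return np.fft.ifft2(spectrum * transfer).real


# Toy map with a hard seam between two "patches" at column 16.
logits = np.zeros((32, 32))
logits[:, 16:] = 1.0
smoothed = gaussian_lowpass_smooth(logits, sigma=0.05)
seam_jump_before = abs(logits[8, 16] - logits[8, 15])
seam_jump_after = abs(smoothed[8, 16] - smoothed[8, 15])
```

Because the filter leaves the zero-frequency component untouched, the mean of the map is preserved while the jump across the patch seam is reduced. Note that the FFT treats the map as periodic, so the image border behaves like a second seam in this toy setting.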
Ba et al. studied this question.