Digital stylization of Dunhuang murals can support cultural heritage revitalization by transferring their distinctive aesthetics to modern images, but existing methods face practical limitations. Transformer-based models can yield high visual quality, but often at a prohibitive computational cost. Standard state space models (SSMs) are more efficient, but when processing the complex textures of historical murals they tend to suffer from semantic loss, inconsistent stylization, and an undesired coupling between color and structure. To address these issues, we propose Dh-Mamba, a hierarchical visual Mamba framework tailored for high-fidelity Dunhuang mural style transfer. Dh-Mamba introduces a CrossMamba in-context style injection mechanism that prefixes the style token sequence to the content sequence, so that the style acts as a persistent memory and propagates consistently across the whole image while the model retains linear-time efficiency. We also design two additional components: a Modulated Style Perception Module (Δt) and an Orthogonal Decoupled HSV Modulator. The former adaptively regulates texture injection based on style complexity. The latter models mineral pigment palettes and mitigates oxidation-related artifacts by disentangling hue, saturation, and value. Experiments on a custom Dunhuang dataset show that Dh-Mamba improves content preservation and produces more natural mural textures than recent state-of-the-art methods; multiple quantitative metrics corroborate these gains. With 20.04 million parameters, Dh-Mamba offers a lightweight solution suited to resource-constrained terminal applications in cultural heritage preservation.
Qin et al. (Thu,) studied this question.
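The idea of prefixing style tokens to the content sequence can be illustrated with a minimal sketch. It assumes a toy diagonal linear recurrence in place of the actual selective scan used in Mamba, and it is not the Dh-Mamba implementation. The names `ssm_scan` and `stylize_with_prefix`, as well as all parameter values, are hypothetical and chosen only for illustration. The sketch shows that scanning the concatenated sequence once carries the style through the hidden state into every content position, at a cost linear in the total number of tokens.

```python
from typing import List, Optional, Tuple

Vector = List[float]


def ssm_scan(
    tokens: List[Vector],
    decay: Vector,
    b: Vector,
    c: Vector,
    state: Optional[Vector] = None,
) -> Tuple[List[Vector], Vector]:
    """Toy diagonal recurrence (an assumption, not Mamba's selective scan):
    h_t = decay * h_{t-1} + b * x_t,  y_t = c * h_t, applied per channel."""
    dim = len(decay)
    h = list(state) if state is not None else [0.0] * dim
    outputs = []
    for x in tokens:  # one pass over the sequence: linear time in its length
        h = [decay[i] * h[i] + b[i] * x[i] for i in range(dim)]
        outputs.append([c[i] * h[i] for i in range(dim)])
    return outputs, h


def stylize_with_prefix(
    style_tokens: List[Vector],
    content_tokens: List[Vector],
    decay: Vector,
    b: Vector,
    c: Vector,
) -> List[Vector]:
    """Prefix the style sequence to the content sequence, scan once,
    and keep only the outputs at the content positions."""
    outputs, _ = ssm_scan(style_tokens + content_tokens, decay, b, c)
    return outputs[len(style_tokens):]


# Illustrative values only.
decay, b, c = [0.5, 0.9], [1.0, 1.0], [1.0, 1.0]
style = [[1.0, 0.0], [0.0, 2.0]]
content = [[1.0, 1.0], [0.0, 0.0]]

stylized = stylize_with_prefix(style, content, decay, b, c)
```

In this toy setting, scanning the prefixed sequence is equivalent to scanning the style tokens first and then using the resulting hidden state as the initial state for the content scan, which is one way to read the "persistent memory" description: the style is summarized in the state and is visible to every later content token.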