Virtual try-on is a key enabling technology for online fashion retail and digital garment visualization. It aims to realistically render a target garment on a person while preserving geometric alignment and fine texture details. Appearance flow-based approaches model deformation explicitly but often suffer from texture squeezing and boundary artifacts in challenging scenarios, such as long sleeves and tucked-in garments, especially at high resolution. In this work, we propose StageAttn-VTON (Stage-wise Attentive Virtual Try-On), an appearance flow-based framework that improves structural coherence and visual fidelity through stage-wise deformation modeling. Specifically, garment warping is decomposed into three stages (coarse alignment, local refinement, and non-target region removal), which mitigates the coupling between competing objectives, such as smooth texture preservation and accurate structural alignment. Furthermore, we introduce a self-attention module in the image synthesis stage to enhance global dependency modeling and capture long-range garment–body interactions. Experiments on VITON-HD and the upper-body subset of DressCode demonstrate that StageAttn-VTON performs consistently well against representative warping-based and diffusion-based baselines. In addition, qualitative comparisons show that the proposed method better alleviates deformation artifacts in challenging regions such as sleeves and waist areas.
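To make the stage-wise idea concrete, the following is a minimal sketch of appearance-flow warping applied in stages, assuming that each stage produces a dense per-pixel offset field and that non-target region removal can be expressed as a multiplicative mask. The abstract does not specify how the stages are implemented, so the function names (`warp_with_flow`, `stagewise_warp`), the flow convention, and the zero padding are all illustrative assumptions, not the authors' method. A real system would predict the flows with a network and warp with a tensor library rather than nested lists.

```python
import math


def warp_with_flow(image, flow):
    """Backward-warp a 2D grayscale image with a dense appearance flow.

    image: H x W nested list of floats.
    flow:  H x W nested list of (dx, dy) offsets. Output pixel (x, y)
           samples the source at (x + dx, y + dy) with bilinear
           interpolation. Samples outside the image read as 0.0
           (zero padding, an assumption made for this sketch).
    """
    h, w = len(image), len(image[0])

    def pixel(x, y):
        return image[y][x] if 0 <= x < w and 0 <= y < h else 0.0

    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            sx, sy = x + dx, y + dy
            x0, y0 = math.floor(sx), math.floor(sy)
            ax, ay = sx - x0, sy - y0  # fractional parts in [0, 1)
            top = (1 - ax) * pixel(x0, y0) + ax * pixel(x0 + 1, y0)
            bottom = (1 - ax) * pixel(x0, y0 + 1) + ax * pixel(x0 + 1, y0 + 1)
            out[y][x] = (1 - ay) * top + ay * bottom
    return out


def stagewise_warp(image, flows, keep_mask):
    """Hypothetical three-stage pipeline: warp by each flow in order
    (e.g. coarse alignment, then local refinement), then zero out
    non-target pixels with a mask (1.0 = keep, 0.0 = remove)."""
    for flow in flows:
        image = warp_with_flow(image, flow)
    return [[p * m for p, m in zip(row, mask_row)]
            for row, mask_row in zip(image, keep_mask)]
```

Separating the flows means the coarse stage can favour a smooth field while the refinement stage handles local alignment, which is the decoupling the abstract describes. The mask step here only stands in for the third stage and says nothing about how the paper identifies non-target regions.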
Li Yao
Wang Liang
Yan Wan
Applied Sciences
Donghua University
www.synapsesocial.com/papers/69d896a46c1944d70ce0838b — DOI: https://doi.org/10.3390/app16073609