What question did this study set out to answer?

The aim is to improve the visual quality and structural alignment of garments in virtual try-on settings.

April 10, 2026 · Open Access

StageAttn-VTON: Stage-Wise Flow Deformation with Attention for High-Resolution Virtual Try-On

Key Points

Aims to improve the visual quality and structural alignment of garments in virtual try-on.
Decomposes garment warping into three stages: coarse alignment, local refinement, and non-target region removal.
Adds a self-attention module in the image synthesis stage to model global dependencies and long-range garment–body interactions.
Evaluates on VITON-HD and the upper-body subset of DressCode.
Achieves consistently strong performance against representative warping-based and diffusion-based baselines.
In qualitative comparisons, better alleviates deformation artifacts in challenging regions such as sleeves and waist areas.
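To make the self-attention key point concrete, here is a minimal NumPy sketch of single-head scaled dot-product self-attention applied to a flattened feature map. It is only an illustration of the general technique: the feature-map size, the random projection matrices `w_q`, `w_k`, `w_v`, and the function name `self_attention` are all assumptions, not details taken from the paper.

```python
import numpy as np

def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over (N, D) tokens.

    Returns the attended tokens and the (N, N) attention weights.
    """
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Subtract the row-wise max before exponentiating for numerical stability.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Hypothetical feature map (height 8, width 6, 16 channels); values are random.
rng = np.random.default_rng(0)
feature_map = rng.normal(size=(8, 6, 16))

# Flatten spatial positions into tokens so every position can attend to every other.
tokens = feature_map.reshape(-1, 16)
w_q, w_k, w_v = (rng.normal(size=(16, 16)) * 0.1 for _ in range(3))

out, weights = self_attention(tokens, w_q, w_k, w_v)
attended = out.reshape(8, 6, 16)
```

Because each output position is a weighted sum over all positions, a pixel on a sleeve can draw on features from the torso or waist, which is the kind of long-range dependency a purely local convolution does not capture directly.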

Abstract

Virtual try-on is a key enabling technology for online fashion retail and digital garment visualization. It aims to realistically render a target garment on a person while preserving geometric alignment and fine texture details. Appearance flow-based approaches provide explicit deformation modeling but often suffer from texture squeezing and boundary artifacts in challenging scenarios, such as long sleeves and tucked-in garments, especially under high-resolution settings. In this work, we propose StageAttn-VTON (Stage-wise Attentive Virtual Try-On), an appearance flow-based framework that improves structural coherence and visual fidelity through stage-wise deformation modeling. Specifically, garment warping is decomposed into three stages—coarse alignment, local refinement, and non-target region removal—which mitigates the coupling between competing objectives, such as smooth texture preservation and accurate structural alignment. Furthermore, we introduce a self-attention module in the image synthesis stage to enhance global dependency modeling and capture long-range garment–body interactions. Experiments on VITON-HD and the upper-body subset of DressCode demonstrate that StageAttn-VTON achieves consistently strong performance against representative warping-based and diffusion-based baselines. In addition, qualitative comparisons show that the proposed method better alleviates deformation artifacts in challenging regions such as sleeves and waist areas.
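The stage-wise warping described in the abstract can be sketched roughly as follows. This is a toy NumPy illustration under stated assumptions, not the authors' implementation: the flows are given as per-pixel offsets rather than predicted by a network, the third stage is approximated by multiplying with a binary keep mask, and the names `warp_with_flow`, `stagewise_warp`, `coarse_flow`, `residual_flow`, and `keep_mask` are hypothetical.

```python
import numpy as np

def warp_with_flow(image, flow):
    """Backward-warp `image` (H, W, C) with an offset field `flow` (H, W, 2).

    flow[y, x] = (dx, dy): output pixel (x, y) samples the source at
    (x + dx, y + dy) with bilinear interpolation; coordinates are clamped
    to the image border.
    """
    h, w = image.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_x = np.clip(xs + flow[..., 0], 0, w - 1)
    src_y = np.clip(ys + flow[..., 1], 0, h - 1)
    x0 = np.floor(src_x).astype(int)
    y0 = np.floor(src_y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = (src_x - x0)[..., None]
    wy = (src_y - y0)[..., None]
    top = image[y0, x0] * (1 - wx) + image[y0, x1] * wx
    bottom = image[y1, x0] * (1 - wx) + image[y1, x1] * wx
    return top * (1 - wy) + bottom * wy

def stagewise_warp(garment, coarse_flow, residual_flow, keep_mask):
    """Apply the three stages in sequence (illustrative only)."""
    coarse = warp_with_flow(garment, coarse_flow)      # stage 1: coarse alignment
    refined = warp_with_flow(coarse, residual_flow)    # stage 2: local refinement
    return refined * keep_mask[..., None]              # stage 3: drop non-target regions

# Tiny 4x4 single-channel "garment" with made-up flows and mask.
garment = np.arange(16, dtype=float).reshape(4, 4, 1)
shift = np.zeros((4, 4, 2))
shift[..., 0] = 1.0                  # each output pixel samples one pixel to its right
no_motion = np.zeros((4, 4, 2))
keep_mask = np.ones((4, 4))
keep_mask[:, 0] = 0.0                # pretend the first column is a non-target region

result = stagewise_warp(garment, shift, no_motion, keep_mask)
```

Keeping the stages as separate functions mirrors the motivation given in the abstract: the coarse flow can focus on overall structural alignment, while the residual flow only has to make small local corrections, so the two objectives are not forced into a single deformation field.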

Authors

Li Yao

Wang Liang

Yan Wan

Journals

Applied Sciences

Institutions

Donghua University
