What question did this study set out to answer?

The aim is to enhance the quantification of multicomponent textile blends using near-infrared spectroscopy by developing a chemically guided Vision Transformer model.

May 8, 2026

A chemically guided Vision Transformer for interpretable near-infrared spectral analysis of multicomponent textile blends

Key Points

The study aims to improve near-infrared (NIR) quantification of multicomponent textile blends with a chemically guided Vision Transformer.
The proposed model (CG-ViT) builds chemical information, such as characteristic absorption peaks and functional group signatures, into its attention mechanism.
The model was evaluated on 533 cotton–polyester–spandex samples, each measured at five independent positions, for 2665 spectra in total.
Ablation experiments verified the contribution of each module, and attention analysis showed that model focus corresponds to chemically meaningful spectral features.
On the test set, CG-ViT achieved an overall RMSE of 0.7675 and R² of 0.999, outperforming PLSR, CNN, ANN, and standard 1D-ViT baselines.
Component-wise RMSEs were 0.9443 for cotton, 0.8618 for polyester, and 0.3644 for spandex, all with R² ⩾ 0.998.
Accuracy remained stable (R² ⩾ 0.97) under cross-validation, Gaussian noise, and baseline drift, indicating resilience to typical instrumental variations.
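The RMSE and R² figures quoted above follow standard definitions, and a minimal sketch of how such component-wise and overall scores could be computed is given here. The composition values are hypothetical placeholders, not data from the study, and pooling all components into one vector for the overall score is an assumption, since the summary does not state how the overall RMSE was aggregated.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical compositions in percent (cotton, polyester, spandex).
# These are illustrative placeholders, NOT measurements from the study.
y_true = [(60.0, 35.0, 5.0), (50.0, 45.0, 5.0), (95.0, 0.0, 5.0), (30.0, 60.0, 10.0)]
y_pred = [(61.0, 34.0, 5.0), (49.0, 46.0, 5.0), (94.0, 1.0, 5.0), (31.0, 59.0, 10.0)]

components = ["cotton", "polyester", "spandex"]

# Component-wise scores: one (RMSE, R²) pair per fiber type.
per_component = {}
for j, name in enumerate(components):
    t = [row[j] for row in y_true]
    p = [row[j] for row in y_pred]
    per_component[name] = (rmse(t, p), r2(t, p))

# Overall score, ASSUMING all components are pooled into a single vector.
flat_true = [v for row in y_true for v in row]
flat_pred = [v for row in y_pred for v in row]
overall = (rmse(flat_true, flat_pred), r2(flat_true, flat_pred))

for name, (e, r) in per_component.items():
    print(f"{name}: RMSE={e:.4f}, R2={r:.4f}")
print(f"overall: RMSE={overall[0]:.4f}, R2={overall[1]:.4f}")
```

Under this pooling assumption the overall RMSE is the root of the mean squared error across every component of every sample, so it need not equal the average of the component-wise RMSEs.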

Abstract

Accurate quantification of multicomponent textile blends by near-infrared (NIR) spectroscopy is hindered by overlapping absorption bands, baseline variation, and nonlinear interactions between fiber components. To address these challenges, we propose a chemically guided Vision Transformer (CG-ViT) that incorporates domain-specific chemical information, such as characteristic absorption peaks and functional group signatures, into the attention mechanism to improve predictive performance and ensure chemically grounded interpretability. The architecture combines a convolutional patch embedding to preserve local spectral continuity, a chemical-prior mask to emphasize wavelength regions of known relevance, and a gating module to adaptively control the influence of these priors for each sample. The approach was evaluated on 533 cotton–polyester–spandex textile samples, each measured at five independent positions, yielding 2665 spectra in total. CG-ViT achieved an overall RMSE of 0.7675 and R² of 0.999 on the test set, with component-wise RMSEs of 0.9443 for cotton, 0.8618 for polyester, and 0.3644 for spandex, all with R² ⩾ 0.998. These results exceeded the performance of established baselines, including PLSR, CNN, ANN, and standard 1D-ViT models. Ablation experiments verified the contribution of each module, and attention analysis demonstrated clear correspondence between model focus and chemically meaningful spectral features. Robustness assessments under cross-validation, Gaussian noise, and baseline drift conditions showed stable accuracy (R² ⩾ 0.97), indicating strong resilience to typical instrumental variations and supporting the method’s application to reliable, real-time, nondestructive textile composition analysis.
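One way to picture how a chemical-prior mask and a per-sample gate might interact with attention is sketched below. This is an illustrative formulation only, not the authors' implementation: the abstract does not specify how the mask enters the attention computation, so the additive bias on the attention logits, the sigmoid gate, the `strength` parameter, and the patch and band wavelength ranges are all assumptions. The convolutional patch embedding is omitted.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def prior_mask(patch_ranges, bands):
    """Return 1.0 for patches that overlap any band of assumed relevance, else 0.0."""
    return [
        1.0 if any(lo < b_hi and b_lo < hi for b_lo, b_hi in bands) else 0.0
        for lo, hi in patch_ranges
    ]

def guided_attention(scores, mask, gate_logit, strength=2.0):
    """Add a gated prior bias to each row of attention logits, then normalise.

    scores:     n x n raw query-key logits (list of lists)
    mask:       length-n prior over key patches
    gate_logit: per-sample scalar; a sigmoid maps it to a gate in (0, 1)
    strength:   hypothetical scale of the prior bias (an assumption)
    """
    g = sigmoid(gate_logit)
    return [softmax([s + g * strength * m for s, m in zip(row, mask)]) for row in scores]

# Hypothetical spectral patches and bands in nm: placeholders, NOT values from the study.
patches = [(1000, 1200), (1200, 1400), (1400, 1600), (1600, 1800)]
bands = [(1450, 1550), (1650, 1750)]
mask = prior_mask(patches, bands)

# Uniform raw logits make the effect of the prior easy to see.
scores = [[0.0] * 4 for _ in range(4)]
weights_on = guided_attention(scores, mask, gate_logit=4.0)    # gate nearly open
weights_off = guided_attention(scores, mask, gate_logit=-20.0)  # gate nearly closed

print(mask)
print([round(w, 3) for w in weights_on[0]])
print([round(w, 3) for w in weights_off[0]])
```

With the gate nearly open, attention shifts toward the patches flagged by the mask; with the gate nearly closed, the weights fall back to what the raw logits alone would give. In a trained model the gate logit would presumably be predicted from the sample itself rather than set by hand.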

Authors

Changjiang Wan

Cuiping Yu

Laihu Peng

Journals

Textile Research Journal

Institutions

Zhejiang Sci-Tech University

Hangzhou Dianzi University

Zhejiang Lab
