What question did this study set out to answer?

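This study aims to address three challenges in packaging appearance defect detection (cross-modal feature alignment, interference from printed textures, and defects that span many scales) by developing PackNet-MF, a multi-modal network that fuses RGB images with point cloud data.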

April 18, 2026

Multi-source information fusion in packaging appearance defect detection

Key Points

Developed the PackNet-MF network using a dual-branch Swin-Transformer backbone.
Implemented the Cross-modal Attention Fusion Mechanism (CMA) for feature alignment.
Applied the Defect Context Awareness Mechanism (DCA) to reduce texture interference.
Utilized the Multi-scale Spatial Adaptive Aggregation Mechanism (MS-AA) for detecting defects across various scales.
Achieved an F1-score of 0.9012 on the self-constructed Pack-Defect dataset, a significant improvement over baseline methods such as U-Net.
Demonstrated a mean Intersection over Union (mIoU) of 0.8473 for tiny defect localization.
Exhibited strong generalization in transfer experiments on the public NEU-Seg dataset.
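
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how an F1-score and a mean Intersection over Union are commonly computed from per-pixel labels. The function names, the flat-list input format, and the toy labels are assumptions made for illustration; the summary does not describe the authors' evaluation code, which may differ (for example in how absent classes are handled).

```python
def f1_score(pred, target):
    """Binary F1 over flat sequences of 0/1 labels (1 = defect pixel)."""
    pairs = list(zip(pred, target))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)  # true positives
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)  # false positives
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)  # false negatives
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mean_iou(pred, target, num_classes):
    """Average per-class IoU; classes absent from both inputs are skipped."""
    pairs = list(zip(pred, target))
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in pairs if p == c and t == c)
        union = sum(1 for p, t in pairs if p == c or t == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy labels (hypothetical): six pixels, class 1 = defect.
pred = [0, 1, 1, 0, 1, 0]
target = [0, 1, 0, 0, 1, 1]
f1 = f1_score(pred, target)
miou = mean_iou(pred, target, num_classes=2)
```

In this toy case there are two true positives, one false positive, and one false negative, so F1 is 4/6, and each class has an IoU of 2/4, so the mean IoU is 0.5.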

Abstract

To address key problems in packaging appearance defect detection, namely difficult cross-modal feature alignment, strong interference from printed textures, and poor compatibility with multi-scale defects, this paper proposes PackNet-MF, a multi-modal defect detection network that fuses RGB images with point cloud data (PCD). PackNet-MF uses a dual-branch Swin-Transformer backbone combined with three core modules, each targeting one of these problems. The Cross-modal Attention Fusion Mechanism (CMA) addresses deformation through dynamic feature alignment, and its adaptive weight adjustment enables more accurate matching of multi-modal data. The Defect Context Awareness Mechanism (DCA) suppresses texture interference, using semantic modeling to distinguish defects from interference. The Multi-scale Spatial Adaptive Aggregation Mechanism (MS-AA) adjusts the receptive field dynamically to detect defects across the full range of scales, from micro-scratches to large damage. Experimental results show that on the self-constructed Pack-Defect dataset, PackNet-MF achieves an F1-score of 0.9012, a significant improvement over baseline methods such as U-Net. It also localizes tiny defects accurately, with a mean Intersection over Union (mIoU) of 0.8473. Transfer experiments on the public NEU-Seg dataset further verify the model’s strong generalization ability.
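
The abstract does not specify how CMA is built internally. As a rough illustration of the general idea of attention-based cross-modal alignment, the sketch below uses plain scaled dot-product cross-attention: each RGB token queries the point-cloud tokens, and the result is blended back with a gate. Everything here is an assumption for illustration, including the names `cross_attend` and `fuse`, the token shapes, and the fixed `gate` value standing in for a learned adaptive weight. It should not be read as the authors' mechanism.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_attend(queries, keys, values):
    """Each query token receives a softmax-weighted mix of the value tokens."""
    scale = math.sqrt(len(keys[0]))
    dim = len(values[0])
    out = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(dim)])
    return out


def fuse(rgb_tokens, pcd_tokens, gate=0.5):
    """Blend each RGB token with the point-cloud context attended to it.

    `gate` is a fixed scalar here (assumption); a trained model would
    typically predict such a weight from the features themselves.
    """
    attended = cross_attend(rgb_tokens, pcd_tokens, pcd_tokens)
    return [[(1 - gate) * r + gate * a for r, a in zip(rt, at)]
            for rt, at in zip(rgb_tokens, attended)]


# Hypothetical toy features: 2 RGB tokens and 3 point-cloud tokens, dim 2.
rgb = [[1.0, 0.0], [0.0, 1.0]]
pcd = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = fuse(rgb, pcd)
```

With `gate=0.0` the output equals the RGB tokens unchanged, and with `gate=1.0` it is purely the attended point-cloud context, which makes the role of the blending weight easy to inspect on small inputs.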
