Plastic greenhouses (PGs) are important infrastructure for modern protected agriculture, and their accurate high-resolution mapping is essential for agricultural planning and environmental protection. However, most existing approaches rely on fully supervised semantic segmentation (FSSS), which requires dense pixel-level annotations and therefore hinders large-scale deployment. Image-level weakly supervised semantic segmentation (WSSS) offers a more scalable alternative, but it suffers from poor-quality pseudo-labels in the initial generation stage and unreliable supervision in the final training stage. Specifically, during pseudo-label generation, coarse class activation maps (CAMs) with overly concentrated high-response areas and insufficient multi-scale detail, together with conventional CAM-to-label conversion methods, struggle to recover accurate boundaries and preserve object completeness. Moreover, the noise that inevitably remains in the initial pseudo-labels may mislead model optimization and degrade generalization performance. To address these challenges, we propose a novel image-level WSSS framework, termed SAM-assisted Semantic Segmentation (SASS), for large-scale PG mapping from high-resolution images using only image-level labels. SASS integrates multi-level high-resolution CAMs (MHCAMs) with regional priors derived from the Segment Anything Model (SAM) to generate more complete and boundary-aware pseudo-labels. In addition, a noise suppression strategy aided by region filtering is developed to enhance label reliability for training the final segmentation model. Finally, a localization-then-extraction strategy is adopted to efficiently map the PG distribution at 2.4 m resolution in Shandong Province, China, with overall accuracy and F1 score both exceeding 90%. Extensive experiments and comparisons demonstrate that SASS offers advantages in accuracy and efficiency and can provide a scalable and transferable solution for large-scale mapping applications.
• Large-scale PG mapping on HRRSI using only low-cost image-level labels.
• Improve pseudo-label quality by combining MHCAMs with SAM-derived regional priors.
• Suppress noise via noisy-label correction and regional filtering.
• Boost mapping efficiency via a localization-then-extraction strategy.
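To make the idea of combining activation maps with regional priors concrete, the following is a minimal sketch and not the SASS implementation. It assumes a single normalized CAM and a precomputed integer region map (for example, one derived from SAM masks), and it labels a whole region as greenhouse when the region's mean activation is high enough. The function name `region_vote_pseudo_label`, the threshold `fg_thresh`, and the area cutoff `min_area` are hypothetical choices made for illustration.

```python
import numpy as np


def region_vote_pseudo_label(cam, regions, fg_thresh=0.5, min_area=4):
    """Turn a CAM into a binary pseudo-label by voting inside regions.

    cam:     (H, W) float array, activations assumed scaled to [0, 1].
    regions: (H, W) int array of region ids; 0 means "no region".
    Returns an (H, W) uint8 array: 1 = greenhouse, 0 = background.
    All parameters are illustrative assumptions, not values from the paper.
    """
    label = np.zeros(cam.shape, dtype=np.uint8)
    for rid in np.unique(regions):
        if rid == 0:
            continue  # pixels outside any region stay background
        mask = regions == rid
        if mask.sum() < min_area:
            continue  # simple region filtering: ignore tiny regions
        if cam[mask].mean() >= fg_thresh:
            label[mask] = 1  # the whole region inherits the foreground label
    return label


# Toy example: the CAM peaks at one pixel, but the region prior
# extends the label to the full 3x3 object.
cam = np.full((6, 6), 0.1)
cam[0:3, 0:3] = 0.6
cam[1, 1] = 0.9
cam[0, 5] = 1.0  # isolated high response, likely noise

regions = np.zeros((6, 6), dtype=int)
regions[0:3, 0:3] = 1  # object region
regions[3:6, 3:6] = 2  # low-activation region
regions[0, 5] = 3      # single-pixel region, removed by min_area

pseudo_label = region_vote_pseudo_label(cam, regions)
```

In this toy case the label covers the full 3x3 region rather than only the peak pixel, while the isolated single-pixel response is discarded. This mirrors, in a very simplified form, the completeness and noise-suppression goals described above.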
Zhang et al. (Fri,) studied this question.