• Machine learning models were used to predict cotton lint yield using Sentinel-2 imagery.
• Yield prediction models were evaluated for generalizability at multiple spatial and temporal scales.
• NDVI-based models achieved stable and robust performance across different scales.
• CNN-LSTM and EGPR effectively captured non-linear yield–vegetation relationships.
• Optimal NDVI resolution (20 m–60 m) improved accuracy while reducing computation time.

Accurate yield prediction at various spatial and temporal scales is essential for optimizing crop production and management strategies. This study applied machine learning models to predict cotton yield using time-series Sentinel-2 imagery and assessed the generalizability and effects of resolution at the sub-field and field scales across multiple seasons in the Texas High Plains. High-resolution yield monitor data were collected from irrigated and rainfed fields in three counties during 2022–2024 for model training and testing. Sentinel-2 imagery and yield monitor data at 10 m resolution were resampled to 20 m, 60 m, and 100 m scales to examine the effects of spatial resolution on model performance. Model generalizability was evaluated using independent-year validation at the sub-field (2022–2023) and field scales (2017 and 2022). Machine learning performed well on all data at the 80/20 split (R² = 0.79–0.91 for Multiple Linear Regression and 0.86–0.97 for Exponential Gaussian Process Regression). However, with independent-year validation, their performance declined (R² = 0.63–0.85), highlighting the importance of testing model generalizability across different years and climate conditions. At the sub-field scale, model performance generally improved as the spatial resolution of input data decreased from 10 m (R² = 0.63–0.79) to 100 m (R² = 0.72–0.87). In contrast, performance at the field scale was relatively insensitive to predictor resolution from 10 m to 60 m.
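The resolution comparison described above can be illustrated with a minimal sketch. It assumes NDVI is computed from the red and near-infrared bands and that coarser grids are produced by simple block averaging; the abstract does not state the resampling method actually used, so `block_mean`, the synthetic reflectance values, and the 60 × 60 pixel grid are all hypothetical.

```python
import numpy as np

def ndvi(nir, red, eps=1e-9):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    # eps guards against division by zero on empty or masked pixels
    return (nir - red) / (nir + red + eps)

def block_mean(grid, factor):
    """Aggregate a 2-D grid by averaging non-overlapping factor x factor blocks.

    Assumption: block averaging stands in for whatever resampling the
    authors used. Trailing rows/columns that do not fill a block are dropped.
    """
    grid = np.asarray(grid, dtype=float)
    rows = (grid.shape[0] // factor) * factor
    cols = (grid.shape[1] // factor) * factor
    trimmed = grid[:rows, :cols]
    blocks = trimmed.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))

# Synthetic 10 m reflectance for a 600 m x 600 m area (illustrative values only)
rng = np.random.default_rng(0)
red_10m = rng.uniform(0.03, 0.12, size=(60, 60))
nir_10m = rng.uniform(0.30, 0.55, size=(60, 60))
ndvi_10m = ndvi(nir_10m, red_10m)

# Factors 1, 2, 6, 10 correspond to 10 m, 20 m, 60 m, and 100 m pixels
ndvi_by_res = {10 * f: block_mean(ndvi_10m, f) for f in (1, 2, 6, 10)}
for res, grid in ndvi_by_res.items():
    print(f"{res} m: grid shape {grid.shape}")
```

Coarser grids contain far fewer pixels (a factor of 100 fewer at 100 m than at 10 m), which is consistent with the reduction in computation time the study reports, although the sketch does not reproduce any of its accuracy figures.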
More complex models, including CNN-LSTM (Convolutional Neural Network-Long Short-Term Memory) and EGPR (Exponential Gaussian Process Regression), slightly outperformed NDVI-based models at the sub-field scale when using four-band reflectance as input. However, monthly median NDVI (R² = 0.65–0.75) provided effective predictions at the field scale, outperforming four-band reflectance (R² = 0.58–0.69). Prediction improved as resolution decreased from 10 m to 20 m (field scale) or 60 m (sub-field scale), while computational time was reduced. Overall, CNN-LSTM and EGPR using monthly median NDVI demonstrated high performance in capturing non-linear interactions and predicting cotton yield. This study reveals the importance of selecting appropriate predictors and validating models across independent years when using machine learning models for crop yield prediction at different scales.
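The contrast between a random 80/20 split and independent-year validation can be sketched as follows. This is an illustrative setup, not the study's pipeline: the data are synthetic, the per-year offsets in `year_effect` are hypothetical stand-ins for season-to-season climate differences, and ordinary least squares stands in for the Multiple Linear Regression baseline.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination, R^2 = 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def fit_mlr(X, y):
    """Fit multiple linear regression with an intercept by least squares."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

# Synthetic samples: five monthly median NDVI predictors per pixel (hypothetical)
rng = np.random.default_rng(42)
years = np.repeat([2022, 2023, 2024], 200)
X = rng.uniform(0.2, 0.9, size=(years.size, 5))
year_effect = {2022: -60.0, 2023: 0.0, 2024: 50.0}  # assumed seasonal shifts
offset = np.array([year_effect[int(yr)] for yr in years])
weights = np.array([100.0, 250.0, 600.0, 450.0, 150.0])
y = 400.0 + X @ weights + offset + rng.normal(0.0, 60.0, years.size)

# Random 80/20 split: every year appears in both training and test sets
idx = rng.permutation(years.size)
cut = int(0.8 * idx.size)
train, test = idx[:cut], idx[cut:]
r2_random = r_squared(y[test], predict_mlr(fit_mlr(X[train], y[train]), X[test]))

# Independent-year validation: hold out one whole year at a time
r2_by_year = {}
for held_out in np.unique(years):
    is_test = years == held_out
    coef = fit_mlr(X[~is_test], y[~is_test])
    r2_by_year[int(held_out)] = r_squared(y[is_test], predict_mlr(coef, X[is_test]))

print("random 80/20 R^2:", round(float(r2_random), 3))
print("held-out-year R^2:", {k: round(float(v), 3) for k, v in r2_by_year.items()})
```

Holding out whole years keeps any season-specific shift out of the training data, which is why this kind of score is a stricter test of generalizability than a random split over pooled years.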
Sanai Li
Wenxuan Guo
Smart Agricultural Technology
Texas Tech University
Li et al. (Sun,) studied this question.
www.synapsesocial.com/papers/69ba429c4e9516ffd37a2fcd — DOI: https://doi.org/10.1016/j.atech.2026.101999