What question did this study set out to answer?

The study aimed to predict cotton yield from Sentinel-2 imagery using machine learning models and to evaluate how well those models generalize across spatial scales and years.

March 18, 2026

Open Access

Multi-Scale Cotton Yield Prediction Using Machine Learning and Sentinel-2 Imagery

Key Points

Applied machine learning models to predict cotton yield using time-series Sentinel-2 imagery.
Evaluated model performance across different spatial resolutions (10 m, 20 m, 60 m, 100 m).
Conducted independent-year validation to assess generalizability at sub-field and field scales.
Used high-resolution yield monitor data from irrigated and rainfed fields.
With an 80/20 split, models achieved R² = 0.79–0.91 for Multiple Linear Regression and R² = 0.86–0.97 for Exponential Gaussian Process Regression.
Performance declined under independent-year validation (R² = 0.63–0.85).
At the sub-field scale, model performance generally improved as input resolution was coarsened from 10 m to 100 m.
Monthly median NDVI outperformed four-band reflectance in field-scale predictions.
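
The gap between the 80/20 results and the independent-year results comes from how the held-out data are chosen. The following is a minimal sketch of the two schemes and of the R² metric, not the authors' code: the sample records, field names (`year`, `ndvi`, `yield_kg_ha`), and values are hypothetical, and the shuffled 80/20 split is an assumption about how the split was drawn.

```python
import random


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def random_split(samples, test_fraction=0.2, seed=0):
    """Shuffle all samples and hold out a fraction, regardless of year."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def independent_year_split(samples, test_year):
    """Train on every year except test_year; test only on test_year."""
    train = [s for s in samples if s["year"] != test_year]
    test = [s for s in samples if s["year"] == test_year]
    return train, test


# Hypothetical samples: one record per grid cell, with made-up values.
samples = [
    {"year": year, "ndvi": 0.30 + 0.05 * i, "yield_kg_ha": 800 + 150 * i}
    for year in (2022, 2023, 2024)
    for i in range(10)
]

# Random split: the test set mixes years the model has already seen.
train_random, test_random = random_split(samples)

# Independent-year split: the test year never appears in training.
train_year, test_year_set = independent_year_split(samples, test_year=2024)
```

With a shuffled split, test cells share seasons (and often fields) with training cells, so the score mostly reflects interpolation. Holding out a whole year tests whether the model transfers to weather it has not seen, which is the stricter question.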

Abstract

• Machine learning models were used to predict cotton lint yield using Sentinel-2 imagery.
• Yield prediction models were evaluated for generalizability at multiple spatial and temporal scales.
• NDVI-based models achieved stable and robust performance across different scales.
• CNN-LSTM and EGPR effectively captured non-linear yield–vegetation relationships.
• Optimal NDVI resolution (20 m–60 m) improved accuracy while reducing computation time.

Accurate yield prediction at various spatial and temporal scales is essential for optimizing crop production and management strategies. This study applied machine learning models to predict cotton yield using time-series Sentinel-2 imagery and assessed model generalizability and the effects of resolution at the sub-field and field scales across multiple seasons in the Texas High Plains. High-resolution yield monitor data were collected from irrigated and rainfed fields in three counties during 2022–2024 for model training and testing. Sentinel-2 imagery and yield monitor data at 10 m resolution were resampled to 20 m, 60 m, and 100 m to examine the effects of spatial resolution on model performance. Model generalizability was evaluated using independent-year validation at the sub-field (2022–2023) and field scales (2017 and 2022). The machine learning models performed well on all data with an 80/20 split (R² = 0.79–0.91 for Multiple Linear Regression and R² = 0.86–0.97 for Exponential Gaussian Process Regression). However, with independent-year validation, their performance declined (R² = 0.63–0.85), highlighting the importance of testing model generalizability across different years and climate conditions. At the sub-field scale, model performance generally improved as the input data were coarsened from 10 m (R² = 0.63–0.79) to 100 m (R² = 0.72–0.87). In contrast, performance at the field scale was relatively insensitive to predictor resolution from 10 m to 60 m.
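
The abstract does not say which resampling method was used. As one plausible illustration, the sketch below aggregates a 10 m grid to a coarser one by averaging non-overlapping blocks (a factor of 2 for 20 m, 6 for 60 m, 10 for 100 m). The function name `block_mean` and the grid values are hypothetical.

```python
def block_mean(grid, factor):
    """Aggregate a 2-D grid by averaging non-overlapping factor x factor blocks.

    Rows and columns that do not fill a complete block are dropped.
    """
    n_rows = len(grid) // factor
    n_cols = len(grid[0]) // factor
    coarse = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            # Collect every fine-resolution cell that falls inside this block.
            block = [
                grid[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ]
            row.append(sum(block) / len(block))
        coarse.append(row)
    return coarse


# Hypothetical 4 x 4 grid of 10 m cell values (e.g., yield in arbitrary units).
grid_10m = [
    [1, 3, 5, 7],
    [1, 3, 5, 7],
    [2, 4, 6, 8],
    [2, 4, 6, 8],
]

grid_20m = block_mean(grid_10m, factor=2)  # 10 m -> 20 m: a 2 x 2 grid
```

Averaging neighboring cells smooths out cell-level noise in both the imagery and the yield monitor data, which is one common explanation for higher R² at coarser sub-field resolutions.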
More complex models, including CNN-LSTM (a Convolutional Neural Network combined with a Long Short-Term Memory network) and EGPR (Exponential Gaussian Process Regression), performed slightly better with four-band reflectance than with NDVI inputs at the sub-field scale. However, monthly median NDVI (R² = 0.65–0.75) provided effective predictions at the field scale, outperforming four-band reflectance (R² = 0.58–0.69). Coarsening the resolution from 10 m to 20 m (field scale) or 60 m (sub-field scale) improved prediction while reducing computational time. Overall, CNN-LSTM and EGPR using monthly median NDVI demonstrated high performance in capturing non-linear interactions and predicting cotton yield. This study highlights the importance of selecting appropriate predictors and validating models across independent years when using machine learning models for crop yield prediction at different scales.
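
For readers unfamiliar with the monthly median NDVI predictor, here is a minimal sketch of how such a feature can be derived for one pixel or field. NDVI is the standard index (NIR - Red) / (NIR + Red), which for Sentinel-2 corresponds to bands B8 and B4. The reflectance values and the helper name `monthly_median_ndvi` are hypothetical; the abstract does not describe the authors' compositing steps.

```python
from collections import defaultdict
from statistics import median


def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def monthly_median_ndvi(observations):
    """Map each month to the median NDVI of that month's observations.

    observations: iterable of (month, nir, red) tuples for one pixel or field.
    """
    by_month = defaultdict(list)
    for month, nir, red in observations:
        by_month[month].append(ndvi(nir, red))
    return {month: median(values) for month, values in sorted(by_month.items())}


# Hypothetical acquisitions: (month, NIR reflectance, red reflectance).
observations = [
    (6, 0.30, 0.10),  # NDVI = 0.50
    (6, 0.35, 0.05),  # NDVI = 0.75
    (6, 0.20, 0.20),  # NDVI = 0.00, e.g. a cloud-affected scene
    (7, 0.50, 0.10),  # NDVI of about 0.67
]

features = monthly_median_ndvi(observations)
```

The median is less sensitive than the mean to a single contaminated acquisition: in the June example the outlier of 0.00 leaves the monthly value at about 0.50.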

Authors

Sanai Li

Wenxuan Guo

Journals

Smart Agricultural Technology

Institutions

Texas Tech University
