Accurate prediction of cereal yields is critical for food security, particularly in Sahelian regions characterized by high climatic variability. This study develops a machine learning framework integrating dynamic agroclimatic variables (precipitation, temperature, soil nutrients) with FAO production statistics for Senegal over 2000–2024. Correlation-based feature selection indicated that MODIS-derived vegetation indices (NDVI, EVI, SAVI) were less relevant to yield, so they were excluded. Several models were evaluated, including Random Forest, XGBoost, CatBoost, and a Bidirectional LSTM (BiLSTM) explicitly designed to capture temporal dependencies. The BiLSTM achieved the highest predictive accuracy (R² = 0.94, RMSE = 98.53), followed by CatBoost (R² = 0.80, RMSE = 216.21), XGBoost (R² = 0.74, RMSE = 243.07), and Random Forest (R² = 0.72, RMSE = 251.46). Robustness was assessed using the Diebold-Mariano test, and interpretability was explored with SHAP values. The study demonstrates that agroclimatic and production variables dominate over vegetation indices in predicting yields and highlights the trade-off between the superior accuracy of deep learning models and their higher computational cost. These results provide a reliable and interpretable framework for yield forecasting in Sahelian agriculture, emphasizing both methodological rigor and practical applicability.
Gueye Pape El Hadji Abdoulaye studied this question.
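The abstract names the Diebold-Mariano test but does not say how it was configured. As an illustration only, the sketch below shows one common form of the test: a one-step-ahead comparison of two forecast series under squared-error loss, with a two-sided p-value from the standard normal approximation. The function name `diebold_mariano`, the choice of loss, and the horizon are assumptions made for this example, not details taken from the study. With short annual series, a small-sample adjustment such as the Harvey, Leybourne and Newbold correction is often applied; this sketch omits it.

```python
import math


def diebold_mariano(actual, pred_a, pred_b):
    """One-step-ahead Diebold-Mariano test under squared-error loss.

    Hypothetical helper for illustration. Returns (dm_stat, p_value).
    A negative statistic means pred_a has the lower mean squared error.
    """
    n = len(actual)
    if not (n == len(pred_a) == len(pred_b)) or n < 2:
        raise ValueError("need three equal-length series with at least 2 points")

    # Loss differential d_t = (y_t - a_t)^2 - (y_t - b_t)^2
    d = [(y - a) ** 2 - (y - b) ** 2 for y, a, b in zip(actual, pred_a, pred_b)]
    d_bar = sum(d) / n

    # For horizon h = 1 the long-run variance reduces to the
    # lag-0 autocovariance of the loss differential.
    gamma0 = sum((x - d_bar) ** 2 for x in d) / n
    if gamma0 == 0:
        raise ValueError("loss differential is constant; statistic undefined")

    dm_stat = d_bar / math.sqrt(gamma0 / n)

    # Two-sided p-value: 2 * (1 - Phi(|DM|)) = erfc(|DM| / sqrt(2))
    p_value = math.erfc(abs(dm_stat) / math.sqrt(2))
    return dm_stat, p_value
```

Calling `diebold_mariano(observed, forecast_a, forecast_b)` on held-out yields and two models' predictions returns the statistic and p-value; a small p-value suggests the difference in accuracy between the two models is unlikely to be due to chance alone.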