Abstract
Food loss and waste pose formidable challenges to global food security and sustainability, and they are exacerbated by inaccurate shelf-life predictions for perishable products such as marinated meats. This study introduces a novel in silico framework that integrates machine learning (ML) and text mining (TM) to improve the precision of shelf-life estimates for these products. First, a diverse dataset was assembled by combining TM techniques with manual literature review. An evaluation of 16 encoding methods and 9 ML algorithms identified the combination of ‘leaveoneout’ encoding and the RandomForest algorithm as the most effective, achieving an accuracy of 98%, an F1-score of 0.97, and a Matthews correlation coefficient (MCC) of 0.96. Feature importance analysis, SHAP (SHapley Additive exPlanations) values, and partial dependence plots highlighted the significant roles of preservatives (score 0.381), sterilization techniques (score 0.298), temperature control (score 0.197), and packaging (score 0.124), and were also used to explore potential interactions among these factors and their additive or synergistic effects on shelf life. Additionally, a user-friendly graphical user interface (GUI) was developed to support data input, shelf-life prediction, and model retraining with user-submitted data. This integrative approach provides a rapid, cost-effective, and accessible way to predict the shelf life of marinated meat products. It is potentially extendable to a broader range of products and may ultimately contribute to reduced food waste and enhanced sustainability.
Zhang et al. studied this question.
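The leave-one-out encoding named in the abstract can be illustrated with a minimal sketch. The abstract does not say which implementation or target representation the authors used, so everything below is an assumption made for illustration: the function name `leave_one_out_encode`, the numeric target (shelf life in days), the global-mean fallback for categories that occur only once, and the toy data. The idea is that each categorical value is replaced by the mean target of the other rows sharing that category, which keeps a row's own label out of its encoded feature.

```python
from collections import defaultdict


def leave_one_out_encode(categories, targets):
    """Replace each category with the mean target of the OTHER rows in that category.

    Hypothetical helper for illustration only; not the authors' implementation.
    Categories seen only once fall back to the global target mean (an assumption).
    """
    if len(categories) != len(targets):
        raise ValueError("categories and targets must have the same length")
    if not targets:
        return []

    # Accumulate the per-category target sum and row count.
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, target in zip(categories, targets):
        sums[cat] += target
        counts[cat] += 1

    global_mean = sum(targets) / len(targets)

    encoded = []
    for cat, target in zip(categories, targets):
        if counts[cat] > 1:
            # Exclude the current row's own target from the category mean.
            encoded.append((sums[cat] - target) / (counts[cat] - 1))
        else:
            # No other rows share this category, so use the global mean.
            encoded.append(global_mean)
    return encoded


# Toy data (invented for this sketch, not taken from the study).
preservative = ["nisin", "nisin", "none", "nisin", "none", "sorbate"]
shelf_life_days = [30, 40, 10, 50, 14, 25]

encoded = leave_one_out_encode(preservative, shelf_life_days)
for cat, value in zip(preservative, encoded):
    print(f"{cat:8s} -> {value:.2f}")
```

In a full pipeline, the encoded column would be passed to a classifier such as a random forest. Held-out rows would be encoded with plain per-category means learned from the training data, because their own targets are unknown at prediction time.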