Guava (Psidium guajava L.) is an economically important tropical fruit valued for its nutritional and therapeutic properties. Despite its potential, progress in guava breeding has been limited by narrow genetic variability, high heterozygosity, and complex trait inheritance. A comprehensive understanding of phenotypic and biochemical diversity is therefore essential to enhance breeding efficiency and trait selection. A total of 126 guava genotypes were evaluated for 35 morphological, biochemical, and yield-related traits over two years (2023–2024). Data were analyzed using a combination of Analysis of Variance (ANOVA), multivariate statistics (PCA, MDS, t-SNE, hierarchical clustering), and machine learning (Random Forest) to assess genetic diversity, trait associations, and genotype differentiation. Population structure and model stability were validated using Mahalanobis distance, bootstrap resampling, silhouette analysis, and cross-validation procedures. Significant variability was recorded for key morphological and biochemical traits, including fruit weight, pulp thickness, total soluble solids (TSS), ascorbic acid, total phenolics, flavonoids, and total antioxidant activity (TAA). PCA revealed that the first three components explained 70.9% of total variation, with fruit weight, fruit length, and ascorbic acid being major contributors. Hierarchical clustering grouped the genotypes into four distinct clusters, indicating broad genetic diversity, while t-SNE and MDS analyses corroborated the pattern of moderate population stratification. Random Forest modeling identified Nspf, Hsw, Ll, Flav, Asc, and TAA as key discriminatory traits, validated through Out-of-Bag (OOB) accuracy and five-fold cross-validation. The integration of multivariate and machine learning approaches provided a robust framework for identifying trait architecture and selection potential in guava. 
High-ranking morphological and biochemical traits, particularly those related to yield and antioxidant properties, serve as reliable indicators for ideotype design and genomic-assisted breeding. The identified superior genotypes (e.g., H-2/20, CHG28, H15B/9) offer valuable material for the development of high-yielding, nutritionally enhanced, and climate-resilient cultivars.
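As an illustration of how a figure such as "the first three components explained 70.9% of total variation" is typically computed, the following is a minimal sketch and not the authors' pipeline. It assumes a genotype-by-trait matrix with traits standardized before PCA. The data are synthetic placeholders with the same shape as the study (126 genotypes, 35 traits), and the function name `pca_explained_variance` is hypothetical.

```python
import numpy as np


def pca_explained_variance(X):
    """Return per-component explained-variance ratios for a genotype x trait matrix."""
    X = np.asarray(X, dtype=float)
    # Standardize each trait so that differing units do not dominate the components.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Squared singular values of the standardized matrix are proportional
    # to the variance captured by each principal component.
    s = np.linalg.svd(Z, compute_uv=False)
    var = s**2 / (Z.shape[0] - 1)
    return var / var.sum()


# Synthetic placeholder data (an assumption, not the study's measurements):
# three hidden factors plus noise, shaped like 126 genotypes x 35 traits.
rng = np.random.default_rng(0)
n_genotypes, n_traits = 126, 35
latent = rng.normal(size=(n_genotypes, 3))
loadings = rng.normal(size=(3, n_traits))
X = latent @ loadings + 0.5 * rng.normal(size=(n_genotypes, n_traits))

ratios = pca_explained_variance(X)
cum3 = ratios[:3].sum()
print(f"First three PCs explain {cum3:.1%} of total variance (synthetic data)")
```

With real data, `X` would hold the measured trait values, and the leading entries of `ratios` would correspond to the component contributions reported in the abstract.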
Daya Shankar Mishra
Vikas Yadav
M. K. Berwal
BMC Plant Biology
Indian Agricultural Research Institute
Indian Council of Agricultural Research
Chaudhary Charan Singh Haryana Agricultural University
Mishra et al. (Wed,) studied this question.
www.synapsesocial.com/papers/69a76007c6e9836116a2c726 — DOI: https://doi.org/10.1186/s12870-026-08216-3