ABSTRACT Metagenomic profiling has advanced the understanding of microbe-host interactions. However, widely used read-based approaches are limited by incomplete reference databases and the inability to resolve strain-level variation. Here, we present a scalable, genome-resolved framework that integrates population-specific metagenome-assembled genomes (MAGs) to discover novel species, within-species diversity, and disease associations. From 1,878 deeply sequenced samples in the Estonian Microbiome Cohort (EstMB-deep), we reconstructed 84,762 MAGs representing 2,257 species, including 353 (15.6%) previously uncharacterized species that reach up to 30% relative abundance in some individuals. We integrated these MAGs with the Unified Human Gastrointestinal Genome collection to create an expanded reference (GUTrep), enabling profiling of 2,509 EstMB individuals and testing associations with 33 prevalent diseases. Of the 25 diseases with significant associations, 8 involved newly identified species, underscoring the value of population-specific MAGs. To quantify within-species diversity, we developed the genome unit number (GUN), a novel MAG-based metric that informed within-species analyses. Based on normalized GUN, we prioritized Odoribacter splanchnicus, a prevalent species with the lowest within-species heterogeneity, yielding sufficient power for a within-species association study. We identified two dominant genome units, GU-N1 and GU-N2, with distinct gene repertoires and divergent disease associations. Notably, GU-N1 was negatively associated with gastritis, duodenitis, and hypertensive heart disease, associations undetected at the species level. Our study expands the human gut reference landscape, demonstrates the importance of population-specific MAGs for uncovering novel microbial diversity, and reveals new disease associations at the within-species level that are obscured at higher taxonomic levels, highlighting the need for genome-resolved approaches in microbiome research.

IMPORTANCE Microbiome studies increasingly recognize that species-level profiles can mask critical within-species differences relevant to health and disease. However, our work shows that within-species diversity varies drastically across gut microbes, with some species exhibiting almost as many distinct within-species clusters as recovered genomes, making association studies at the within-species level essentially intractable. To address this, we introduce the genome unit number (GUN), a scalable metric for quantifying within-species structure. Using GUN, we demonstrate that only species with limited within-species diversity, such as Odoribacter splanchnicus, currently allow for robust within-species association testing. These findings emphasize the need to systematically evaluate species structure across the gut microbiome and call for the development of new computational and statistical approaches to enable meaningful within-species analyses in highly diverse species.
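
The abstract does not spell out how GUN is computed, so the following is only an illustrative sketch of the general idea of counting within-species clusters relative to recovered genomes. It assumes, hypothetically, that genome units are single-linkage clusters of MAGs whose pairwise similarity (for example, average nucleotide identity) meets a threshold, and that normalized GUN is the number of units divided by the number of MAGs. The function names, the 0.99 threshold, and the toy similarity values are all assumptions made for illustration and are not taken from the study.

```python
from itertools import combinations


def genome_unit_number(genomes, similarity, threshold=0.99):
    """Count hypothetical 'genome units': single-linkage clusters of genomes
    whose pairwise similarity is >= threshold (threshold is an assumption)."""
    parent = {g: g for g in genomes}

    def find(g):
        # Union-find lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b in combinations(genomes, 2):
        # Pairs missing from the table are treated as dissimilar (0.0).
        value = similarity.get((a, b), similarity.get((b, a), 0.0))
        if value >= threshold:
            parent[find(a)] = find(b)

    return len({find(g) for g in genomes})


def normalized_gun(genomes, similarity, threshold=0.99):
    """Assumed normalization: units per recovered genome (1.0 means every
    genome forms its own unit)."""
    return genome_unit_number(genomes, similarity, threshold) / len(genomes)


# Toy data, invented for illustration only.
mags = ["mag1", "mag2", "mag3", "mag4"]
toy_similarity = {
    ("mag1", "mag2"): 0.995,
    ("mag3", "mag4"): 0.992,
    ("mag1", "mag3"): 0.970,
}

gun = genome_unit_number(mags, toy_similarity)
norm = normalized_gun(mags, toy_similarity)
```

Under these assumptions, a normalized value close to 1 corresponds to the situation described above in which a species has almost as many within-species clusters as recovered genomes, while a low value corresponds to a species with few dominant genome units.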
Pantiukh et al. studied this question.
www.synapsesocial.com/papers/69ba424e4e9516ffd37a2640 — DOI: https://doi.org/10.1128/msystems.00114-26
Kateryna Pantiukh
Kertu Liis Krigul
Oliver Aasmets
mSystems
University of Tartu