Abstract
Metagenomic binning is essential for reconstructing prokaryotic genomes from metagenomic samples. We benchmarked binning tools on Critical Assessment of Metagenome Interpretation (CAMI)-simulated, custom-simulated, and real metagenomic datasets, focusing primarily on short-read sequencing data. Our analysis highlights four critical factors influencing binning efficacy: (i) sequencing depth and taxonomic complexity strongly affect binning performance, and CAMI-simulated benchmarking datasets are substantially less complex than human gut and environmental metagenomes; (ii) chimeric genome rates vary widely across tools; (iii) multi-sample binning is most effective with about 20 samples, as using too few or too many samples can reduce its benefits; and (iv) binning efficacy is lower for single-end sequencing samples because of reduced contig quality and fragmented assemblies. Neural network-based tools consistently outperformed others in genome recovery from both real samples and simulated samples with realistic taxonomic complexity, though at higher computational cost. By integrating and refining genome bins from the top three binning tools, we recovered >30% more high-quality genomes than previous methods. This study provides practical guidance for improving metagenomic binning to facilitate the reconstruction of prokaryotic genomes.
Jungyeon Kim
Nayeon Kim
Jun Hyung
Nature Communications
Yonsei University
Incheon National University
Kim et al. studied this question.
www.synapsesocial.com/papers/69e07dad2f7e8953b7cbeabb — DOI: https://doi.org/10.1038/s41467-026-71521-w