The advancement of sequencing technologies has generated an unprecedented volume of single-cell multi-omics data, providing new opportunities for biological discovery and medical research. However, due to the high heterogeneity across different omics types, effective integration of single-cell multi-omics data remains a critical challenge. Existing methods generally ignore the graph structure information among cells or resort to additional knowledge to construct the cell graphs, leading to suboptimal performance and potentially limited practical utility. In this study, we propose a novel Graph-embedded Deep Generative Clustering model (GeDGC) for single-cell multi-omics data integration. Specifically, GeDGC simultaneously learns the shared latent representations and cluster factors across multiple omics by leveraging Gaussian mixture models. Moreover, we impose the graph embedding constraint on both the latent representations and the cluster assignments to ensure the preservation of intrinsic local data structure among cells. As a result, our model captures complex correlations across omics and obtains informative shared latent embeddings for downstream tasks. Extensive experimental results with seventeen competing methods on ten datasets confirm the superiority of GeDGC in single-cell multi-omics data integration.
Liang et al. studied this question.
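The two ingredients named in the abstract, Gaussian-mixture cluster assignments on a shared latent space and a graph constraint applied to both the embeddings and the assignments, can be illustrated with a minimal NumPy sketch. This is not the GeDGC implementation. The function names (`gmm_responsibilities`, `knn_adjacency`, `graph_penalty`), the diagonal covariances, the k-nearest-neighbour graph, and the Laplacian form of the penalty are all assumptions chosen for illustration, and the latent embeddings are simulated rather than learned by a generative network.

```python
import numpy as np

def gmm_responsibilities(z, means, variances, weights):
    """Soft cluster assignments gamma[i, k] under a diagonal-covariance Gaussian mixture."""
    diff = z[:, None, :] - means[None, :, :]                      # shape (n, K, d)
    log_pdf = -0.5 * np.sum(
        diff ** 2 / variances[None] + np.log(2.0 * np.pi * variances[None]), axis=2
    )
    log_post = np.log(weights)[None, :] + log_pdf                 # unnormalised log posterior
    log_post -= log_post.max(axis=1, keepdims=True)               # numerical stability
    gamma = np.exp(log_post)
    return gamma / gamma.sum(axis=1, keepdims=True)

def knn_adjacency(x, k):
    """Symmetric binary k-nearest-neighbour graph built from the data alone (assumed graph type)."""
    d2 = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=2)     # pairwise squared distances
    np.fill_diagonal(d2, np.inf)                                  # exclude self-neighbours
    idx = np.argsort(d2, axis=1)[:, :k]
    n = x.shape[0]
    a = np.zeros((n, n))
    a[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(a, a.T)                                     # symmetrise

def graph_penalty(f, adjacency):
    """Laplacian smoothness term: trace(f^T L f) = 0.5 * sum_ij A_ij * ||f_i - f_j||^2."""
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    return float(np.trace(f.T @ laplacian @ f))

# Simulated shared latent embeddings for 40 cells in two groups (illustrative data only).
rng = np.random.default_rng(0)
z = np.vstack([rng.normal(-3.0, 0.5, (20, 2)), rng.normal(3.0, 0.5, (20, 2))])

# Hypothetical mixture parameters; in a trained model these would be learned.
means = np.array([[-3.0, -3.0], [3.0, 3.0]])
variances = np.full((2, 2), 0.25)
weights = np.array([0.5, 0.5])

gamma = gmm_responsibilities(z, means, variances, weights)
adj = knn_adjacency(z, k=5)

# The constraint is applied to both the embeddings and the cluster assignments.
loss = graph_penalty(z, adj) + graph_penalty(gamma, adj)
```

In this sketch the penalty is small when neighbouring cells in the graph have similar embeddings and similar assignment vectors, which is one common way to encode preservation of local structure. How GeDGC actually builds its cell graph and weights the constraint against the generative objective is not specified in the abstract.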