Multi-omics datasets capture complementary aspects of biological systems and are central to modern machine learning applications in biology and medicine. Existing graph-based integration methods typically construct a separate graph for each omics type and focus primarily on intra-omic relationships. As a result, they often overlook cross-omics regulatory signals (bidirectional interactions across omics layers) that are critical for modeling complex cellular processes. A second major challenge is missing or incomplete omics data: many current approaches either degrade substantially in performance or exclude patients lacking one or more omics modalities. To address these limitations, we introduce MultiGEOmics, an intermediate-level graph integration framework that explicitly incorporates regulatory signals across omics types during graph representation learning and models biologically inspired omics-specific and cross-omics dependencies. MultiGEOmics learns robust cross-omics embeddings that remain reliable even when some modalities are partially missing. We evaluated MultiGEOmics on eleven datasets spanning cancer and Alzheimer's disease under zero, moderate, and high missing-rate scenarios. MultiGEOmics consistently maintains strong predictive performance across all missing-data conditions while offering interpretability by identifying the most influential omics types and features for each prediction task. The source code and documentation for MultiGEOmics are available at https://github.com/bozdaglab/MultiGEOmics.
Pijani et al. studied this question.