Reference-based Lattice Transfer Embedding Analysis (RefLaTEA) is a dimensionality reduction and interpretation framework designed to contextualize small experimental datasets within the structure of large-scale environmental or observational data. The method constructs a reference embedding (the "lattice") from a background dataset using Uniform Manifold Approximation and Projection (UMAP). Additional datasets of interest can then be projected into this lattice through transfer embedding, allowing researchers to evaluate whether experimental responses fall within or outside the range of natural variation. RefLaTEA is especially suited to environmental and omics studies, in which field data are large but noisy while experimental data are clean but sparse. We demonstrate that RefLaTEA is robust to moderate changes in UMAP parameters (neighborhood size, minimum distance) and remains interpretable even when responses are subtle. The core RefLaTEA pipeline can optionally be extended to include clustering (e.g., Hierarchical Density-Based Spatial Clustering of Applications with Noise, HDBSCAN), feature importance analysis (e.g., using random forest), and causal inference (e.g., using Bayesian networks). These downstream steps are not required for basic embedding, but they provide researchers with mechanistic insight. Overall, RefLaTEA bridges the gap between observational and experimental data, offering a robust and flexible framework for exploratory analysis and hypothesis generation.

• Provides a reference-based UMAP embedding framework that contextualizes small experimental datasets within large-scale observational data
• Enables robust interpretation of subtle responses by contextualizing datasets of interest relative to background data
• Offers an extensible analysis pipeline integrating transfer embedding with clustering, feature extraction, and causal inference
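To make the "within or outside the range of natural variation" idea concrete, the sketch below scores projected points against a reference embedding. It assumes the 2-D coordinates already exist, for example from fitting umap-learn's `UMAP` on the background data and calling `transform` on the experimental data. The k-nearest-neighbour distance criterion, the 95% quantile cutoff, and the names `knn_distance` and `flag_outside_reference` are illustrative assumptions, not the scoring rule used by RefLaTEA.

```python
import math


def knn_distance(point, reference, k, skip_self=False):
    """Mean Euclidean distance from `point` to its k nearest reference points."""
    dists = sorted(math.dist(point, r) for r in reference)
    if skip_self:
        # `point` is itself a reference point: drop its zero self-distance.
        dists = dists[1:]
    return sum(dists[:k]) / k


def flag_outside_reference(reference, projected, k=5, quantile=0.95):
    """Flag projected points that sit farther from the reference lattice
    than reference points typically sit from one another.

    Hypothetical criterion: a point is "outside" if its mean k-NN distance
    exceeds the chosen quantile of the reference-to-reference k-NN distances.
    """
    baseline = sorted(
        knn_distance(r, reference, k, skip_self=True) for r in reference
    )
    idx = min(len(baseline) - 1, int(quantile * len(baseline)))
    threshold = baseline[idx]
    flags = [knn_distance(p, reference, k) > threshold for p in projected]
    return flags, threshold


# Toy stand-in for a reference embedding: a regular 10 x 10 grid of points.
reference_xy = [(float(i), float(j)) for i in range(10) for j in range(10)]

# Two toy "experimental" points: one inside the grid, one far away from it.
projected_xy = [(4.5, 4.5), (30.0, 30.0)]

flags, threshold = flag_outside_reference(reference_xy, projected_xy)
for xy, outside in zip(projected_xy, flags):
    print(xy, "outside reference range" if outside else "within reference range")
```

A distance-based score of this kind depends on the embedding's local scale, so it is best read as a screening aid for exploratory analysis rather than a formal test.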
Shima et al. (Sun,) studied this question.