As a cornerstone of knowledge discovery in metabolomics, multivariate analysis enables the evaluation of relationships between variables, such as measured signals, and observed objects or samples, thereby helping to decipher and better understand the processes under study. Matrix factorization methods are extensively employed to uncover trends and relevant groupings of observations, and also to highlight potentially related variables based on their contributions to model components. However, the inherently high dimensionality of metabolomic datasets raises questions about the reliability of the coefficients obtained, and efficient ways of assessing that reliability are needed. A novel method is proposed to assess the stability of Variable Importance in Projection (VIP) from Partial Least Squares (PLS) regression models, a criterion widely used in metabolomics to highlight relevant subsets of variables. It combines bootstrap resampling and permutations to offer an effective and versatile tool based on a stability index and a diagnostic plot. The proposed strategy uses the full set of variable importance values collected across bootstrap replicates to construct empirical distributions, both authentic and permuted, thereby enhancing the robustness of the assessment. Results from a synthetic dataset and representative real case studies illustrate its potential for evaluating the reliability of meaningful variables and for removing uninformative signals in metabolomic data obtained from different experimental configurations. A benchmark comparison with established approaches highlights its merits, emphasizing its ability to provide more stable subsets of informative variables and thus to improve the interpretability of metabolomics studies.
Because it is computationally efficient and does not require assumptions about data distribution, the proposed method constitutes a straightforward, generic and relevant approach that is well suited to the needs of a wide range of applications. The broad adoption of this type of methodology will undoubtedly help to achieve more consistent and reproducible results, ultimately advancing the understanding of metabolic patterns.

• The reliability of Variable Importance in Projection must be evaluated.
• A novel method combining bootstrap resampling and permutations is proposed.
• A stability index and a diagnostic plot are used for the assessment.
• A comparison benchmark with established approaches highlights its advantages.
• Results demonstrate that the method constitutes an effective and versatile tool.
Boccard et al. studied this question.
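
The general idea of combining bootstrap resampling with permutations to judge VIP stability can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not define the stability index, so the index used here (the fraction of bootstrap replicates in which a variable's authentic VIP exceeds the 95th percentile of its permuted VIP distribution) is a hypothetical stand-in. The function names `pls1_vip` and `vip_stability`, the single-response NIPALS fit, the mean-centring without scaling, and all parameter defaults are likewise assumptions made for illustration.

```python
import numpy as np

def pls1_vip(X, y, n_components=2):
    """VIP scores from a single-response PLS model fitted with NIPALS.

    Data are mean-centred only (an assumption; autoscaling is also common).
    """
    X = X - X.mean(axis=0)
    y = y - y.mean()
    p = X.shape[1]
    W = np.zeros((p, n_components))   # unit-norm weight vectors
    ssy = np.zeros(n_components)      # y sum of squares explained per component
    for a in range(n_components):
        w = X.T @ y
        norm = np.linalg.norm(w)
        if norm < 1e-12:              # nothing left to model
            break
        w = w / norm
        t = X @ w                     # scores
        tt = t @ t
        q = (y @ t) / tt              # y loading
        X = X - np.outer(t, X.T @ t / tt)  # deflate X
        y = y - q * t                      # deflate y
        W[:, a] = w
        ssy[a] = q ** 2 * tt
    total = ssy.sum()
    if total == 0:
        return np.zeros(p)
    return np.sqrt(p * (W ** 2 @ ssy) / total)

def vip_stability(X, y, n_boot=100, n_components=2, alpha=0.05, seed=0):
    """Hypothetical stability index built from authentic and permuted VIPs."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    vip_true = np.empty((n_boot, p))
    vip_perm = np.empty((n_boot, p))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)          # bootstrap resample
        Xb, yb = X[idx], y[idx]
        vip_true[b] = pls1_vip(Xb, yb, n_components)
        # Permuting y breaks the X-y link and gives a null VIP distribution.
        vip_perm[b] = pls1_vip(Xb, rng.permutation(yb), n_components)
    threshold = np.quantile(vip_perm, 1 - alpha, axis=0)
    stability = (vip_true > threshold).mean(axis=0)
    return stability, vip_true, vip_perm

# Synthetic example: only the first 3 of 40 variables carry signal.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 40))
y = X[:, :3].sum(axis=1) + 0.1 * rng.normal(size=200)
stability, vip_true, vip_perm = vip_stability(X, y)
print("mean stability, informative variables:", stability[:3].mean())
print("mean stability, noise variables:", stability[3:].mean())
```

The two returned arrays, `vip_true` and `vip_perm`, hold the per-replicate VIP values, so they could be plotted side by side for each variable as one possible form of diagnostic plot. The sketch makes no distributional assumption because the threshold is taken from empirical quantiles.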