March 3, 2026
Open Access

Should you use data integration for your distribution model?

Key Points

Data integration can enhance species distribution modelling outcomes, particularly when data quality is high.
Simulations reveal that model performance varies based on data volume and dataset compatibility in joint likelihood models.
Analysts should weigh practical costs against potential benefits before adopting data integration approaches.
Concordance tests do not reliably predict when joint modelling will outperform simpler methods.

Abstract

Data integration, the analysis of two or more observational datasets in a single statistical model, is on the rise in species distribution modelling. Recent papers showcase the usefulness of data integration, but few highlight cases where data integration produces equal or worse outcomes compared to single-dataset modelling. Here, we offer a decision-making framework to assess whether data integration may provide improvements over simpler modelling approaches. We focus on joint likelihood data integration, in which two or more datasets are linked to a single shared process model. We highlight three considerations for analysts deciding whether to use data integration: (1) the practical costs associated with developing and validating an integrated model; (2) the marginal benefits to model performance, which vary depending on data volume and coverage; and (3) the concordance (or compatibility) of the two datasets. Using a simulation study, we illustrate modelling outcomes under a variety of conditions of data volume and bias, showing consistent patterns across three distinct formulations of joint likelihood models. We explore a priori and a posteriori tests of data concordance, but we find that such tests fail to usefully differentiate between cases where joint modelling produces better or worse outcomes. Ultimately, we outline a decision-making workflow and illustrate its application to the joint modelling of real data.
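To make the phrase "two or more datasets linked to a single shared process model" concrete, here is a minimal sketch of a joint log-likelihood. It is an illustration only, not one of the three formulations examined in the paper: the log-linear intensity, the Poisson count model, the detection model `1 - exp(-scale * lambda)`, and all function and parameter names (`intensity`, `beta0`, `beta1`, `scale`) are assumptions chosen for simplicity.

```python
import math

def intensity(beta0, beta1, x):
    """Shared process model (assumed): expected abundance at covariate value x."""
    return math.exp(beta0 + beta1 * x)

def loglik_counts(beta0, beta1, counts):
    """Poisson log-likelihood for a count dataset given as (x, count) pairs."""
    total = 0.0
    for x, y in counts:
        lam = intensity(beta0, beta1, x)
        total += y * math.log(lam) - lam - math.lgamma(y + 1)
    return total

def loglik_detections(beta0, beta1, scale, detections):
    """Bernoulli log-likelihood for detection/non-detection data, (x, 0 or 1) pairs.

    Assumed observation model: P(detect) = 1 - exp(-scale * lambda), where
    `scale` is a dataset-specific effort parameter.
    """
    total = 0.0
    for x, z in detections:
        p = 1.0 - math.exp(-scale * intensity(beta0, beta1, x))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += math.log(p) if z else math.log(1.0 - p)
    return total

def joint_loglik(beta0, beta1, scale, counts, detections):
    """Joint log-likelihood: both datasets share beta0 and beta1, so their
    log-likelihoods are summed (assuming the datasets are independent given
    the shared process)."""
    return (loglik_counts(beta0, beta1, counts)
            + loglik_detections(beta0, beta1, scale, detections))

# Hypothetical toy data: (covariate value, observation)
counts = [(0.0, 1), (1.0, 3), (-1.0, 0)]
detections = [(0.5, 1), (-0.5, 0)]
print(joint_loglik(0.0, 0.5, 1.0, counts, detections))
```

Because `beta0` and `beta1` appear in both terms, maximising the summed log-likelihood lets each dataset inform the shared process. The same sharing is how a biased or incompatible dataset can pull the estimates away from what a single-dataset model would give.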

Cite this study

Goldstein et al. studied this question.

www.synapsesocial.com/papers/69a75bb7c6e9836116a23911 — DOI: https://doi.org/10.1111/1365-2656.70210

Authors

Benjamin R. Goldstein

Jeffrey Doser

Brent S. Pease

Journals

Journal of Animal Ecology

Institutions

North Carolina State University

Southern Illinois University Carbondale

North Carolina Museum of Natural Sciences
