Accurately predicting future events under novel environmental conditions is a central challenge in modeling, especially when no validation data are available. While model transferability is often discussed through the concept of a "forecast horizon," we expand this framework by introducing the concept of "validity domains." These consider not only the extrapolation distance from the calibration data but also the absolute position of calibration and application conditions along an environmental gradient. Using phenological observations of Japanese Yoshino cherry (Prunus × yedoensis) across a climate gradient in Japan, we calibrated process-based and machine learning models for each of 48 locations and validated them with data from all other locations. Interpolating model performance metrics yielded a continuous synthetic surface of predictive accuracy across the full observed temperature range, from which we delineated model-specific validity domains and assessed how transferability depends on both model type and calibration environment. Our findings show that process-based models retain broader validity when calibrated in colder environments but degrade in warmer settings. In contrast, machine learning models exhibit narrower but more consistent validity across the gradient. These systematic differences reveal that the location of calibration and the structure of the model fundamentally shape its reliability under new conditions. By identifying where prediction errors remain below a context-specific validity threshold, our approach provides a framework for assessing model applicability under shifting climate conditions. Mapping validity domains offers practical guidance for model selection and makes it possible to quantify how far models can be pushed before their predictions become unreliable.
Bauer et al. (Sun,) studied this question.
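The idea of delineating a validity domain from interpolated errors can be illustrated with a minimal sketch. It assumes a single model calibrated at one temperature, a set of application sites with known prediction errors, piecewise-linear interpolation along the temperature gradient, and a fixed error threshold. The function names (`interp`, `validity_domain`), the threshold of 7 days, and the error values are hypothetical toy choices for illustration, not the interpolation method, threshold, or data used in the study.

```python
from bisect import bisect_left


def interp(x, xs, ys):
    """Piecewise-linear interpolation of ys over strictly increasing xs,
    clamped to the end values outside the observed range."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    k = bisect_left(xs, x)  # xs[k-1] < x <= xs[k]
    x0, x1 = xs[k - 1], xs[k]
    y0, y1 = ys[k - 1], ys[k]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def validity_domain(cal_temp, app_temps, errors, threshold, step=0.1):
    """Return (low, high): the contiguous temperature interval around the
    calibration temperature in which the interpolated error stays at or
    below the threshold. Returns None if the error already exceeds the
    threshold at the calibration temperature.

    Assumption: the domain is scanned outward on a regular grid of width
    `step`, so its bounds are only resolved to that precision.
    """
    pairs = sorted(zip(app_temps, errors))
    xs = [t for t, _ in pairs]
    ys = [e for _, e in pairs]

    if interp(cal_temp, xs, ys) > threshold:
        return None

    # Scan toward colder conditions.
    n_low = 0
    while (cal_temp - (n_low + 1) * step >= xs[0]
           and interp(cal_temp - (n_low + 1) * step, xs, ys) <= threshold):
        n_low += 1

    # Scan toward warmer conditions.
    n_high = 0
    while (cal_temp + (n_high + 1) * step <= xs[-1]
           and interp(cal_temp + (n_high + 1) * step, xs, ys) <= threshold):
        n_high += 1

    return (cal_temp - n_low * step, cal_temp + n_high * step)


if __name__ == "__main__":
    # Toy values: mean annual temperature (deg C) of application sites and
    # the prediction error (days) of a model calibrated at 12 deg C.
    app_temps = [8, 10, 12, 14, 16]
    errors = [9, 5, 3, 4, 10]
    print(validity_domain(12.0, app_temps, errors, threshold=7.0, step=0.5))
```

Repeating this calculation for every calibration site would give one interval per calibration temperature. Taken together, those intervals show how the width of the validity domain changes with the position of the calibration site along the gradient, and not only with the extrapolation distance.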