What question did this study set out to answer?

This research aims to compare BiLSTM and Xinanjiang models to determine optimal training data thresholds for flood event prediction.

March 2, 2026 · Open Access

Event-based training data thresholds for BiLSTM versus Xinanjiang models: Insights from the applications of 19 Chinese catchments

Key Points

Evaluated BiLSTM against the Xinanjiang model in 19 catchments across southern China.
Varied the training data from 30% to 80% of the available flood events.
Assessed model performance using Nash-Sutcliffe Efficiency (NSE) and Kling-Gupta Efficiency (KGE) metrics.
A critical training data threshold of 70–80% of flood events was found necessary for BiLSTM to perform competitively.
NSE and KGE metrics showed different performance convergence thresholds (50% vs 70% training data).
Small catchments favored BiLSTM, while larger catchments showed higher efficiency for the Xinanjiang model.
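
The two metrics named in the key points can be written out as follows. This is a minimal sketch using the standard NSE definition and the original three-component KGE formulation (correlation, variability ratio, bias ratio); the summary does not say which KGE variant the authors used, so that choice is an assumption, as are the function names `nse` and `kge`.

```python
import math


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    squared_error = sum((o - s) ** 2 for o, s in zip(obs, sim))
    obs_variance_sum = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_error / obs_variance_sum


def kge(obs, sim):
    """Kling-Gupta Efficiency (original formulation, assumed here)."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_sim = sum(sim) / n
    std_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / n)
    std_sim = math.sqrt(sum((s - mean_sim) ** 2 for s in sim) / n)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim)) / n
    r = cov / (std_obs * std_sim)   # linear correlation
    alpha = std_sim / std_obs       # variability ratio
    beta = mean_sim / mean_obs      # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
```

NSE is driven by squared errors, so it is dominated by peak flows, while KGE weighs correlation, variability, and bias separately. That difference is one plausible reason the two scores could converge at different training fractions.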

Abstract

This study is conducted in 19 diverse catchments (43–7907 km²) across the humid mountainous regions of southern China. We develop a rigorous comparative framework to evaluate a data-driven Bidirectional Long Short-Term Memory (BiLSTM) model against a traditional conceptual Xinanjiang model within a single-catchment context. By systematically varying the training data from 30% to 80% of available flood events, this study aims to quantify the data-performance relationship, identify the critical training data threshold at which deep learning becomes competitive, and uncover the physical catchment characteristics that control model suitability. A critical data threshold of 70–80% of available flood events (approximately 21–24 events) is identified, below which the conceptual model is superior and above which the BiLSTM achieves competitive performance. This threshold is fundamentally controlled by catchment scale, with small-scale catchments favouring the BiLSTM and large-scale catchments maintaining an advantage with the conceptual model, a pattern reflecting how monsoon-driven flood processes manifest differently across the region's physiographic gradient. Furthermore, the Nash-Sutcliffe Efficiency (NSE) and Kling-Gupta Efficiency (KGE) metrics exhibit different convergence patterns with increasing data availability, with implications for comprehensive model evaluation in data-limited contexts. These findings culminate in an actionable, scale-informed framework for model selection that can guide provincial hydrological bureaus in transitioning from traditional to deep learning approaches.

• BiLSTM needs 70–80% training data (21–24 floods) to match conceptual models.
• NSE and KGE metrics show different convergence thresholds (50% vs 70% MDR).
• Catchment scale dictates model error, while topography governs model efficiency.
• Provided actionable model selection rules based on catchment scale (PC1 value).
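
The event-based sweep from 30% to 80% of flood events could be set up roughly as below. This is a hedged sketch, not the authors' procedure: the chronological ordering, the fixed final 20% held out for testing, and the name `training_sweep` are all assumptions. The 30-event catchment is a hypothetical example inferred from the abstract's figures, since 70–80% corresponding to about 21–24 events implies roughly 30 events.

```python
def training_sweep(event_ids, train_pcts=(30, 40, 50, 60, 70, 80), test_pct=20):
    """Build one train/test split per training percentage.

    Assumes events are ordered in time, that the last `test_pct` percent
    is always held out, and that training uses the first `pct` percent.
    """
    n = len(event_ids)
    n_test = (n * test_pct) // 100
    test = event_ids[n - n_test:]
    splits = {}
    for pct in train_pcts:
        n_train = (n * pct) // 100
        if n_train + n_test > n:
            raise ValueError(f"{pct}% training overlaps the held-out test events")
        splits[pct] = (event_ids[:n_train], test)
    return splits


# Hypothetical catchment with 30 flood events, numbered in time order.
events = list(range(1, 31))
splits = training_sweep(events)
for pct, (train, test) in splits.items():
    print(f"{pct}% -> {len(train)} training events, {len(test)} test events")
```

Keeping the test events fixed means that scores at different training fractions are computed on the same floods, so any change in NSE or KGE reflects the training data alone.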

Cite This Study

Li et al. studied this question.

synapsesocial.com/papers/69a52e15f1e85e5c73bf16fb https://doi.org/10.1016/j.ejrh.2026.103299
