• MRNN integrates Muskingum routing into an RNN with a hard mass-conservation constraint.
• MRNN's water storage loss is 500–3500 times smaller than that of conventional networks.
• MRNN is robust, with an NSE drop of 2.81% when the training set is reduced from 80% to 20%.

Flood routing is crucial for effective flood forecasting and control, and the Muskingum method provides a simple, robust solution with clear physical mechanisms. Recently, data-driven models have been widely applied in this field; however, their lack of a physical foundation raises interpretability concerns and the risk of violating water mass conservation, which ultimately limits their generalizability and transferability to ungauged basins or extreme flood events. This study addresses this gap by developing a Muskingum-Recurrent Neural Network (MRNN) that integrates the Muskingum routing equations into the RNN’s internal structure, with the Muskingum coefficients defining the network’s weights. Unlike soft-constraint approaches, MRNN enforces mass conservation as a hard architectural constraint. We validated MRNN in three complementary stages. The first stage, on artificial channels, demonstrates that MRNN requires far fewer parameters than ANN, RNN, and LSTM, yet converges faster while preserving water mass balance with orders of magnitude less water loss. The second stage, on classical benchmark floods, shows that MRNN achieves lower MSE than most existing Muskingum parameter optimization methods. The third stage, on floods in Village Creek, shows that MRNN achieves a higher pass rate than ANN, RNN, and LSTM, with peak timing errors consistently within one hour. Robustness tests reveal that when the training set is reduced from 80% to 20%, MRNN exhibits substantially smaller degradation in Nash-Sutcliffe Efficiency than these three networks.
The results demonstrate that embedding physical equations into neural network architecture provides a powerful paradigm for developing interpretable, data-efficient, and physically consistent flood routing models.
Li et al. studied this question.
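To make the idea of a routing equation acting as a recurrent cell concrete, the following is a minimal sketch of the classical Muskingum recurrence, O[t] = C0·I[t] + C1·I[t-1] + C2·O[t-1], written as a loop over time steps. It is not the authors' MRNN implementation. The function names, the example hydrograph, and the values of K, x, and dt are hypothetical, and the coefficients are computed from K and x here, whereas the abstract describes them as defining the network's weights. The sketch only illustrates why a recurrence of this form conserves mass by construction: the three coefficients sum to 1, and the storage change matches inflow minus outflow at every step.

```python
import numpy as np


def muskingum_coefficients(K, x, dt):
    """Classical Muskingum coefficients (C0, C1, C2); they sum to 1.

    K  : storage time constant (same time unit as dt)
    x  : dimensionless weighting factor
    dt : routing time step
    """
    denom = 2.0 * K * (1.0 - x) + dt
    c0 = (dt - 2.0 * K * x) / denom
    c1 = (dt + 2.0 * K * x) / denom
    c2 = (2.0 * K * (1.0 - x) - dt) / denom
    return c0, c1, c2


def route_muskingum(inflow, K, x, dt):
    """Route an inflow hydrograph with the Muskingum recurrence.

    The previous outflow plays the role of the recurrent hidden state.
    Assumption: the initial outflow equals the initial inflow.
    """
    inflow = np.asarray(inflow, dtype=float)
    c0, c1, c2 = muskingum_coefficients(K, x, dt)
    outflow = np.empty_like(inflow)
    outflow[0] = inflow[0]
    for t in range(1, len(inflow)):
        outflow[t] = c0 * inflow[t] + c1 * inflow[t - 1] + c2 * outflow[t - 1]
    return outflow


def storage_balance_residual(inflow, outflow, K, x, dt):
    """Largest mismatch between the storage change and the net inflow.

    Uses the Muskingum storage S = K * (x*I + (1-x)*O) and the
    trapezoidal continuity equation over each time step.
    """
    inflow = np.asarray(inflow, dtype=float)
    outflow = np.asarray(outflow, dtype=float)
    storage = K * (x * inflow + (1.0 - x) * outflow)
    net_in = 0.5 * dt * ((inflow[:-1] + inflow[1:]) - (outflow[:-1] + outflow[1:]))
    return float(np.max(np.abs(np.diff(storage) - net_in)))


if __name__ == "__main__":
    # Hypothetical inflow hydrograph (m^3/s) at 6-hour intervals.
    I = [10, 20, 50, 80, 60, 40, 25, 15, 10, 10]
    O = route_muskingum(I, K=12.0, x=0.2, dt=6.0)
    print(np.round(O, 2))
    print(storage_balance_residual(I, O, K=12.0, x=0.2, dt=6.0))
```

Because the constraint lives in the form of the recurrence rather than in a penalty term, the balance residual stays at floating-point round-off for any K and x, which is the sense in which such a constraint is "hard" rather than "soft".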