Abstract

Deep learning models that combine convolutional neural networks (CNNs) with long short-term memory (LSTM) networks extract spatiotemporal features well and have proven effective for applications such as ocean environment monitoring and forecasting. Marine equipment with constrained computational resources and energy budgets often requires specialized artificial intelligence (AI) processors to handle these workloads. However, the distinct computational and memory access patterns of CNNs and LSTMs make efficient edge AI processors difficult to design: existing hardware accelerators often struggle to support the heterogeneous computational patterns, irregular dataflow, and dynamic precision requirements of such hybrid models. To address these challenges, this paper proposes a dynamically reconfigurable field-programmable gate array (FPGA)-based accelerator tailored for parallel CNN-LSTM computation. The proposed architecture integrates a mixed-precision computation array, multilevel reconfigurable processing elements, and a triple-mode dataflow controller that supports weight-stationary, output-stationary, and row-stationary dataflows, enabling adaptive resource allocation and enhanced data reuse under diverse computation patterns. The accelerator is designed to efficiently execute both standalone CNN or LSTM workloads and hybrid CNN-LSTM workloads. Experimental evaluation on a representative ConvLSTM-based sea surface temperature prediction task demonstrates that the proposed design achieves high throughput and energy efficiency in both the convolutional and recurrent computation phases.
Wang et al. (Fri,) studied this question.