What question did this study set out to answer?

This research aims to create a unified acceleration pipeline for deploying AI algorithms on FPGA and CGRA architectures, addressing the lack of standardization.

January 21, 2026 | Open Access

A Unified FPGA/CGRA Acceleration Pipeline for Time-Critical Edge AI: Case Study on Autoencoder-Based Anomaly Detection in Smart Grids

Key Points

Developed a fully open-source, hardware-aware AI acceleration pipeline targeting both FPGA and CGRA architectures.
Built the pipeline around the Brevitas quantization framework for model optimization.
Supported two backend flows: FINN for high-performance dataflow accelerators and CGRA4ML for low-power coarse-grained reconfigurable designs.
Introduced a model translation layer from QONNX to QKeras.
Tested the pipeline with an autoencoder model for wind-turbine anomaly detection on a realistic cyber-physical testbed.
Achieved up to 10× and 37× faster inference per flow on AMD's ZCU104 compared to a Raspberry Pi baseline.
Demonstrated over 11× higher energy efficiency.
Maintained acceptable reconstruction accuracy during testing.

Abstract

The ever-increasing need for energy-efficient implementations of AI algorithms has driven the research community to develop many hardware architectures and frameworks for AI. Much of this work targets FPGAs, while more sophisticated architectures such as CGRAs have also received considerable attention. However, these AI ecosystems remain isolated and fragmented, with no standardized way to compare different frameworks through detailed Power–Performance–Area (PPA) analysis. This paper bridges the gap by presenting a unified, fully open-source, hardware-aware AI acceleration pipeline that enables seamless deployment of neural networks on both FPGA and CGRA architectures. Built around the Brevitas quantization framework, it supports two distinct backend flows: FINN for high-performance dataflow accelerators and CGRA4ML for low-power coarse-grained reconfigurable designs. To facilitate this, a model translation layer from QONNX to QKeras is also introduced. To demonstrate the pipeline's effectiveness, we use an autoencoder model for anomaly detection in wind turbines. We deploy our accelerated models on AMD's ZCU104 and benchmark them against a Raspberry Pi. Evaluation on a realistic cyber–physical testbed shows that the hardware-accelerated solutions achieve substantial performance and energy-efficiency gains (up to 10× and 37× faster inference per flow and over 11× higher efficiency) while maintaining acceptable reconstruction accuracy.
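As a rough illustration of how autoencoder-based anomaly detection is commonly framed, the sketch below flags a sample when its reconstruction error exceeds a threshold fitted on normal data. This is a generic technique, not the authors' implementation: the function names, the mean-squared-error metric, and the mean-plus-k-standard-deviations threshold rule are all assumptions made for illustration.

```python
from statistics import mean, stdev


def reconstruction_error(x, x_hat):
    """Mean squared error between an input window and its reconstruction."""
    if len(x) != len(x_hat):
        raise ValueError("input and reconstruction must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def fit_threshold(normal_errors, k=3.0):
    """Threshold = mean + k * sample std of errors on normal data.

    The rule and the default k are assumed heuristics, not values from the paper.
    """
    return mean(normal_errors) + k * stdev(normal_errors)


def is_anomaly(x, x_hat, threshold):
    """Flag the sample when the autoencoder reconstructs it poorly."""
    return reconstruction_error(x, x_hat) > threshold
```

In such a setup, `x_hat` would come from the deployed (for example, quantized and hardware-accelerated) autoencoder, so any loss of reconstruction accuracy from quantization shifts the error distribution and, with it, the threshold.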

Authors

Eleftherios Mylonas

Chrisanthi Filippou

Sotirios Kontraros

Journals

Electronics

Institutions

University of Patras
