What question did this study set out to answer?

The aim is to enhance energy-efficient inference for deep neural networks by adjusting sparsity thresholds.

February 26, 2026 | Open Access

S-TRAC: An Algorithm–Hardware Co-design of Sparsity-aware Threshold Adjustment for Accelerator-based RISC-V ISA Extensions

Key Points

Developed a new accelerator called S-TRAC.
Utilized a static sparse-dense storage format and a dynamic bit-processing scheme.
Introduced a column-wise processing-element array with LUT-based shift-accumulate multiplication.
Designed a RISC-V extension for end-to-end accelerator execution.
Achieved an average increase in effective sparsity by a factor of 8.4.
Improved energy efficiency by 11.16 times over state-of-the-art solutions.
Enhanced hardware efficiency by 37.03 times when compared to state-of-the-art designs.

Abstract

Deep neural networks (DNNs) have become foundational to modern applications, yet their substantial computational and memory demands pose major obstacles to energy-efficient inference. Moreover, the rapidly expanding parameter footprint and structural diversity further amplify data movement, leading to substantial energy consumption and latency overheads. To address these issues, we propose a novel accelerator, S-TRAC, that dynamically adjusts sparsity thresholds through algorithm-hardware co-design to enable efficient DNN inference. At the algorithm level, we employ a static sparse-dense storage format and a dynamic bit-processing scheme to skip non-contributing bits without sacrificing weight precision. At the hardware level, we introduce a column-wise processing-element array with LUT-based shift-accumulate multiplication and a global partial-sum accumulator to sustain energy-efficient execution. To support this co-design, we design a RISC-V extension that coordinates the read, arrangement, multiplication, accumulation, and write stages for end-to-end accelerator execution. Experimental results show that S-TRAC increases effective sparsity by an average factor of 8.4× across the evaluated DNN models, enabling substantial memory savings. The S-TRAC design achieves 11.16× energy efficiency and 37.03× hardware efficiency improvements over state-of-the-art solutions.
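The idea of skipping non-contributing bits during shift-accumulate multiplication can be illustrated with a minimal software sketch. This is not the S-TRAC datapath: it assumes 8-bit sign-magnitude integer weights, replaces the LUT with plain shifts, and uses hypothetical names (`nonzero_bit_positions`, `shift_accumulate_dot`, `bit_sparsity`). It only shows why bit-level sparsity can exceed value-level sparsity while the result stays exact.

```python
def nonzero_bit_positions(weight: int, bits: int = 8) -> list[int]:
    """Return the positions of the set bits in the magnitude of a weight."""
    magnitude = abs(weight)
    return [i for i in range(bits) if (magnitude >> i) & 1]


def shift_accumulate_dot(weights: list[int], activations: list[int],
                         bits: int = 8) -> tuple[int, int]:
    """Dot product built from shifts and adds only, skipping zero weight bits.

    Returns (result, number of shift-add operations actually performed).
    Assumption: weights are handled in sign-magnitude form.
    """
    acc = 0
    ops = 0
    for w, a in zip(weights, activations):
        sign = -1 if w < 0 else 1
        for pos in nonzero_bit_positions(w, bits):
            acc += sign * (a << pos)  # one shift-add per contributing bit
            ops += 1
    return acc, ops


def bit_sparsity(weights: list[int], bits: int = 8) -> float:
    """Fraction of weight bits that are zero and can therefore be skipped."""
    total = len(weights) * bits
    nonzero = sum(len(nonzero_bit_positions(w, bits)) for w in weights)
    return 1 - nonzero / total


weights = [3, 0, -4, 16]
activations = [5, 7, 2, -1]
result, ops = shift_accumulate_dot(weights, activations)
print(result, ops, bit_sparsity(weights))
```

In this toy input only one of four weights is zero (25% value-level sparsity), yet 28 of 32 weight bits are zero, so the dot product needs 4 shift-adds instead of 32. How S-TRAC defines and measures effective sparsity is specified in the full paper, not here.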
