February 19, 2024 · Open Access

Parallel GEMM-based convolution for deep learning on multicore RISC-V processors

Abstract

We address the efficient implementation of the convolution operator on the GAP8 parallel ultra-low power platform (PULP), a heterogeneous multi-core processor equipped with a fabric controller (FC), a cluster of eight compute cores, and a four-level memory hierarchy with scratchpads instead of conventional, hardware-assisted cache memories. Our solution for this platform transforms the convolution into a general matrix–matrix multiplication (gemm) via the lowering approach, demonstrating that it is possible to attain reasonable performance on the GAP8 by carefully adapting techniques such as tiling and loop parallelism, which are mainstream in the multi-threaded, cache-aware realization of gemm.
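
The lowering approach named in the abstract can be illustrated with a minimal sketch. It is not the authors' GAP8 implementation: it assumes stride 1 and no padding, uses plain Python lists rather than scratchpad buffers, and runs sequentially, so the loop parallelism across the cluster cores is not modelled. The function names (`im2col`, `tiled_gemm`, `conv2d_gemm`, `conv2d_direct`) and the `tile` parameter are hypothetical, chosen only for this illustration.

```python
def im2col(x, kh, kw):
    """Lower a C x H x W input (nested lists) into a (C*kh*kw) x (Ho*Wo) matrix.

    Assumes stride 1 and no padding. Each row holds, for one (channel, i, j)
    filter position, the input values seen at every output location.
    """
    c, h, w = len(x), len(x[0]), len(x[0][0])
    ho, wo = h - kh + 1, w - kw + 1
    cols = []
    for ci in range(c):
        for i in range(kh):
            for j in range(kw):
                cols.append([x[ci][oy + i][ox + j]
                             for oy in range(ho) for ox in range(wo)])
    return cols, ho, wo


def tiled_gemm(a, b, tile=4):
    """Compute a @ b by iterating over tile x tile blocks of the operands.

    The three outer loops walk over blocks; the three inner loops do the
    multiply-accumulate within a block. On a real target, the block sizes
    would be chosen to fit a given level of the memory hierarchy.
    """
    m, k, n = len(a), len(b), len(b[0])
    out = [[0] * n for _ in range(m)]
    for i0 in range(0, m, tile):
        for p0 in range(0, k, tile):
            for j0 in range(0, n, tile):
                for i in range(i0, min(i0 + tile, m)):
                    for p in range(p0, min(p0 + tile, k)):
                        a_ip = a[i][p]
                        for j in range(j0, min(j0 + tile, n)):
                            out[i][j] += a_ip * b[p][j]
    return out


def conv2d_gemm(x, w, tile=4):
    """Convolution (deep learning convention, i.e. cross-correlation) via gemm.

    x: C x H x W input, w: K x C x kh x kw filters. Returns K x Ho x Wo.
    """
    kh, kw = len(w[0][0]), len(w[0][0][0])
    cols, ho, wo = im2col(x, kh, kw)
    # Flatten each filter in the same (channel, i, j) order used by im2col.
    wmat = [[v for ch in f for row in ch for v in row] for f in w]
    y = tiled_gemm(wmat, cols, tile)
    # Reshape each flat output row of length Ho*Wo back into Ho rows of Wo.
    return [[row[oy * wo:(oy + 1) * wo] for oy in range(ho)] for row in y]


def conv2d_direct(x, w):
    """Reference: the same convolution computed directly, without lowering."""
    c, h, wd = len(x), len(x[0]), len(x[0][0])
    kh, kw = len(w[0][0]), len(w[0][0][0])
    ho, wo = h - kh + 1, wd - kw + 1
    return [[[sum(f[ci][i][j] * x[ci][oy + i][ox + j]
                  for ci in range(c) for i in range(kh) for j in range(kw))
              for ox in range(wo)] for oy in range(ho)] for f in w]
```

Comparing `conv2d_gemm(x, w)` against `conv2d_direct(x, w)` on small integer inputs is a quick way to check that the lowered formulation computes the same result; the tile size changes only the order of the accumulations, not the values.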

Cite this study

Ramírez et al. (2024) studied this question.

www.synapsesocial.com/papers/68e78968b6db6435876fbe3c — DOI: https://doi.org/10.1007/s11227-024-05927-y

Also consider

Synapse has enriched 5 closely related papers on similar topics. Consider them for comparative context:

Performance Analysis of Matrix Multiplication for Deep Learning on the Edge · 2022 · 6 citations
A BLIS-like matrix multiplication for machine learning in the RISC-V ISA-based GAP8 processor · 2022 · 12 citations
Mr.Wolf: An Energy-Precision Scalable Parallel Ultra Low Power SoC for IoT Edge Processing · 2019 · 156 citations
Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective · 2018 · 586 citations
A Family of High-Performance Matrix Multiplication Algorithms · 2006 · 55 citations

Authors

Cristián Ramírez

Adrián Castelló

Héctor Martínez

Journals

The Journal of Supercomputing

Institutions

Universitat Politècnica de València

University of Córdoba
