Energy-Efficient FPGA Accelerator Architecture for Real-Time Convolutional Neural Network Inference

Authors

  • Madhanraj, Junior Researcher, Advanced Scientific Research, Salem

Keywords:

FPGA, CNN accelerator, energy efficiency, real-time inference, hardware acceleration, edge AI, quantization, systolic array

Abstract

Real-time Convolutional Neural Network (CNN) inference has become a critical requirement in edge computing applications such as autonomous navigation, medical diagnosis, industrial automation, and intelligent surveillance systems. However, deploying deep CNN models on resource-constrained platforms is challenging due to high computational complexity, large memory bandwidth demands, and tight power budgets. Field-Programmable Gate Arrays (FPGAs) offer an attractive combination of performance, flexibility, and energy efficiency, yet existing FPGA-based accelerators remain limited by poor data reuse, fixed-precision computation, and suboptimal memory hierarchies. This paper introduces an energy-efficient FPGA accelerator architecture designed for real-time CNN inference. The proposed design combines a tiled dataflow with adaptive tiling, quantization-aware computation, and dynamic mixed-precision scaling to minimize switching activity in the digital signal processing (DSP) blocks. A hierarchical on-chip memory subsystem with double buffering reduces expensive off-chip DRAM accesses, while a power-aware scheduler dynamically adjusts the degree of parallelism and clock frequency to match workload intensity. The architecture supports INT4, INT8, and FP16 operating modes, allowing the accuracy-energy trade-off to be tuned. Experimental evaluation on standard CNN models shows substantial improvements in throughput per watt and reductions in latency over conventional fixed-precision FPGA accelerators, with only minor accuracy loss.
These findings confirm that coordinated optimization of computation, memory movement, and precision adaptation is essential for scalable, high-performance, and energy-efficient CNN inference in future edge AI systems.
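Two of the abstract's core ideas, tile-at-a-time computation with on-chip buffering and quantized integer multiply-accumulate, can be sketched in software. The following is an illustrative model only, not the paper's hardware design; the function names and the symmetric-quantization scheme are assumptions, and double buffering is represented by the tile-streaming access pattern rather than real concurrency:

```python
import numpy as np

def quantize_int8(x, scale):
    """Symmetric INT8 quantization: real value ~= scale * int8_code.
    (Illustrative; the paper's actual quantization scheme is not specified here.)"""
    return np.clip(np.round(x / scale), -128, 127).astype(np.int8)

def tiled_matmul_int8(a, b, tile=4):
    """Tiled INT8 matrix multiply with INT32 accumulation.

    Each iteration loads one tile of `a` and `b` (standing in for a
    double-buffered on-chip BRAM fill) and accumulates partial products,
    so only a tile-sized working set is 'on chip' at a time.
    """
    m, k = a.shape
    k2, n = b.shape
    assert k == k2, "inner dimensions must match"
    acc = np.zeros((m, n), dtype=np.int32)       # wide accumulator, as in DSP blocks
    for k0 in range(0, k, tile):                 # stream one tile per step
        a_tile = a[:, k0:k0 + tile].astype(np.int32)
        b_tile = b[k0:k0 + tile, :].astype(np.int32)
        acc += a_tile @ b_tile                   # MAC over the buffered tile
    return acc
```

Dequantizing the INT32 accumulator back to real values is then a single multiply by the product of the input scales, which is why integer-only datapaths can defer all floating-point work to the layer boundary.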

Published

2026-05-14

Section

Articles

How to Cite

Madhanraj. (2026). Energy-Efficient FPGA Accelerator Architecture for Real-Time Convolutional Neural Network Inference. Journal of Integrated VLSI, Embedded and Computing Technologies, 3(3), 15-21. https://ecejournals.in/index.php/JIVCT/article/view/532