Fault-Tolerant Reconfigurable Computing Systems for High Performance Applications
DOI: https://doi.org/10.31838/RCC/02.01.04
Keywords: Fault Tolerance; High Performance; Reconfigurable Computing; Reliability; System Resilience; Performance Optimization
Abstract
Reconfigurable hardware architectures are transforming high performance
computing. As computational demands grow explosively across domains ranging
from financial modeling to bioinformatics, the need for robust and adaptable
computing systems has never been greater. This paper discusses the design of
fault-tolerant reconfigurable computing systems, their applications, the
challenges they face, and emerging solutions in high performance computing.
Reconfigurable computing systems based on Field Programmable Gate Arrays
(FPGAs) offer a combination of flexibility and performance that is difficult
to match with traditional computing systems. These
systems can be optimized for the task at hand by changing their hardware
configuration dynamically, yielding gains in performance and energy
efficiency. As these systems are deployed in mission-critical applications,
reliability and fault tolerance must be guaranteed. In this article, we study
the core concepts of reconfigurable systems, their fault-tolerant designs,
the architectures developed to overcome reliability problems, and the
real-world applications in which these systems have a strong influence. The
article provides a detailed perspective on FPGA-based fault detection and
masking techniques, as well as the implications of fault-tolerant
reconfigurable computing for the high performance computing landscape, from
its current state to future directions.