Low-Light Image Restoration Using Transformer-Based Enhancement Networks

Authors

  • P. Kharabi, College of Applied Science, University of Technology and Applied Sciences, Ibri, Sultanate of Oman
  • T.G. Zengeni, Dept. of Electrical Engineering, University of Zimbabwe, Harare, Zimbabwe

DOI:

https://doi.org/10.17051/NJSIP/01.02.04

Keywords:

Low-Light Image Restoration, Vision Transformer, Image Enhancement, Illumination-Aware Attention, Real-Time Inference, Self-Attention, Deep Learning, Image-to-Image Translation

Abstract

Low-light image restoration is a long-standing challenge in computer vision, with relevance to many real-world imaging scenarios, including nighttime surveillance, autonomous driving, medical diagnostics, and low-light photography. Images captured in dim conditions typically suffer severe degradation in the form of noise, loss of contrast, color distortion, and loss of fine detail, which hinders both human interpretation and downstream automated vision systems. Although existing image enhancement algorithms and convolutional neural network (CNN)-based models partially address this problem, they remain limited by small receptive fields and an inability to model global dependencies, leading to under-enhancement and visible artifacts. To overcome these drawbacks, this paper introduces LightFormer, a Transformer-based low-light enhancement model that uses self-attention to capture both local textures and long-range contextual relationships within a unified framework. LightFormer adopts a dual-branch encoder-decoder design with a Transformer bottleneck and a novel Illumination-Aware Attention Module (IAAM) that adaptively enhances dark regions based on learned illumination distributions. The network is trained with a composite loss that combines pixel-wise reconstruction, perceptual, and illumination terms to ensure both quantitative accuracy and visual plausibility. Extensive experiments on public benchmarks such as LOL and SID show that LightFormer significantly outperforms state-of-the-art methods in peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and perceptual measures such as NIQE and LPIPS. In addition, the model generalizes well across diverse lighting conditions and achieves real-time inference on embedded hardware (NVIDIA Jetson AGX Xavier), making it suitable for deployment in resource-constrained settings. These results establish LightFormer as a robust and scalable solution for low-light enhancement that balances high-quality restoration with the inference-time requirements of modern computer vision systems.
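To make the abstract's description of the IAAM and the composite loss more concrete, the following is a minimal PyTorch sketch of what such a module and loss could look like. All class, function, and weight names here are illustrative assumptions; the abstract does not specify the paper's actual layer configuration or loss weights.

```python
# Hypothetical sketch of an Illumination-Aware Attention Module (IAAM) and the
# three-part training loss described in the abstract. Names, layer sizes, and
# loss weights are assumptions for illustration, not the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class IlluminationAwareAttention(nn.Module):
    """Re-weights feature channels using an attention map derived from an
    estimated illumination map, so poorly lit regions receive stronger enhancement."""

    def __init__(self, channels: int):
        super().__init__()
        # Lightweight illumination estimator: features -> single-channel map in (0, 1)
        self.illum_est = nn.Sequential(
            nn.Conv2d(channels, channels // 2, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 2, 1, 3, padding=1),
            nn.Sigmoid(),
        )
        # Attention weights conditioned on the (inverted) illumination map
        self.attn = nn.Sequential(
            nn.Conv2d(channels + 1, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        illum = self.illum_est(feats)      # estimated illumination, low in dark areas
        darkness = 1.0 - illum             # emphasize poorly lit regions
        weights = self.attn(torch.cat([feats, darkness], dim=1))
        return feats * weights             # adaptively re-weighted features


def composite_loss(pred, target, pred_illum=None, target_illum=None,
                   perceptual_fn=None, w_pix=1.0, w_perc=0.1, w_illum=0.5):
    """Pixel-wise reconstruction plus optional perceptual and illumination terms,
    mirroring the three-part loss mentioned in the abstract (weights assumed)."""
    loss = w_pix * F.l1_loss(pred, target)
    if perceptual_fn is not None:          # e.g. a distance in VGG feature space
        loss = loss + w_perc * perceptual_fn(pred, target)
    if pred_illum is not None and target_illum is not None:
        loss = loss + w_illum * F.l1_loss(pred_illum, target_illum)
    return loss


if __name__ == "__main__":
    module = IlluminationAwareAttention(channels=64)
    x = torch.randn(1, 64, 128, 128)
    print(module(x).shape)  # torch.Size([1, 64, 128, 128])
```

In this sketch the attention map is conditioned on the inverted illumination estimate, which is one plausible way to realize the abstract's claim that dark areas are adjusted adaptively; the paper may condition or normalize differently.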

Published

2025-02-11

Issue

Section

Articles