Zero-Shot Learning for Remote Sensing Image Segmentation Using Cross-Domain Transfer and Self-Training

Authors

  • Freddi Noria, Robotics and Automation Laboratory, Universidad Privada Boliviana, Cochabamba, Bolivia
  • Libson Matharine, School of Information, Systems and Modelling, University of Technology Sydney, Ultimo, NSW 2007, Australia

DOI:

https://doi.org/10.17051/NJSIP/01.02.07

Keywords:

Zero-shot learning, remote sensing imagery, semantic segmentation, cross-domain transfer learning, self-training, unsupervised domain adaptation, pseudo-label refinement.

Abstract

Segmentation of remote sensing (RS) images plays a pivotal role in signal and image processing: it enables pixel-level interpretation that supports land cover mapping, environmental monitoring, urban infrastructure analysis, and disaster assessment. Although deep learning architectures (e.g., U-Net, DeepLab, SegFormer) have produced good results, their reliance on large-scale datasets with dense, pixel-level annotations limits their generalizability. This work presents a Zero-Shot Learning (ZSL) framework, the Cross-Domain Transfer and Self-Training (CDT-ST) model, that performs RS image segmentation without labelled target-domain data. It combines a domain-invariant feature extraction module based on signal processing, cross-domain class mapping based on semantic embeddings, and a sequence of pseudo-label refinement steps: confidence thresholding, spatial consistency filtering, and Conditional Random Fields (CRFs). Together, these techniques make adaptation robust to large domain shifts in spatial resolution, illumination, and scene structure. Experiments on the SpaceNet, DeepGlobe, and LoveDA datasets yield a mean Intersection over Union (mIoU) of 87.2% with no target-domain labels, only 233 fewer than fully supervised counterparts. By combining transfer learning, semantic mapping, and classical signal/image processing methods, CDT-ST offers a scalable, annotation-free, high-accuracy approach to large-scale, heterogeneous RS segmentation.
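
The abstract names the pseudo-label refinement steps and the evaluation metric but gives no implementation details. The following is a minimal illustrative sketch, not the authors' code, of two of those steps (confidence thresholding and spatial consistency filtering) together with an mIoU computation. The function names, the `IGNORE` label, the threshold `tau`, the agreement ratio `min_agree`, and the 8-neighbour voting rule are all assumptions made for illustration. The CRF stage is omitted.

```python
IGNORE = -1  # hypothetical marker for pixels excluded from self-training


def confidence_threshold(probs, tau=0.9):
    """probs[r][c] is a list of per-class probabilities for one pixel.
    Returns an HxW label map; pixels whose top probability is below
    tau (an assumed threshold) are marked IGNORE."""
    labels = []
    for row in probs:
        out_row = []
        for p in row:
            best = max(range(len(p)), key=p.__getitem__)
            out_row.append(best if p[best] >= tau else IGNORE)
        labels.append(out_row)
    return labels


def spatial_consistency_filter(labels, min_agree=0.5):
    """Keep a pseudo-label only if at least min_agree of its labelled
    8-neighbours carry the same label (an assumed voting rule)."""
    h, w = len(labels), len(labels[0])
    out = [row[:] for row in labels]
    for r in range(h):
        for c in range(w):
            if labels[r][c] == IGNORE:
                continue
            neigh = [
                labels[rr][cc]
                for rr in range(max(0, r - 1), min(h, r + 2))
                for cc in range(max(0, c - 1), min(w, c + 2))
                if (rr, cc) != (r, c) and labels[rr][cc] != IGNORE
            ]
            if neigh:
                agree = sum(n == labels[r][c] for n in neigh) / len(neigh)
                if agree < min_agree:
                    out[r][c] = IGNORE
    return out


def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union across classes that occur in
    either the prediction or the target."""
    ious = []
    for k in range(num_classes):
        inter = union = 0
        for p_row, t_row in zip(pred, target):
            for p, t in zip(p_row, t_row):
                inter += (p == k and t == k)
                union += (p == k or t == k)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious)


# Toy 3x3 example with two classes: one low-confidence corner pixel
# and one isolated confident pixel in the centre.
hi0, hi1, low = [0.95, 0.05], [0.05, 0.95], [0.6, 0.4]
probs = [[low, hi0, hi0], [hi0, hi1, hi0], [hi0, hi0, hi0]]
thresholded = confidence_threshold(probs)
refined = spatial_consistency_filter(thresholded)
```

In this toy case the corner pixel is dropped by the threshold, and the isolated centre pixel is dropped by the neighbourhood vote. Only the surviving pixels would then serve as pseudo-labels in a self-training round.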

Published

2025-02-10

Section

Articles