Unsupervised Feature Learning for Object Detection in Low-Light Surveillance Footage
Keywords:
Unsupervised learning, object detection, low-light surveillance, contrastive learning, pseudo-labeling, deep learning, self-supervision, feature extraction, image enhancement, nighttime vision
Abstract
Detecting objects in low-light environments is challenging because the images are of low quality: visibility is poor, contrast is low, sensor noise is high, and object textures and boundaries are lost. This degradation of the input, caused by poor lighting and camera limitations, severely reduces the performance of traditional object detection models, especially those that depend on large-scale annotated data acquired mostly under ideal lighting conditions. Further, manual annotation of nighttime surveillance data is laborious and yields uneven training data for most supervised models. In this thesis, we propose a new unsupervised feature learning framework for object detection in low-light surveillance videos. Illumination-invariant and semantically rich features are extracted from unlabeled images by an approach that integrates contrastive self-supervised learning with domain-specific data augmentation. To improve feature discrimination, we use a dual-branch encoder-decoder architecture, and to guide a detection head to localize objects without explicit supervision, we propose a clustering-based pseudo-labeling strategy. A self-distillation process further refines the entire framework by penalizing inconsistent predictions, which improves generalization. In experiments on publicly available low-light datasets, including ExDark and LLVIP, the proposed method achieves higher mean average precision and F1 score than conventional supervised and semi-supervised baselines, and it is also shown to be robust to occlusion, noise, and extreme darkness. These results demonstrate that the proposed unsupervised framework is effective for large-scale, annotation-free object detection in real-world low-light surveillance.
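To make the contrastive component concrete, the following is a minimal sketch of how two low-light views of the same image could be pulled together by an InfoNCE-style loss. It is not the implementation used in this work: the abstract does not specify the loss, the augmentations, or the encoder, so the gamma-darkening and Gaussian-noise augmentation, the temperature value, the function names (`low_light_augment`, `info_nce`), and the use of raw pixel vectors in place of encoder features are all assumptions made for illustration.

```python
import math
import random


def low_light_augment(pixels, gamma, noise_std, rng):
    """Assumed augmentation: darken intensities in [0, 1] with a gamma
    curve, add Gaussian sensor noise, and clip back to [0, 1]."""
    out = []
    for p in pixels:
        dark = p ** gamma  # gamma > 1 darkens values in [0, 1]
        noisy = dark + rng.gauss(0.0, noise_std)
        out.append(min(1.0, max(0.0, noisy)))
    return out


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss: anchors[i] should be most similar to
    positives[i] among all positives in the batch."""
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of anchor i to every positive, scaled by temperature.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp over the batch.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(a)


# Hypothetical toy batch: three flattened 4-pixel "images".
rng = random.Random(0)
images = [
    [0.9, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.1, 0.1],
    [0.1, 0.1, 0.9, 0.9],
]
# Two differently degraded views of each image; a real system would
# pass these through an encoder before computing the loss.
views_a = [low_light_augment(img, 1.5, 0.01, rng) for img in images]
views_b = [low_light_augment(img, 2.5, 0.01, rng) for img in images]
loss = info_nce(views_a, views_b)
print(loss)
```

In this sketch the loss is small when each view is closer to its counterpart than to the views of other images, which is the sense in which contrastive training encourages illumination-invariant features.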