Environmental Sound Classification Using CNNs with Frequency-Attentive Acoustic Modeling

Authors

  • Dahlan Abdullah, Department of Informatics, Faculty of Engineering, Universitas Malikussaleh, Aceh, Indonesia

Keywords:

Environmental sound classification, convolutional neural network, spectral attention, acoustic modeling, log-mel spectrogram, UrbanSound8K, ESC-50, frequency-aware learning

Abstract

Environmental sound classification (ESC) underpins real-world applications such as urban monitoring, smart surveillance, and context-aware mobile computing. Accurately recognizing diverse sound events under varying noise and acoustic conditions, however, remains an open problem. In this paper, we propose a convolutional neural network (CNN) framework with frequency-attentive acoustic modeling to improve classification accuracy and robustness in noisy environments. The approach introduces a spectral attention module that highlights discriminative frequency bands in the log-mel spectrogram, so that the network can focus on the spectral patterns characteristic of particular sound classes. Experiments on the ESC-50 and UrbanSound8K datasets show that the proposed model achieves state-of-the-art classification accuracy of 89.4% and 87.1%, respectively, outperforming several existing CNN-based baselines. Ablation studies indicate that the attention mechanism contributes to noise resilience and generalization. Overall, this research provides a lightweight yet effective ESC solution that maintains computational efficiency, making it viable for deployment on edge audio recognition systems.
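To make the idea of frequency attention concrete, the following is a minimal NumPy sketch of one common way to reweight the mel bands of a log-mel spectrogram. The abstract does not specify the design of the spectral attention module, so everything here is an assumption for illustration: the squeeze-and-excitation style bottleneck, the function name `frequency_attention`, the parameter names `w1`, `b1`, `w2`, `b2`, and the sizes (64 mel bands, 100 frames, 16 hidden units). The weights are random placeholders rather than trained values.

```python
import numpy as np


def frequency_attention(log_mel, w1, b1, w2, b2):
    """Reweight the mel bands of a (n_mels, n_frames) log-mel spectrogram.

    Hypothetical sketch: a squeeze-and-excitation style gate applied along
    the frequency axis, not the module described in the paper.
    """
    # Squeeze: summarize each mel band by its mean over time -> (n_mels,)
    band_summary = log_mel.mean(axis=1)
    # Bottleneck layer with ReLU -> (n_hidden,)
    hidden = np.maximum(0.0, w1 @ band_summary + b1)
    # Project back to one logit per mel band -> (n_mels,)
    logits = w2 @ hidden + b2
    # Sigmoid gate: one attention weight in (0, 1) per band
    weights = 1.0 / (1.0 + np.exp(-logits))
    # Excite: scale every frame of a band by that band's weight
    return log_mel * weights[:, None], weights


# Illustrative sizes and random inputs (assumptions, not from the paper)
rng = np.random.default_rng(0)
n_mels, n_frames, n_hidden = 64, 100, 16
log_mel = rng.standard_normal((n_mels, n_frames))
w1 = rng.standard_normal((n_hidden, n_mels)) * 0.1
b1 = np.zeros(n_hidden)
w2 = rng.standard_normal((n_mels, n_hidden)) * 0.1
b2 = np.zeros(n_mels)

attended, weights = frequency_attention(log_mel, w1, b1, w2, b2)
print(attended.shape, weights.shape)
```

In a trained network the gate parameters would be learned jointly with the CNN, so that bands carrying class-specific energy receive weights near 1 and uninformative or noisy bands are attenuated.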

Published

2025-03-21

Section

Articles