Harmonic Feature Extraction and Deep Fusion Networks for Music Genre Classification

Authors

  • Saravanakumar Veerappan, Director, Centivens Institute of Innovative Research, Coimbatore, Tamil Nadu, India

Keywords:

Music genre classification, harmonic features, deep fusion networks, CNN, Bi-GRU, Mel-spectrogram, HPSS, MFCC, attention mechanism

Abstract

Music genre classification is an important task in music information retrieval, underpinning systems from recommendation engines to digital music archives. Traditional machine learning approaches rely on handcrafted features, which are often inadequate for capturing the complex hierarchical structure of music. This paper presents a hybrid framework that combines harmonic feature extraction with a deep fusion network architecture for accurate and robust music genre classification. The proposed method first extracts HPSS-based features, CENS statistics, Mel-spectrograms, and MFCCs to capture both low-level and harmonic content. These features are then fed into a dual-branch deep neural network that combines Convolutional Neural Networks (CNNs) for spatial feature extraction with Bidirectional Gated Recurrent Units (Bi-GRUs) for temporal sequence modeling. A fusion module merges the two learned representations through an attention-based mechanism. Experiments on the GTZAN and FMA datasets show that the proposed framework outperforms several state-of-the-art models, achieving classification accuracies of 93.6% and 89.1%, respectively. This work demonstrates the effectiveness of harmonic-aware deep fusion networks in capturing both spectral and temporal dynamics for music genre classification.
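The attention-based fusion of the two branch representations can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the embedding size, the single-vector scoring function `w_score`, and all variable names are assumptions; a trained model would learn these parameters end-to-end.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_fusion(cnn_feat, gru_feat, w_score):
    """Fuse two branch embeddings with scalar attention weights.

    cnn_feat, gru_feat: (batch, d) embeddings from the CNN and
    Bi-GRU branches. w_score: (d,) scoring vector (a hypothetical
    single-layer scorer; real fusion modules are often deeper).
    """
    branches = np.stack([cnn_feat, gru_feat], axis=1)   # (batch, 2, d)
    scores = branches @ w_score                          # (batch, 2)
    alpha = softmax(scores, axis=1)                      # per-branch weights
    fused = (alpha[..., None] * branches).sum(axis=1)    # (batch, d)
    return fused, alpha

# Toy usage with random stand-in embeddings.
rng = np.random.default_rng(0)
cnn = rng.normal(size=(4, 64))   # CNN branch output
gru = rng.normal(size=(4, 64))   # Bi-GRU branch output
w = rng.normal(size=64)
fused, alpha = attention_fusion(cnn, gru, w)
```

The attention weights `alpha` sum to 1 per example, so the fused vector is a convex combination of the two branch embeddings, letting the model emphasize spectral (CNN) or temporal (Bi-GRU) evidence per input.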

Published

2025-03-17

Section

Articles