Please use this identifier to cite or link to this item:
http://nopr.niscpr.res.in/handle/123456789/65578
| DOI: | https://doi.org/10.56042/jsir.v84i03.13751 |
| Title: | DUALBIGRU-UCSA: Deep Learning based Music Emotion Recognition Model |
| Authors: | Man, Szeto Chung; Kumar, Alok; Tiwari, Ajay; Srivastava, Prateek; Verma, Deepak Kumar; Mamoria, Pushpa; Singh, Vineeta; Kumar, Chandra Shekhar; Seth, Amit; Joshi, Kapil; Kaushik, Vandana Dixit |
| Keywords: | Dual bidirectional gated recurrent unit;Mel frequency cepstral coefficients;Music emotion recognition;Unified contextual shuffle attention fusion;Weighted categorical cross-entropy |
| Issue Date: | Mar-2025 |
| Publisher: | NIScPR-CSIR, India |
| Abstract: | Music Emotion Recognition (MER) is the task of classifying the emotions perceived in a piece of music using computational models. Existing models struggle because emotion perception is subjective and varies with individual and cultural differences. To address these challenges, we developed a Dual Bidirectional Gated Recurrent Unit with Unified Contextual Shuffle Attention Fusion (DualBiGRU-UCSA) model. The primary contribution is a practical implementation that combines bidirectional gated recurrent units with the attention mechanisms developed here, so that the model can capture complex musical features. Using bidirectional GRUs, the model draws on both past and future context in a music sequence while refining the features that describe temporal dynamics and emotion. The bidirectional GRU outputs are then passed to the UCSA module, which consists of Shuffle Attention and Multi-Head Location-Aware Attention; it suppresses unimportant feature representations and strengthens important patterns and contextual cues. The proposed model achieves an accuracy, F1-score, negative predictive value, positive predictive value and recall of 96.28%, 96.32%, 96.26%, 96.60% and 96.27%, respectively, outperforming recent state-of-the-art techniques. |
| Page(s): | 308-323 |
| ISSN: | 0022-4456 (Print); 0975-1084 (Online) |
| Appears in Collections: | JSIR Vol.84(03) [March 2025] |
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| JSIR 84(3) 308-323.pdf | | 7.68 MB | Adobe PDF | View/Open |
Items in NOPR are protected by copyright, with all rights reserved, unless otherwise indicated.