Emotion Classification Using Nature Based Optimization With Transformers And Transfer Learning
This paper presents a methodology that uses machine learning models such as ResNet50, BERT, and GPT-2 to classify emotion from text or audio. The study covers the Hindi dataset MER500 and the English dataset DEAM, each with five discrete emotion classes. The models reached about 85% and 83% on the MER500 and DEAM datasets, respectively. An ensemble model is also presented; it gave the best performance on MER500, whereas BERT performed best on DEAM. A comparative analysis indicates that lyrics are the most informative available form of data for classifying emotion. BERT and GPT-2 are used to classify the lyrics, and of the two, BERT performed better on both datasets. This suggests that, for these two datasets, emotion is more prominent in the lyrics (text features) than in the audio features. For audio, a song and its karaoke version are classified using ResNet50. Although ResNet50 is pretrained on the ImageNet dataset, transfer learning helped significantly in learning features from the audio. To classify the audio signal, various features were extracted and assembled into a multi-dimensional array.
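As an illustration of assembling audio features into a multi-dimensional array, the following is a minimal sketch and not the pipeline used in this work. The two features (frame-wise RMS energy and zero-crossing rate), the frame and hop sizes, the synthetic test tone, and all function names are assumptions chosen for brevity. Repeating the feature matrix across three channels is likewise an assumed convention, motivated only by the fact that ImageNet-pretrained networks such as ResNet50 take three-channel input.

```python
import math

def frame_signal(signal, frame_len, hop):
    """Split a 1-D signal into overlapping frames (the ragged tail is dropped)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def rms(frame):
    """Root-mean-square energy of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings / (len(frame) - 1)

def feature_matrix(signal, frame_len=1024, hop=512):
    """Stack per-frame features into a 2-D array of shape [n_features][n_frames]."""
    frames = frame_signal(signal, frame_len, hop)
    return [
        [rms(f) for f in frames],
        [zero_crossing_rate(f) for f in frames],
    ]

def to_three_channels(matrix):
    """Repeat a 2-D feature matrix on three channels: [3][n_features][n_frames].

    Assumption: this mimics the three-channel image layout that an
    ImageNet-pretrained network expects; the source does not say how
    its features were arranged.
    """
    return [[row[:] for row in matrix] for _ in range(3)]

# Hypothetical input: one second of a 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]

features = feature_matrix(tone)           # 2 features x n_frames
image_like = to_three_channels(features)  # 3 channels x 2 features x n_frames
```

In practice the rows would more plausibly be spectral features (for example, mel-spectrogram bands) computed with an audio library, but the stacking step has the same form.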