UPM Institutional Repository

Exploring COVID-19 vaccine sentiment: a Twitter-based analysis of text processing and machine learning approaches


Citation

Khalaf, Ban Safir and Hamdan, Hazlina and Manshor, Noridayu (2024) Exploring COVID-19 vaccine sentiment: a Twitter-based analysis of text processing and machine learning approaches. Bulletin of Electrical Engineering and Informatics, 13 (6). pp. 4522-4531. ISSN 2089-3191; eISSN: 2302-9285

Abstract

In the wake of the 2020 coronavirus disease (COVID-19) pandemic, the swift development and deployment of vaccines made it necessary to understand public sentiment for effective health communication and policymaking. Social media platforms, especially Twitter, have emerged as rich sources for gauging public opinion. This study applies natural language processing (NLP) and machine learning (ML) to examine the sentiments and trends surrounding COVID-19 vaccination, using a comprehensive Twitter dataset. Prior research has focused primarily on ML algorithms, whereas this study emphasizes the underutilized potential of NLP in data preprocessing. Using term frequency-inverse document frequency (TF-IDF) for text processing, the research evaluates six ML techniques: K-nearest neighbors (KNN), decision trees (DT), random forest (RF), artificial neural networks (ANN), support vector machines (SVM), and long short-term memory (LSTM). The findings indicate that LSTM, particularly when combined with tweet text tokenization, is the most effective approach. The study also highlights the role of feature selection, showing that TF-IDF features significantly improve the performance of SVM and LSTM, which achieve accuracy exceeding 98%. These results underscore the potential of advanced NLP applications in real-world settings for analyzing public health discourse on social media.
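As a rough illustration of the TF-IDF weighting named in the abstract, the sketch below computes per-tweet term weights in plain Python. It is not the authors' pipeline: the abstract does not specify the TF-IDF variant, tokenizer, or preprocessing used, so the unsmoothed formula, the whitespace tokenizer, the function names, and the three sample tweets are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace. Real tweet
    # preprocessing would typically also handle URLs, mentions, etc.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document.

    Uses the plain, unsmoothed variant:
        tf(t, d)  = count of t in d / number of tokens in d
        idf(t)    = ln(N / number of documents containing t)
        weight    = tf * idf
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append(
            {
                term: (count / total) * math.log(n_docs / doc_freq[term])
                for term, count in counts.items()
            }
        )
    return vectors


# Hypothetical sample tweets, invented for this example only.
tweets = [
    "vaccine works great",
    "vaccine side effects worry me",
    "great news today",
]
vectors = tfidf_vectors(tweets)
```

In this toy corpus, a word that occurs in only one tweet (such as "works") receives a higher weight than a word shared across tweets (such as "vaccine"), which is the property that makes TF-IDF features useful as classifier inputs.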


Download File

116987.pdf - Published Version
Available under License Creative Commons Attribution Share Alike.

Official URL or Download Paper: https://beei.org/index.php/EEI/article/view/7855

Additional Metadata

Item Type: Article
Divisions: Faculty of Computer Science and Information Technology
DOI Number: https://doi.org/10.11591/eei.v13i6.7855
Publisher: Institute of Advanced Engineering and Science
Keywords: COVID-19; Feature extraction; Long short-term memory; Sentiment analysis; Term frequency-inverse document frequency
Depositing User: Ms. Nur Aina Ahmad Mustafa
Date Deposited: 22 Apr 2025 04:04
Last Modified: 22 Apr 2025 04:04
Altmetrics: http://www.altmetric.com/details.php?domain=psasir.upm.edu.my&doi=10.11591/eei.v13i6.7855
URI: http://psasir.upm.edu.my/id/eprint/116987
