2023 14th International Conference on Information, Intelligence, Systems & Applications (IISA)

Abstract

YouTube comments provide a rich source of data for classifying consumers' opinions and emotions towards particular products, brands, and social or health-related topics. Previous studies applied either lexicon-based or machine learning approaches to classify sentiment in social media comments. This study employs a hybrid strategy that combines machine learning and lexicon-based methods to classify consumers' sentiment towards health care products promoted through social marketing campaigns on YouTube channels. A total of 59,695 records were exported through the YouTube API from 18 relevant video campaigns. The comments were labeled using general-purpose sentiment analysis tools such as TextBlob, VADER, and Flair. TextBlob and VADER assigned the same label to 66.4% of the comments. Further analysis of the comments on which the two disagreed revealed that TextBlob generated slightly more accurate results than VADER. A set of machine learning models was then selected to classify the sentiment of the comments: Logistic Regression (LR), Multinomial Naïve Bayes (Multi.NB), Random Forest (RF), Support Vector Machine (SVM), and Stochastic Gradient Descent Classifier (SGD Classifier). Accuracy, precision, recall, and F1-score were calculated to assess the performance of the algorithms. SVM and SGD achieved the best scores (accuracy = 91%, F1-score = 0.89), followed by LR (accuracy = 90%, F1-score = 0.88) and RF (accuracy = 87%, F1-score = 0.84). Implications and limitations are discussed in the paper.
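The two quantities reported above, the share of comments on which two labelers agree and the accuracy, precision, recall, and F1-score of a classifier, can be illustrated with a minimal sketch. It uses only the Python standard library and hypothetical toy labels; the function names `agreement_rate` and `classification_scores` are illustrative assumptions, not the authors' code, and the abstract does not state how the F1-score was averaged across classes, so the sketch scores a single class treated as positive.

```python
def agreement_rate(labels_a, labels_b):
    """Fraction of items that two labelers assign the same label."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def classification_scores(y_true, y_pred, positive):
    """Accuracy plus precision, recall, and F1 for one class treated as positive."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy labels standing in for two lexicon-based labelers.
labels_tool_a = ["pos", "neg", "neu", "pos", "neg"]
labels_tool_b = ["pos", "neg", "pos", "pos", "neu"]
agreement = agreement_rate(labels_tool_a, labels_tool_b)

# Hypothetical reference labels and classifier predictions.
y_true = ["pos", "neg", "pos", "pos", "neg", "neu"]
y_pred = ["pos", "neg", "neg", "pos", "neg", "pos"]
scores = classification_scores(y_true, y_pred, positive="pos")

print(f"agreement: {agreement:.3f}")
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

With real data, the first function would take the per-comment labels produced by two sentiment tools, and the second would take held-out labels and a trained model's predictions. A multi-class evaluation would repeat the per-class computation for each label and average the results.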