dCollection Digital Academic Information Distribution System

Mitigating Class Imbalance in Sentiment Analysis through GPT-3-Generated Synthetic Sentences

Subject (Keywords) GPT-3, imbalanced sentiment analysis, sentiment analysis, sentiment classification, synthetics review generation, text classification, text generation
Management Information Technology faculty
Indexed In SCIE, SCOPUS
OA Type All Open Access; Gold Open Access
Publisher Multidisciplinary Digital Publishing Institute (MDPI)
Publication Year 2023
Series Type Journal
URI http://www.dcollection.net/handler/ewha/000000211572
Text Language English
Published As https://doi.org/10.3390/app13179766

Abstract/Summary

In this paper, we explore the effectiveness of the GPT-3 model in tackling imbalanced sentiment analysis, focusing on the Coursera online course review dataset that exhibits high imbalance. Training on such skewed datasets often results in a bias towards the majority class, undermining the classification performance for minority sentiments, thereby accentuating the necessity for a balanced dataset. Two primary initiatives were undertaken: (1) synthetic review generation via fine-tuning of the Davinci base model from GPT-3 and (2) sentiment classification utilizing nine models on both imbalanced and balanced datasets. The results indicate that good-quality synthetic reviews substantially enhance sentiment classification performance. Every model demonstrated an improvement in accuracy, with an average increase of approximately 12.76% on the balanced dataset. Among all the models, the Multinomial Naïve Bayes achieved the highest accuracy, registering 75.12% on the balanced dataset. This study underscores the potential of the GPT-3 model as a feasible solution for addressing data imbalance in sentiment analysis and offers significant insights for future research. © 2023 by the authors.
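The balancing step described in the abstract can be illustrated with a minimal sketch. It only computes how many synthetic reviews each sentiment class would need to match the largest class, and shows why a classifier that always predicts the majority class looks accurate on skewed data. The label counts and the function name `synthetic_needed` are hypothetical and are not taken from the paper; the GPT-3 fine-tuning and the nine classifiers are not reproduced here.

```python
from collections import Counter


def synthetic_needed(labels):
    """Return how many extra samples each class needs to match the largest class."""
    counts = Counter(labels)
    target = max(counts.values())
    return {label: target - n for label, n in counts.items()}


# Hypothetical, deliberately skewed label distribution (not the Coursera data).
labels = ["positive"] * 8 + ["neutral"] * 3 + ["negative"] * 1

# Number of synthetic reviews to generate per class to reach balance.
plan = synthetic_needed(labels)

# Accuracy of a baseline that always predicts the majority class:
# it is high on skewed data even though every minority review is misclassified.
majority_label, majority_count = Counter(labels).most_common(1)[0]
baseline_accuracy = majority_count / len(labels)

print(plan)
print(majority_label, round(baseline_accuracy, 3))
```

In this toy distribution the minority classes need 5 and 7 additional reviews respectively, which is the gap that generated text would fill in an approach like the one the abstract describes.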
