Publication details
Bibliographic description
Sentiment analysis using machine learning approach based on feature extraction for anxiety detection / Shoffan SAIFULLAH, Rafał DREŻEWSKI, Felix Andika DWIYANTO, Agus Sasmito Aribowo, Yuli Fauziah // In: Computational Science – ICCS 2023 : 23rd international conference : Prague, Czech Republic, July 3–5, 2023 : proceedings, Pt. 2 / eds. Jiří Mikyška [et al.]. — Cham, Switzerland : Springer, cop. 2023. — (Lecture Notes in Computer Science ; ISSN 0302-9743 ; LNCS 14074). — ISBN: 978-3-031-36020-6; e-ISBN: 978-3-031-36021-3. — pp. 365–372. — Bibliogr., Abstr. — Publication available online since: 2023-06-26. — S. Saifullah - additional affiliation: Universitas Pembangunan Nasional Veteran Yogyakarta, Indonesia
Authors (5)
- Saifullah Shoffan (AGH)
- Dreżewski Rafał (AGH)
- Dwiyanto Felix Andika (AGH)
- Aribowo Agus Sasmito
- Fauziah Yuli
Keywords
Bibliometric data
BaDAP ID | 147597 |
---|---|
Date added to BaDAP | 2023-07-20 |
DOI | 10.1007/978-3-031-36021-3_38 |
Publication year | 2023 |
Publication type | conference proceedings (auth.) |
Open access | |
Publisher | Springer |
Conference | 23rd International Conference on Computational Science |
Journal/series | Lecture Notes in Computer Science |
Abstract
In this study, selected machine learning (ML) approaches were used to detect anxiety in Indonesian-language YouTube video comments about COVID-19 and the government’s program. The dataset consisted of 9706 comments categorized as positive and negative. The study utilized ML approaches, such as KNN (K-Nearest Neighbors), SVM (Support Vector Machine), DT (Decision Tree), Naïve Bayes (NB), Random Forest (RF), and XG-Boost, to analyze and classify comments as anxious or not anxious. The data were preprocessed by tokenization, filtering, stemming, tagging, and emoticon conversion. Feature extraction (FE) was performed by the CV (count-vectorization), TF-IDF (term frequency-inverse document frequency), Word2Vec (word embedding), and HV (Hashing-Vectorizer) algorithms. All 24 combinations of the ML and FE algorithms were evaluated to find the best performance in anxiety detection. The combination of RF and CV obtained the best accuracy of 98.4%, which is 14.3 percentage points better than previous research. In addition, the accuracy of the other ML methods was above 92% for CV, TF-IDF, and HV, while KNN obtained the lowest accuracy.
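
The figure of 24 combinations follows from pairing each of the six classifiers with each of the four feature extractors named in the abstract. The sketch below enumerates that grid and illustrates, in plain Python, what count-vectorization and hashing-vectorization produce. It is a minimal illustration under stated assumptions, not the authors' implementation: the functions `count_vectorize` and `hash_vectorize`, the bucket count `n_features=8`, the use of CRC32 as the hash, and the two toy comments are all hypothetical.

```python
from collections import Counter
from itertools import product
import zlib

# Method names as listed in the abstract.
CLASSIFIERS = ["KNN", "SVM", "DT", "NB", "RF", "XG-Boost"]
FEATURE_EXTRACTORS = ["CV", "TF-IDF", "Word2Vec", "HV"]

# Every classifier is paired with every feature extractor: 6 * 4 pairs.
COMBINATIONS = list(product(CLASSIFIERS, FEATURE_EXTRACTORS))


def count_vectorize(docs):
    """Count-vectorization: one column per vocabulary term, raw term counts."""
    vocab = sorted({tok for doc in docs for tok in doc.split()})
    rows = []
    for doc in docs:
        counts = Counter(doc.split())
        rows.append([counts[term] for term in vocab])
    return vocab, rows


def hash_vectorize(docs, n_features=8):
    """Hashing-vectorization: terms fall into a fixed number of buckets,
    so no vocabulary has to be stored (collisions are possible)."""
    rows = []
    for doc in docs:
        row = [0] * n_features
        for tok in doc.split():
            # CRC32 is an assumption chosen because it is deterministic
            # across runs, unlike the built-in hash() for strings.
            row[zlib.crc32(tok.encode("utf-8")) % n_features] += 1
        rows.append(row)
    return rows


# Hypothetical, already-preprocessed toy comments (not from the paper's dataset).
docs = ["takut vaksin takut", "vaksin aman"]
vocab, cv_rows = count_vectorize(docs)
hv_rows = hash_vectorize(docs)

print(len(COMBINATIONS))
print(vocab, cv_rows)
```

In practice each resulting feature matrix would be passed to each classifier and scored on held-out data; the sketch stops at feature extraction because the abstract does not describe the training or evaluation setup.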