Indonesian Stance Analysis of Healthcare News using Sentence Embedding Based on LSTM
Abstract
Health news spread on social media is often of uncertain accuracy, which creates a need to validate its truth. One validation approach is to consider the opinions or attitudes of most people toward a topic, known as their stance: whether they support, oppose, or are neutral. This paper proposes a stance analysis model that classifies the relationship between sentences so that it can recognize how the opinion of the writer in a news headline correlates with the claim in question. The proposed model uses several Long Short-Term Memory (LSTM) networks, which represent the interrelationship of news items, to analyze the relationship between a claim and other news. Word representation vectors are learned jointly with the training of the LSTM-based stance classifier. Sentence embedding with LSTM produces a vector representation of each sentence: each word occupies one LSTM time step, and the output at the last word is taken as the sentence representation. In trials on an Indonesian health-related dataset built for this study, the proposed stance classification model achieved an average F1-score of 71%, with 69% for the supporting class, 70% for the opposing class, and 74% for the neutral class.
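The sentence-embedding step described in the abstract can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation: the class name `LSTMSentenceEncoder`, the dimensions, the toy vocabulary, the use of one encoder for the claim and one for the news headline, and the softmax layer over the concatenated vectors are all assumptions made for illustration. The weights are random and untrained; in the approach the abstract describes, the word-embedding table would be updated together with the LSTM during stance-classification training, which is omitted here.

```python
import math
import random

LABELS = ["support", "oppose", "neutral"]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def rand_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these by training."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


class LSTMSentenceEncoder:
    """Hypothetical encoder: one word per time step, last output = sentence vector."""

    def __init__(self, vocab, embed_dim=8, hidden_dim=6, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        self.index = {word: i for i, word in enumerate(vocab)}
        # Word-embedding table (would be trained jointly with the classifier).
        self.embed = rand_matrix(len(vocab), embed_dim, rng)
        # One weight matrix per gate, applied to [x_t ; h_{t-1}].
        self.w = {g: rand_matrix(hidden_dim, embed_dim + hidden_dim, rng) for g in "ifoc"}
        self.b = {g: [0.0] * hidden_dim for g in "ifoc"}

    def encode(self, tokens):
        h = [0.0] * self.hidden_dim  # hidden state (the LSTM output)
        c = [0.0] * self.hidden_dim  # cell state
        for tok in tokens:  # each word occupies one time step
            x = self.embed[self.index.get(tok, 0)]  # index 0 is "<unk>"
            z = x + h  # list concatenation of input and previous output
            pre = {g: [a + b for a, b in zip(matvec(self.w[g], z), self.b[g])]
                   for g in "ifoc"}
            i_gate = [sigmoid(v) for v in pre["i"]]
            f_gate = [sigmoid(v) for v in pre["f"]]
            o_gate = [sigmoid(v) for v in pre["o"]]
            cand = [math.tanh(v) for v in pre["c"]]
            c = [f * ck + i * g for f, ck, i, g in zip(f_gate, c, i_gate, cand)]
            h = [o * math.tanh(ck) for o, ck in zip(o_gate, c)]
        return h  # output at the last word is the sentence representation


def stance_probabilities(claim_vec, news_vec, weights, bias):
    """Assumed classifier head: softmax over the concatenated sentence vectors."""
    logits = [a + b for a, b in zip(matvec(weights, claim_vec + news_vec), bias)]
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy vocabulary and sentences, invented for this example.
vocab = ["<unk>", "vaksin", "aman", "berbahaya", "untuk", "anak", "tidak"]
claim_encoder = LSTMSentenceEncoder(vocab, seed=1)
news_encoder = LSTMSentenceEncoder(vocab, seed=2)

claim_vec = claim_encoder.encode(["vaksin", "berbahaya", "untuk", "anak"])
news_vec = news_encoder.encode(["vaksin", "aman", "untuk", "anak"])

rng = random.Random(3)
out_w = rand_matrix(len(LABELS), len(claim_vec) + len(news_vec), rng)
out_b = [0.0] * len(LABELS)
probs = stance_probabilities(claim_vec, news_vec, out_w, out_b)
print(dict(zip(LABELS, probs)))
```

Because nothing is trained, the printed probabilities carry no meaning; the sketch only shows how a variable-length sentence is reduced to a fixed-length vector and how two such vectors could feed a three-way stance decision.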
© Jurnal Nasional Teknik Elektro dan Teknologi Informasi, under the terms of the Creative Commons Attribution-ShareAlike 4.0 International License.