Exploring the Impact of Back-Translation on BERT's Performance in Sentiment Analysis of Code-Mixed Language Data

Nisrina Hanifa Setiono(1), Yunita Sari(2*)
(1) Gadjah Mada University
(2) Gadjah Mada University
(*) Corresponding Author

Copyright (c) 2025 IJCCS (Indonesian Journal of Computing and Cybernetics Systems)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.