Machine Translation Indonesian Bengkulu Malay Using Neural Machine Translation-LSTM

https://doi.org/10.22146/ijccs.98384

Bella Okta Sari Miranda(1), Herman Yuliansyah(2*), Muhammad Kunta Biddinika(3)

(1) Master of Informatics, Universitas Ahmad Dahlan, Yogyakarta
(2) Department of Informatics, Universitas Ahmad Dahlan, Yogyakarta
(3) Master of Informatics, Universitas Ahmad Dahlan, Yogyakarta
(*) Corresponding Author

Abstract


Machine translation is an application of Natural Language Processing (NLP) that focuses on translating between languages. Several previous studies have used Statistical Machine Translation (SMT) with a parallel corpus of Indonesian and Bengkulu Malay totaling 3000 data points. However, SMT performs poorly when confronted with limited data and infrequent language pairs. Therefore, this study aims to build a machine translation model from Indonesian to Bengkulu Malay using a Neural Machine Translation (NMT) approach with Long Short-Term Memory (LSTM), and to create a parallel corpus of 5261 data pairs between Indonesian and Bengkulu Malay. The research was conducted in four stages: data collection, data preprocessing, training and modeling, and evaluation. The performance of the machine translator was evaluated using the Bilingual Evaluation Understudy (BLEU) metric. The evaluation results show that the model achieved its highest average score, 0.6016332, on BLEU-1 and its lowest average score, 0.3680788, on BLEU-4. These results indicate that taking the natural structural differences between Indonesian and Bengkulu Malay into account can be suggested as the best solution for translating from Indonesian to Bengkulu Malay.
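
To illustrate how BLEU-1 through BLEU-4 scores of the kind reported above are typically computed, the following is a minimal sketch of sentence-level BLEU with uniform n-gram weights, a single reference, and no smoothing. It is not the authors' evaluation code: the function names (ngrams, modified_precision, bleu) and the English toy sentences are hypothetical, chosen only for illustration, and are not taken from the Indonesian-Bengkulu Malay corpus.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate against one reference."""
    cand_counts = ngrams(candidate, n)
    ref_counts = ngrams(reference, n)
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs
    # in the reference.
    clipped = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / total


def bleu(candidate, reference, max_n):
    """Sentence-level BLEU-max_n: uniform weights, one reference, no smoothing."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # without smoothing, any zero precision gives BLEU = 0
    # Geometric mean of the n-gram precisions.
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty: penalize candidates shorter than the reference.
    c, r = len(candidate), len(reference)
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(log_mean)


# Hypothetical toy sentences (assumption: not from the paper's corpus).
reference = "the cat sat on the mat".split()
candidate = "the cat sat on a mat".split()

for n in range(1, 5):
    print(f"BLEU-{n}: {bleu(candidate, reference, n):.4f}")
```

For this toy pair the score decreases from BLEU-1 to BLEU-4, because longer n-grams are harder to match exactly. The same pattern appears in the averages reported in the abstract, where BLEU-1 is the highest score and BLEU-4 the lowest.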



Keywords


Bengkulu Malay Language; BLEU; NMT; Parallel Corpus; LSTM











Copyright (c) 2024 IJCCS (Indonesian Journal of Computing and Cybernetics Systems)

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.



Copyright of:
IJCCS (Indonesian Journal of Computing and Cybernetics Systems)
ISSN 1978-1520 (print); ISSN 2460-7258 (online)
A scientific journal of research results in Computing and Cybernetics Systems.
A publication of IndoCEISS.
Gedung S1 Ruang 416 FMIPA UGM, Sekip Utara, Yogyakarta 55281
Fax: +62274 555133
Email: ijccs.mipa@ugm.ac.id | http://jurnal.ugm.ac.id/ijccs


