Normalization of Unabbreviated Non-Standard Words Using Edit Distance

  • I Gusti Bagus Baskara Nugraha, Institut Teknologi Bandung
  • Rafi Dwi Rizqullah, Institut Teknologi Bandung
Keywords: voice assistant, dictionary, non-standard words, normalization, Levenshtein distance, Jaro-Winkler distance


Voice assistant technology is growing rapidly, and its use has begun to spread into everyday life. However, voice assistants are still limited to standard, formal language, whereas Indonesian speakers are accustomed to using informal language in daily conversation. This research proposes a solution to the problem voice assistants have with informal words, that is, words that are not found in a dictionary of formal words. We propose text normalization using Levenshtein distance. Test results show that normalization using Levenshtein distance outperforms normalization using Longest Common Subsequence (LCS) distance, with an accuracy difference of 8.34%.
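
As a rough illustration of the idea, dictionary-based normalization with Levenshtein distance can be sketched as follows. This is a minimal sketch and not the authors' implementation: the function names `levenshtein` and `normalize`, the three-word dictionary, and the rule of picking the single closest entry are all assumptions made for the example.

```python
from typing import Sequence


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb (or match)
            ))
        previous = current
    return previous[-1]


def normalize(word: str, dictionary: Sequence[str]) -> str:
    """Return `word` if it is already a dictionary entry, else the closest entry.

    Assumption: ties are broken by dictionary order (the behavior of `min`).
    """
    if word in dictionary:
        return word
    return min(dictionary, key=lambda entry: levenshtein(word, entry))


# Hypothetical formal-word dictionary, for illustration only.
formal_words = ["sudah", "saja", "tidak"]
print(normalize("udah", formal_words))  # informal "udah" is one insertion from "sudah"
print(normalize("aja", formal_words))   # informal "aja" is one insertion from "saja"
```

A practical system would likely also need a distance threshold, so that words with no close dictionary entry are left unchanged instead of being mapped to an unrelated word.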


I Gusti Bagus Baskara Nugraha, & Rafi Dwi Rizqullah. (2019). Normalisasi Kata Tidak Baku yang Tidak Disingkat dengan Jarak Perubahan. Jurnal Nasional Teknik Elektro Dan Teknologi Informasi, 8(3), 218-224.