The Balinese Unicode Text Processing

https://doi.org/10.22146/ijccs.19

Imam Habibi(1*), Rinaldi Munir(2)

(1) Informatics Engineering Department, Bandung Institute of Technology, Ganesha 10 Street Bandung 40132
(2) Informatics Engineering Department, Bandung Institute of Technology, Ganesha 10 Street Bandung 40132
(*) Corresponding Author

Abstract


In principle, a computer recognizes only numbers as the representation of characters. Many encoding systems have therefore been devised to assign these numbers, yet none of them covers all characters. In Europe, even a single language may need more than one encoding system. A new encoding system, Unicode, was established to overcome this problem. Unicode provides a unique identifier for each character, independent of platform, program, and language. The Unicode standard has been adopted by a number of industry leaders, such as Apple, HP, IBM, JustSystem, Microsoft, Oracle, SAP, Sun, Sybase, and Unisys. In addition, language standards and modern information exchange standards such as XML, Java, ECMAScript (JavaScript), LDAP, CORBA 3.0, and WML use Unicode as the official way to implement ISO/IEC 10646. This research addresses four tasks for the Balinese script: algorithms for transliteration, searching, sorting, and word boundary analysis (spell checking). To verify the correctness of the algorithms, several applications were built. These applications run on Linux and Windows using J2SDK 1.5 and the J2ME WTK2 library. The input and output of the algorithms and applications are character sequences obtained from keyboard input and external files. This research produces a module or library that can process Balinese text based on the Unicode standard. The outcomes of this research are the ability, skill, and mastery of: (1) the Unicode standard (21-bit) as a replacement for ASCII (7-bit) and ISO 8859-1 (8-bit), the former default character sets in many applications; (2) the Balinese Unicode text processing algorithms; and (3) the experience of working with and learning from an international team that consists of the foremost experts in the area: Michael Everson (Ireland), Peter Constable (Microsoft US), I Made Suatjana, and Ida Bagus Adi Sudewa.

Keywords


Unicode, transliteration, searching, sorting, word boundary analysis, canonical combining class, normalization, Unicode Collation Element
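
As an illustration of two of the concepts named above, normalization and the canonical combining class, the following is a minimal sketch in Python using only the standard `unicodedata` module. It is not the authors' implementation (the abstract describes a Java library); the helper names `is_balinese`, `describe`, and `canonically_equal` are hypothetical and chosen for this example. It assumes a Python build whose Unicode database includes the Balinese block (U+1B00 to U+1B7F), which holds for any current Python 3 release.

```python
import unicodedata

# Balinese block in the Unicode Standard (assumed range: U+1B00..U+1B7F).
BALINESE_START, BALINESE_END = 0x1B00, 0x1B7F

# Largest Unicode code point; it fits in 21 bits, versus 7 for ASCII.
MAX_CODE_POINT = 0x10FFFF


def is_balinese(ch: str) -> bool:
    """Return True if the single character lies in the Balinese block."""
    return BALINESE_START <= ord(ch) <= BALINESE_END


def describe(text: str):
    """List (code point, character name, canonical combining class) per character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), unicodedata.combining(ch))
        for ch in text
    ]


def canonically_equal(a: str, b: str) -> bool:
    """Compare two strings after canonical decomposition (NFD).

    A search routine can normalize both the pattern and the text this way
    so that precomposed and decomposed spellings match each other.
    """
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


if __name__ == "__main__":
    precomposed = "\u1B06"        # one code point for the long vowel letter
    decomposed = "\u1B05\u1B35"   # base letter followed by a vowel sign

    print(MAX_CODE_POINT.bit_length())            # number of bits needed
    print(precomposed == decomposed)              # raw comparison
    print(canonically_equal(precomposed, decomposed))
    for row in describe("\u1B13\u1B34\u1B44"):    # letter, nukta-like sign, virama
        print(row)
```

Comparing strings only after normalization is one common way to make searching insensitive to equivalent spellings; sorting would additionally need collation weights, which the standard library does not provide and which are not sketched here.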











Copyright (c) 2006 IJCCS - Indonesian Journal of Computing and Cybernetics Systems

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.



Copyright of :
IJCCS (Indonesian Journal of Computing and Cybernetics Systems)
ISSN 1978-1520 (print); ISSN 2460-7258 (online)
is a scientific journal publishing research results in Computing and Cybernetics Systems.
A publication of IndoCEISS.
Gedung S1 Ruang 416 FMIPA UGM, Sekip Utara, Yogyakarta 55281
Fax: +62274 555133
email:ijccs.mipa@ugm.ac.id | http://jurnal.ugm.ac.id/ijccs


