Identification of Incung Characters (Kerinci) to Latin Characters Using Convolutional Neural Network

Abstract
The Incung script is a legacy of the Kerinci people of Kerinci Regency, Jambi Province. On October 17, 2014, the Ministry of Education and Culture designated the Incung script as an intangible cultural heritage of Jambi Province. In practice, however, the script is almost extinct in society. This study aims to recognize characters of the Incung (Kerinci) script and output the corresponding Latin characters. The classification method used is the Convolutional Neural Network (CNN). The dataset consists of 1400 Incung character images divided into 28 classes. Experiments were conducted to find the best-performing model. With the best hyperparameters found in testing (batch size 32, 100 epochs, and the Adam optimizer), the CNN reached 99% accuracy on the training data and 91% on the testing data. The model was then evaluated on 80 word images (each combining several characters) across 4 test scenarios. The results show that the model can recognize images scanned from printed books, digitally written test data, images containing more than two characters, and images with different font sizes.
Keywords: Neural Network, Deep Learning, Image, Pattern Recognition.


INTRODUCTION
The Incung script is a heritage script that reflects the cultural diversity of the Kerinci civilization in Kerinci Regency, Jambi Province. In the past, the Incung script was used to write literature, customary law, and spells on media such as bark, cow horn, buffalo horn, bamboo, and palm leaves [1]. The Incung (Kerinci) script is the only script from central Sumatra, and its existence needs to be preserved. On October 17, 2014, the Ministry of Education and Culture designated the Incung script as an intangible cultural heritage of Jambi Province. In practice, however, the Incung script is almost extinct in society. This is evident from how few people in Kerinci know and understand the Incung script as a cultural identity.
One way to preserve the Incung script is through a technological approach, in this case by recognizing Incung character patterns using deep learning for image processing. Pattern recognition is a branch of artificial intelligence that lets computers process and recognize character patterns that humans can recognize [2].
Deep learning is currently the technology with the strongest capabilities for object classification in images and for pattern recognition [3], and it is at the heart of the current rise of artificial intelligence [4]. Deep learning is a form of machine learning that allows computers to learn from experience and understand commands based on given concepts [5]. It has changed the paradigm of pattern recognition research, in which feature extraction and classification were previously carried out as separate methods [6].
Character pattern recognition has been studied for various purposes and with different methods. Research on the Javanese script [7] compared the Convolutional Neural Network (CNN) and Deep Neural Network (DNN) methods, obtaining an accuracy of 70.22% for CNN and 64.65% for DNN. Research by [8] compared CNN with the MLP method, where CNN produced an accuracy of 89% while MLP reached only 62%. Research by [9] on handwritten hiragana and katakana characters obtained an accuracy of 87.82% using CNN and 88.21% using a combination of CNN and SVM. Research by [10] on recognizing Javanese characters in the form of words obtained 99.62% accuracy in CNN training and 90% accuracy in segmentation using a combination of projection profile and connected component labeling. Testing on 20 new word images (each combining several characters) obtained 80% accuracy.
In the previous research on character pattern recognition reviewed above, the Convolutional Neural Network (CNN) method produced the higher accuracy. The CNN method tries to imitate the image recognition system in the human visual cortex so that it can process image information as humans do [11] [12]. In addition, CNN is a deep learning method that has driven the rapid development of computer vision [13]. CNN has also been described as the best model for object recognition, pattern recognition [12], and face identification [14].

Dataset
This study uses a dataset containing training and testing images. The data were collected in two ways: collecting Incung script data from books [15] [16], and creating a digital image dataset of the Incung script based on reference sources and validated by the Siginjei Jambi Museum and the Incung School Community.
The data obtained from books are first scanned to produce digital images. The scans are then manually cropped into individual characters (one letter each) so that the data are ready to be processed. This yielded 425 scanned images from books and 975 digitally written images, so the dataset contains 1400 images in total.

Data Preprocessing
In the data preprocessing stage, each image is resized to 45 x 45 pixels and converted to grayscale, which reduces computation time because the image matrix is processed in a single channel [17]. After preprocessing, the 1400 images are randomly divided into 80% (1120 images) for training and 20% (280 images) for testing [18]. The training set is larger because it is the data used to fit the model, while the testing set is used to measure how well the model performs on new data (data that does not appear in the training set).
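The 80/20 random split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_dataset`, the seed value, and the use of integer IDs in place of image arrays are assumptions, and the paper does not say whether the split was stratified by class.

```python
import random


def split_dataset(items, train_fraction=0.8, seed=42):
    """Shuffle items reproducibly and split them into train and test lists."""
    shuffled = list(items)
    # A local Random instance keeps the shuffle reproducible without
    # touching the global random state. The seed is an arbitrary choice.
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Stand-in for the 1400 preprocessed images: integer IDs instead of pixels.
image_ids = range(1400)
train_ids, test_ids = split_dataset(image_ids)
print(len(train_ids), len(test_ids))  # 1120 280
```

A plain random split like this one does not guarantee that each of the 28 classes keeps the same 80/20 ratio; a stratified split would be needed for that.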

Data Analysis Technique
The data analysis technique used in this study is the Convolutional Neural Network algorithm. Figure 2 shows the workflow of the data analysis technique. Hyperparameters determine the overall structure of the network, but there is no standard approach for identifying the optimal hyperparameter configuration [19]. For this reason, hyperparameter testing is carried out by training the model under different scenarios and comparing the results to find the configuration with the best accuracy. The hyperparameters tested in this study are:
1. Batch size: 32, 64, and 128.
2. Number of epochs: 10 to 100.
3. Optimizer: Adam, Adamax, and RMSProp.
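The test scheme above amounts to a grid over three hyperparameters, which can be enumerated as in the sketch below. The step of 10 between epoch values is an assumption (the text only gives the range 10 to 100), and the dictionary keys are illustrative names rather than the authors' own.

```python
from itertools import product

batch_sizes = [32, 64, 128]
epoch_counts = list(range(10, 101, 10))  # assumption: 10, 20, ..., 100
optimizers = ["Adam", "Adamax", "RMSProp"]

# Every combination of the three hyperparameters is one training scenario.
grid = [
    {"batch_size": b, "epochs": e, "optimizer": o}
    for b, e, o in product(batch_sizes, epoch_counts, optimizers)
]

print(len(grid))  # 90 scenarios under the assumed epoch step
print(grid[0])    # {'batch_size': 32, 'epochs': 10, 'optimizer': 'Adam'}
```

Each entry would be passed to a training run, and the scenario with the best test accuracy kept.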

Convolutional Neural Network
Table 1 shows the structure of the CNN model. The network consists of an input layer, three convolutional layers, three pooling layers, two fully connected layers, and an output layer. The input layer takes an image of 45x45 pixels. The input is processed in the first convolutional layer using 2D convolution with 16 filters, a 3x3 kernel, ReLU activation, and "same" padding, so that the output of the convolution keeps the 45x45 size of the input [20]; this is followed by 2x2 max pooling. The second layer uses 2D convolution with 96 filters and the third uses 2D convolution with 128 filters, both with the same settings as the first layer. Next, a flatten layer converts the feature maps into a one-dimensional vector [21], which is the input to a dense (fully connected) layer at the end of the network [22] with 480 units and ReLU activation, followed by dropout with a rate of 0.18. A second dense layer follows with 64 units, ReLU activation, and a dropout rate of 0.06. The last layer is a dense layer with 28 units and softmax activation, one unit per class.
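To make the layer description concrete, the sketch below traces the tensor shape through the network and counts trainable parameters using only arithmetic. It assumes a single grayscale input channel, stride-1 convolutions with "same" padding, 2x2 pooling with stride 2 and no padding (odd sizes are rounded down), and bias terms in every layer; these defaults are assumptions where the text does not state them, and the helper names are illustrative.

```python
def conv_same(shape, filters):
    """'same' padding with stride 1 keeps height and width unchanged."""
    height, width, _ = shape
    return (height, width, filters)


def max_pool(shape, size=2):
    """Pooling with stride equal to size and no padding (floor division)."""
    height, width, channels = shape
    return (height // size, width // size, channels)


def conv_params(in_channels, filters, kernel=3):
    """Weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return inputs * units + units


shape = (45, 45, 1)  # 45x45 grayscale input
total_params = 0
block_shapes = []
for filters in (16, 96, 128):
    total_params += conv_params(shape[2], filters)
    shape = max_pool(conv_same(shape, filters))
    block_shapes.append(shape)

flattened = shape[0] * shape[1] * shape[2]
features = flattened
for units in (480, 64, 28):  # dropout layers add no parameters
    total_params += dense_params(features, units)
    features = units

print(block_shapes)   # [(22, 22, 16), (11, 11, 96), (5, 5, 128)]
print(flattened)      # 3200
print(total_params)   # 1693884
```

Under these assumptions almost all parameters sit in the first dense layer (3200 x 480 weights), which is where the dropout of 0.18 is applied.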

Design of Image Testing
At the image testing stage, the model is tested with different data to determine whether it performs well in classifying Incung script images. Figure 3 shows the image testing process, which is carried out in several stages. The first stage is to load the previously created model. Next, the image to be tested is entered in the form of a word (a combination of several characters). The input image is then segmented, that is, cut into separate characters so that each character can be classified by the CNN model. The last stage is prediction (image classification), whose results are output as the Latin characters corresponding to the Incung script.

Test Results using Hyperparameter
Hyperparameter testing is done to compare which model produces the best accuracy. The hyperparameters tested are the batch size, the number of epochs, and the optimizer. This test is important for deciding which values to use in the model. During training, accuracy and loss differ only slightly across configurations; when the models are tested, however, the differences between hyperparameter settings are more pronounced. Table 2 summarises the results of hyperparameter testing. The epoch tests, run from 10 to 100 epochs, showed the best accuracy at epoch 100. The exception was the experiment using the Adam optimizer with batch size 128, which obtained its highest accuracy at epoch 90 and decreased at epoch 100.
The accuracy values show that the best optimizer was Adam, with an accuracy of 0.995 at batch size 32, 0.984 at batch size 64, and 0.917 at batch size 128. The best batch size was therefore 32; in these experiments, the smaller the batch size, the higher the accuracy obtained.

Training Result
After the CNN training process and the hyperparameter testing used to determine the best model, the training and testing accuracy were obtained and plotted as a graph. In this graph, the yellow curve shows the testing results, that is, the results on the data used to measure how well the model performs on new data. The testing accuracy reaches 0.9179, or 91%, on the 280 testing images.

Confusion Matrix
The results on the testing data are then summarised in a confusion matrix, shown in the following figure. Consistent with the accuracy graph from model training, the accuracy on the testing data is 91.79%. The precision, recall, and F1-score values in Table 3 are calculated from the confusion matrix. According to Table 3, precision, the proportion of the model's predictions for a class that are correct, is 0.93. Recall, the proportion of the true instances of a class that the model retrieves, is 0.92, and the F1-score, a single measure that combines recall and precision, is 0.92.
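As an illustration of how these metrics follow from a confusion matrix, the sketch below computes per-class precision, recall, and F1-score and then averages them across classes. The 3-class matrix is made-up toy data, not the paper's 28-class result, and the unweighted (macro) average is an assumption, since the text does not say which averaging Table 3 uses. Rows are taken as true classes and columns as predicted classes.

```python
def per_class_metrics(cm):
    """Return (precision, recall, f1) for each class of a square matrix."""
    size = len(cm)
    metrics = []
    for k in range(size):
        tp = cm[k][k]
        fp = sum(cm[row][k] for row in range(size)) - tp  # predicted k, wrong
        fn = sum(cm[k]) - tp                              # true k, missed
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append((precision, recall, f1))
    return metrics


def accuracy(cm):
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / sum(sum(row) for row in cm)


def macro_average(metrics):
    """Unweighted mean of each metric over the classes."""
    count = len(metrics)
    return tuple(sum(m[i] for m in metrics) / count for i in range(3))


# Toy 3-class confusion matrix (hypothetical values for illustration only).
toy_cm = [
    [9, 1, 0],
    [0, 8, 2],
    [1, 0, 9],
]
class_metrics = per_class_metrics(toy_cm)
macro_precision, macro_recall, macro_f1 = macro_average(class_metrics)
print(round(accuracy(toy_cm), 4))
print(round(macro_precision, 4), round(macro_recall, 4), round(macro_f1, 4))
```

With 28 classes the same functions apply unchanged; only the matrix size differs.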

Image Test
In model testing, the first step is to load the previously saved trained model. The image to be tested is then given as input, in this case an Incung script image that reads "NAMA", as shown in Figure 6.
Figure 6 New test data
The input in Figure 6 then enters the segmentation (character separation) process, which uses a vertical projection profile. The key step in the projection profile method is to obtain a histogram of the image, shown in Figure 7. The segmentation process checks each column of the image for white pixels: a column in which white is found is taken as the starting boundary of a character cut, and a column in which only black is found is taken as the ending boundary. The results of character segmentation using the vertical projection profile are shown in Figure 8, which separates the characters in order from left to right:
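The column-scanning idea behind the vertical projection profile can be sketched as below. This is a minimal version under stated assumptions: the image is already binarised into a list of rows where 1 marks a character pixel and 0 marks background, characters are separated by at least one empty column, and the function names are illustrative rather than taken from the authors' implementation.

```python
def vertical_projection(binary):
    """Count character pixels (value 1) in each column of the image."""
    return [sum(column) for column in zip(*binary)]


def segment_columns(binary):
    """Return (start, end) column spans of each character, end exclusive."""
    profile = vertical_projection(binary)
    spans = []
    start = None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                 # first non-empty column: character begins
        elif count == 0 and start is not None:
            spans.append((start, x))  # first empty column: character ends
            start = None
    if start is not None:             # character touching the right edge
        spans.append((start, len(profile)))
    return spans


def crop(binary, span):
    """Cut one character out of the image using a column span."""
    left, right = span
    return [row[left:right] for row in binary]


# Toy 3x7 image with two "characters" separated by empty columns.
toy_image = [
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 1],
    [0, 1, 1, 0, 0, 0, 1],
]
profile = vertical_projection(toy_image)
spans = segment_columns(toy_image)
print(profile)  # [0, 3, 2, 0, 0, 2, 2]
print(spans)    # [(1, 3), (5, 7)]
```

Each cropped span would then be resized to 45 x 45 pixels before being passed to the classifier. Characters that touch or overlap horizontally would not be separated by this simple scan.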

Figure 8 Segmentation Results using Projection Profile
After the characters have been separated, as shown in Figure 8, the next step is to classify each of them using the CNN. The test data produce the array ['NA', 'MA']. When the tested image contains only one character, classification skips the cropping (character separation) step. The test results are displayed in Figure 9.
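The step from per-character classifier outputs to a Latin string can be sketched as follows. The label list is a hypothetical three-class subset (the real model has 28 classes whose order is not given in the text), and the probability rows are made-up softmax outputs used only to show the decoding.

```python
def decode_predictions(probabilities, labels):
    """Map each row of class probabilities to the label with the highest score."""
    decoded = []
    for row in probabilities:
        best_index = max(range(len(row)), key=row.__getitem__)
        decoded.append(labels[best_index])
    return decoded


# Hypothetical subset of class labels, in the order of the model's outputs.
labels = ["KA", "NA", "MA"]

# Made-up softmax outputs for two segmented characters, left to right.
probabilities = [
    [0.10, 0.80, 0.10],
    [0.05, 0.15, 0.80],
]

characters = decode_predictions(probabilities, labels)
word = "".join(characters)
print(characters)  # ['NA', 'MA']
print(word)        # NAMA
```

Because segmentation returns characters from left to right, joining the decoded labels in order gives the transliterated word.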

Graphical User Interface
A graphical user interface (GUI) makes it easier to test images for Incung script recognition. In this research, the GUI was built as a web application using the Django framework. The system has three pages: the home page, which contains a description of the system; the Incung Script page, which displays images of Incung characters and examples of their use in words; and the Incung script recognition page, the core page, which is used to test the previously created model on new Incung script images.

Evaluation Model
After the CNN modeling process produced the best accuracy value, the model was validated under several scenarios. The evaluation uses four different scenarios and 80 different test images, 20 per scenario. The purpose of this testing is to determine how effective and accurate the model is on new data. Table 7 shows sample tests on images containing more than two characters. This test was done to see whether the system can separate and classify more than two Incung characters.

CONCLUSIONS
This research has succeeded in building a CNN model for testing Incung script images. Hyperparameter testing was performed to obtain the best CNN model by varying the batch size, number of epochs, and optimizer across multiple scenarios by trial and error. The best results were obtained with batch size 32, 100 epochs, and the Adam optimizer, which gave an accuracy of 99% on the training data and 91% on the testing data. Evaluation of the model using test data in the form of Incung script words (combined characters) gave good recognition results. The model can recognize images scanned from printed books, digitally written test data, images containing more than two characters, and images with various letter sizes. This shows that a model built with the Convolutional Neural Network method can perform Incung script recognition well.