Implementation of Genetic Algorithms and Momentum Backpropagation in Classification of Subtype Cells Acute Myeloid Leukimia

Acute Myeloid Leukemia (AML) is a type of cancer that attacks white blood cells of the myeloid lineage. AML subtypes M1, M2, and M3 all involve the same type of cell, the myeloblast, so classifying them requires more detailed analysis. Momentum Backpropagation is used for this classification. In practice, the architecture, learning rate, and momentum are still selected by random trial, which is one of the disadvantages of Momentum Backpropagation. This study uses a genetic algorithm (GA) as an optimization method to find the best architecture, learning rate, and momentum for the artificial neural network (ANN). Genetic algorithms are optimization techniques that emulate the process of biological evolution. The dataset used in this study consists of numerical features obtained from the segmentation of white blood cell images in a previous study by Nurcahya Pradana Taufik Prakisya. Based on these data, Momentum Backpropagation with parameters selected by random trial was evaluated against Momentum Backpropagation with parameters selected by the genetic algorithm. The accuracy values were then compared to determine whether the GA offers an alternative ANN learning method that gives more accurate results on the data used in this study. The results showed that training and testing with genetic algorithm optimization of the ANN parameters gave an average memorization accuracy of 83.38% and a validation accuracy of 94.3%, whereas training and testing with momentum backpropagation by random trial gave an average memorization accuracy of 76.09% and a validation accuracy of 88.22%.

Keywords: Acute Myeloid Leukemia (AML), Neural Network, Momentum Backpropagation, Genetic Algorithm

IJCCS Vol. 14, No. 2, April 2020: 189-198. ISSN (print): 1978-1520, ISSN (online): 2460-7258


INTRODUCTION
Acute myeloid leukemia (AML) is characterized by an increase in the number of myeloid cells in the marrow and an arrest in their maturation, frequently resulting in hematopoietic insufficiency (granulocytopenia, thrombocytopenia, or anemia), with or without leukocytosis [1]. Historically, five-year survival rates were less than 15 percent. Over the past decade, refinements in the diagnosis of subtypes of AML and advances in therapeutic approaches have improved the outlook for patients with AML. Despite these improvements, however, the survival rate among patients who are less than 65 years of age is only 40 percent. The primary cause of this cancer is still unknown. Moreover, weakness, fever, tiredness, and pain in joints or bones are symptoms of AML that also occur in other common ailments. Since the cancer is acute, it is all the more important to detect it in its early stages. Thus, it is very important to have a system that can detect AML accurately [2].
Artificial neural networks (ANNs) offer flexibility, competence, and the capability to simplify and solve problems in pattern classification, function approximation, pattern matching, and associative memories [3]. ANNs can approximate arbitrary nonlinear functions and process information in ways that other methods cannot. Different techniques have been used in the past to train ANNs for optimal network performance, such as the backpropagation neural network (BPNN) algorithm. However, the BPNN algorithm suffers from two major drawbacks: a low convergence rate and instability. These drawbacks are caused by the risk of being trapped in a local minimum and the possibility of overshooting the minimum of the error. Another weakness arises in applications with high difficulty and complex combinations of criteria, namely learning speed, size, generalization ability, and resistance to noisy data; as size and complexity increase, artificial neural networks become trapped in local optima [4].
The combination of architectural parameters, initial weights, and initial biases largely determines the learning ability of an artificial neural network [5]. So far there is no standard rule for the optimal number of hidden layers in an ANN used for prediction. Each case requires a different number of hidden layers to reach the optimal solution. In addition to the parameters mentioned above, increasing the learning rate can speed up training toward convergence but can also degrade predictive performance, especially the precision in each class. Trial and error is usually used to find the parameter values that give the highest accuracy, precision, sensitivity, and specificity when training an ANN [6].
Evolutionary computation is often used to train the parameters of neural networks. In recent years, many improved learning algorithms have been proposed to overcome the weaknesses of gradient-based techniques. Genetic algorithms are widely used for optimization problems in artificial neural networks, for example in research on the diagnosis of breast cancer [7]. The same approach has been applied to detecting skin cancer. The combination of genetic algorithms and artificial neural networks can also give better accuracy than previously used methods [8]. Genetic algorithms can be hybridized with other algorithms. For example, gradient-based methods can be used to enhance the performance of genetic algorithms. The global search capability of genetic algorithms ensures a high probability of finding the global optimum, whereas derivatives or local information can be used to speed up local search. In this way, genetic algorithms can be combined with various other algorithms to exploit the advantages of both [9].

METHODS
In this section, the proposed method is explained in detail. This includes the data description, data preprocessing, and the method for classifying the cell subtypes myeloblast, promyelocyte, monoblast, and support cell using a hybrid of genetic algorithms and backpropagation.

Subtype cells AML (Myeloblast, Promyelocyte, Monoblast, and Support)
Acute Myeloid Leukemia (AML) is a rapidly developing type of cancer that attacks blood cells and the bone marrow. The method most often used in AML classification was developed by the French-American-British (FAB) group and classifies AML into nine subtypes, namely M0, M1, M2, M3, M4, M4Eo, M5, M6, and M7 [1]. The characteristics of each cell are shown in Table I. One such characteristic is an atypical granular promyelocyte with cytoplasm filled with Auer rods.

Data Description
The data in this study come from previous research [10]. The data are discrete values of features extracted from segmented images. Six features are used: area, edge area (perimeter), roundness, nucleus ratio, mean, and standard deviation.

Preprocessing Data (Data Normalization)
The classification data used in this study are numerical data consisting of six feature parameters, namely area, edge area, cell roundness, nucleus ratio, mean, and standard deviation. The data are normalized because the extracted features have widely varying value ranges: area and edge area take integer values, roundness and nucleus ratio are real numbers between 0 and 1, and mean and standard deviation are real numbers with values ranging up to 255. All feature data are normalized before entering the training process using (1):

$$ x'_i = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}} \tag{1} $$

where $x_i$ is the i-th value before normalization, $x'_i$ is the i-th value after normalization, $x_{\min}$ is the minimum value of the data, and $x_{\max}$ is the maximum value of the data.
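The normalization step can be illustrated with a minimal Python sketch. The function name `min_max_normalize`, the handling of a constant feature, and the sample values are assumptions made for illustration; they are not taken from the study.

```python
def min_max_normalize(values):
    """Scale a list of numbers to the range [0, 1] with min-max normalization."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature is mapped to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical feature column (for example, cell area); not data from the study.
areas = [1200, 1500, 1800, 2400]
print(min_max_normalize(areas))  # [0.0, 0.25, 0.5, 1.0]
```

Each feature column would be normalized separately, so that integer-valued features such as area end up on the same scale as ratio features such as roundness.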

Backpropagation
The very general nature of the backpropagation training method means that a backpropagation net (a multilayer, feedforward net trained by backpropagation) can be used to solve problems in many areas [11]. In this study, the backpropagation model is organized into three kinds of layers, namely the input layer, hidden layer, and output layer; Figure 1 shows the architecture.

Figure 1 Backpropagation architecture

ANN modeling is divided into two stages, training and testing. The first stage is training, in which the initial structure of the ANN is formulated. Subsequently, the validation stage ensures the accuracy of the final model. In addition, the datasets are distributed using k-fold cross validation with k = 3.
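As a rough illustration of the two ingredients mentioned here, the sketch below shows the standard momentum weight update (the new change in a weight is the negative learning rate times the gradient plus the momentum times the previous change) and a simple 3-fold split of sample indices. The function names, the toy error function, and the interleaved fold assignment without shuffling are assumptions for illustration; the paper does not state how folds were assigned.

```python
def momentum_step(weights, grads, prev_deltas, learning_rate, momentum):
    """One weight update: delta = -learning_rate * grad + momentum * previous delta."""
    deltas = [-learning_rate * g + momentum * d
              for g, d in zip(grads, prev_deltas)]
    new_weights = [w + d for w, d in zip(weights, deltas)]
    return new_weights, deltas


def k_fold_indices(n_samples, k=3):
    """Return (train_indices, test_indices) pairs for k folds.

    Assumption: samples are assigned to folds in an interleaved way, unshuffled.
    """
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = sorted(idx for j, fold in enumerate(folds) if j != i
                       for idx in fold)
        splits.append((train, test))
    return splits


# Toy example: minimize E(w) = w^2, whose gradient is 2w.
w, delta = [1.0], [0.0]
for _ in range(3):
    w, delta = momentum_step(w, [2 * w[0]], delta,
                             learning_rate=0.1, momentum=0.9)

for train, test in k_fold_indices(7, k=3):
    print("train:", train, "test:", test)
```

The momentum term reuses part of the previous weight change, which is why the learning rate and momentum interact and are tuned together.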

Genetic Algorithm
There are many advantages of genetic algorithms over traditional optimization algorithms [9]. GA is a method for solving both constrained and unconstrained optimization problems. The key concept of the GA mechanism is based on natural selection, the process that drives biological evolution. The method begins with a set of individuals, called the initial population. GA repeatedly modifies a population of individual solutions. At each generation, GA selects individuals from the current population to be parents and uses them to produce the children for the next generation. Over successive generations, the population evolves toward an optimal solution. As shown in Figure 2, an individual encodes a structure of one to three hidden layers, and the number of neurons in each layer can differ, within a range of 0 to 100 units. The chromosome is divided into three parts: one gene for the learning rate, one gene for the momentum, and one to three variable genes that indicate the number of neurons in each hidden layer. The third part of the chromosome is denormalized to obtain the number of neurons in each hidden layer. This denormalization uses formula (2), whose terms are the normalization data and chromosome i.
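The chromosome layout can be sketched in Python as follows. Because formula (2) is not legible in this text, the linear mapping from a gene in [0, 1] to a neuron count in [0, 100] is an assumption, as are the function name `decode_chromosome`, the rule that a gene decoding to 0 neurons means the layer is absent, and the example values.

```python
MAX_NEURONS = 100  # upper bound of the neuron range stated in the text


def decode_chromosome(chromosome):
    """Decode [learning_rate, momentum, gene_1, ..., gene_n] into ANN settings.

    Assumption: each layer gene lies in [0, 1] and is denormalized linearly
    to a neuron count in [0, MAX_NEURONS]; this stands in for formula (2).
    """
    learning_rate, momentum, *layer_genes = chromosome
    hidden_layers = [round(g * MAX_NEURONS) for g in layer_genes]
    # Assumption: a layer that decodes to 0 neurons is treated as absent.
    hidden_layers = [n for n in hidden_layers if n > 0]
    if not hidden_layers:
        hidden_layers = [1]  # keep at least one hidden layer
    return {
        "learning_rate": learning_rate,
        "momentum": momentum,
        "hidden_layers": hidden_layers,
    }


# Hypothetical individual with three layer genes, one of which is switched off.
print(decode_chromosome([0.05, 0.9, 0.25, 0.0, 0.6]))
```

Under these assumptions the example individual describes a network with two hidden layers of 25 and 60 neurons.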

Fitness Function or Evaluation Function
The objective of the genetic algorithm in this study is to minimize errors. The fitness value states how well an individual solves the defined problem. The fitness function is used to identify which individual produces the smallest error value. It is calculated from the mean of the squared errors of the system, commonly called the MSE (Mean Squared Error). The optimal value is obtained at the smallest MSE, which corresponds to the greatest fitness value. For this optimization problem, the fitness function is shown in (3).

Fitness =
Based on the results of the fitness function, the learning rate, momentum, number of hidden layers, and number of neurons in each hidden layer that correspond to the best individual are obtained.
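The relationship between MSE and fitness can be illustrated with a short Python sketch. The exact form of (3) is not legible in this text; the reciprocal `1 / MSE` used here is a common choice and is an assumption, as are the small constant `eps` and the sample values.

```python
def mean_squared_error(targets, outputs):
    """Mean of the squared differences between targets and network outputs."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)


def fitness(targets, outputs, eps=1e-12):
    """Assumed fitness: the reciprocal of the MSE, so a smaller error scores higher.

    eps is a hypothetical guard against division by zero for a perfect fit.
    """
    return 1.0 / (mean_squared_error(targets, outputs) + eps)


# Hypothetical targets and outputs; not results from the study.
targets = [1.0, 0.0, 1.0]
good = [0.9, 0.2, 0.8]
poor = [0.6, 0.5, 0.4]
print(fitness(targets, good) > fitness(targets, poor))  # True
```

With any such inverse relationship, selecting for high fitness is equivalent to selecting for low training error.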
In our GA-based framework, we employ the following methods: Roulette Wheel Selection, Whole Arithmetic Crossover, flip mutation, and Elitism.
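A minimal Python sketch of these four operators is given below. It assumes real-valued genes in [0, 1], fixed-length chromosomes, and positive fitness values. The source does not specify how flip mutation acts on real-valued genes, so replacing a gene g with 1 - g is one possible reading and is an assumption, as are the mixing weight `alpha`, the single elite individual, and all function names.

```python
import random


def roulette_wheel_select(population, fitnesses, rng=random):
    """Pick one individual with probability proportional to its fitness."""
    pick = rng.uniform(0, sum(fitnesses))
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if pick <= running:
            return individual
    return population[-1]  # guard against floating-point round-off


def whole_arithmetic_crossover(parent_a, parent_b, alpha=0.5):
    """Blend every gene of both parents with weight alpha."""
    child_a = [alpha * a + (1 - alpha) * b for a, b in zip(parent_a, parent_b)]
    child_b = [alpha * b + (1 - alpha) * a for a, b in zip(parent_a, parent_b)]
    return child_a, child_b


def flip_mutation(chromosome, pm, rng=random):
    """Assumed real-valued flip: replace gene g by 1 - g with probability pm."""
    return [1.0 - g if rng.random() < pm else g for g in chromosome]


def next_generation(population, fitnesses, pc, pm, rng=random):
    """Build a new population, copying the best individual unchanged (elitism)."""
    best = max(range(len(population)), key=lambda i: fitnesses[i])
    offspring = [list(population[best])]
    while len(offspring) < len(population):
        a = roulette_wheel_select(population, fitnesses, rng)
        b = roulette_wheel_select(population, fitnesses, rng)
        if rng.random() < pc:
            a, b = whole_arithmetic_crossover(a, b)
        for child in (a, b):
            if len(offspring) < len(population):
                offspring.append(flip_mutation(child, pm, rng))
    return offspring


# Hypothetical population of [learning_rate, momentum, layer_gene] chromosomes.
rng = random.Random(0)
population = [[rng.random() for _ in range(3)] for _ in range(4)]
fitnesses = [1.0, 4.0, 2.0, 3.0]
print(next_generation(population, fitnesses, pc=0.7, pm=0.01, rng=rng))
```

Here `pc` and `pm` play the roles of the crossover probability Pc and mutation probability Pm examined in the experiments.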

Termination criteria
In this enhanced GA Backpropagation approach, the learning (evolutionary) process is terminated when the number of fitness evaluations reaches its maximum count.
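The stopping rule can be sketched as a loop that counts fitness evaluations. To keep the example short, plain random sampling stands in for the GA step, which is a deliberate simplification; the function name `run_with_budget`, the toy fitness function, and the budget of 50 evaluations are assumptions.

```python
import random


def run_with_budget(evaluate, sample, max_evaluations):
    """Keep the best candidate seen until the evaluation budget is spent."""
    best, best_fit, evaluations = None, float("-inf"), 0
    while evaluations < max_evaluations:  # termination criterion
        candidate = sample()
        fit = evaluate(candidate)
        evaluations += 1
        if fit > best_fit:
            best, best_fit = candidate, fit
    return best, best_fit, evaluations


# Toy fitness with its maximum at x = 0.3; not the fitness used in the study.
rng = random.Random(0)
best, best_fit, used = run_with_budget(
    evaluate=lambda x: -(x - 0.3) ** 2,
    sample=rng.random,
    max_evaluations=50,
)
print(used)  # 50
```

Counting evaluations rather than generations keeps the cost comparable across runs with different population sizes.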

Process
In this study, a model architecture is constructed from these data, and Momentum Backpropagation with parameters chosen by random trial is evaluated against parameter selection by the genetic algorithm. The genetic algorithm is used to determine the number of hidden layers, the number of neurons, the learning rate, and the momentum of the backpropagation neural network that classifies myeloblast, promyelocyte, monoblast, and support cells. The expected output is the best classification result with the smallest MSE. The best classification result is found by testing on the amount of data as well as on the genetic parameters: population size, crossover probability, and mutation probability.

Figure 3 Flowchart diagram for Myeloblast, Promyelocyte, Monoblast, and Support with Genetic Algorithm
The first step is to specify the input data, which are then preprocessed using Min-Max (0-1) normalization. Both the training data and the test data are normalized to simplify calculation. The next step is to determine the type of kernel and the values of the parameters to be used. In the parameter initialization step, the user chooses whether to determine the architecture manually or with the genetic algorithm. This choice is the focus of the research: the results of the two methods are compared in terms of accuracy to see whether the GA provides better learning ability than the standard method without GA modification. When parameters are selected with the GA, the population is initialized by generating individuals with real-valued chromosomes. Fitness values are then calculated from the accuracy of the ANN training process that represents each individual in the generated population. Individuals that have reached the maximum epoch are stored for later use in training, and the result is compared with momentum backpropagation whose parameters are generated by random trial without GA. Finally, the test data are loaded and used in the testing process. The overall system design is depicted in the flowchart in Figure 3.

RESULTS AND DISCUSSION
The results are derived from experiments performed on each genetic parameter. Each parameter is tested within the constraints defined for its value. These tests yield the best parameter values for each fitness, based on the smallest MSE value. Several tests were conducted on the system, including tests of the learning process of the genetic algorithm, which cover the influence of genetic parameters on fitness values, and a comparative analysis of the resulting accuracy against momentum backpropagation without GA. Figure 4 (a) plots the effect of population size (popsize) on the fitness value. The first experiment used population size = 10, Pc = 0.5, and Pm = 0.01. The subsequent experiments increased the population size while keeping the other parameter values fixed. In this test, the best fitness value was 62.5, achieved with a population size of 80. Figure 4 (b) plots the effect of changing the crossover probability.

The influence of genetic parameters on fitness values
As the test results show, the best fitness value of 100.4 occurs at a Pc value of 0.7. The experiments were carried out using the best population size from the previous observations and an initial Pc value; subsequent Pc values are determined by the user. In this discussion, the influence of the Pc value on the fitness value is observed. The best population size obtained previously is 80. The first experiment used a population size of 80, Pc 0.1, and Pm 0.01, followed by experiments in which Pc was increased in constant steps. Figure 4 (c) plots the effect of the mutation probability Pm, using the parameters from the previous observations, namely a population of 80 and a Pc of 0.7. Subsequent Pm values are determined by the user. The first experiment used a Pm value of 0.01, a population of 80, and a Pc of 0.7. Subsequent experiments increased Pm in regular steps with a fixed population and Pc.

Comparative Analysis of Resulting Accuracy
Ten runs of momentum backpropagation tested on the training data produced an average accuracy of 76.09%. These results were obtained with randomized parameters, guided by the likely best parameters from the previous discussion. The highest accuracy obtained in testing on the training data was 84.06%.
In addition to the tests above, experiments on the testing data were conducted using k-fold cross validation with k = 3, so each test was run three times, once per fold. The test results were combined to calculate the confusion matrix and its predictive analysis. Across the three tests per algorithm, GA testing produced higher results on average than momentum backpropagation without GA: 94.3% compared with 88.22%. Based on this research, it can be concluded that the genetic algorithm, as an alternative way of setting up momentum backpropagation learning, gives cell predictions that are closer to the actual values than momentum backpropagation with parameters obtained by random trial. This is shown by an average memorization accuracy of 83.38% over ten runs and a validation accuracy of 94.30% over three runs. These results are higher than those of the ANN without GA optimization, which had an average memorization accuracy of 76.09% and a validation accuracy of 88.22%.
Combining the ANN with the GA on the data used in this study can serve as an alternative learning scheme that produces ANN hyperparameters without random trial. The hyperparameters optimized in this study are the learning rate, the momentum, the number of hidden layers, and the number of neurons in each hidden layer.
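The way per-fold predictions are combined into a confusion matrix and an accuracy value can be sketched in Python as follows. The class order, the function names, and the example labels are assumptions for illustration and are not results from the study.

```python
from collections import Counter

# Assumed class order; the four cell types are those named in the paper.
CLASSES = ["myeloblast", "promyelocyte", "monoblast", "support"]


def confusion_matrix(actual, predicted, classes=CLASSES):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]


def accuracy(matrix):
    """Share of samples on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Hypothetical labels pooled from three folds; not data from the study.
actual = ["myeloblast", "myeloblast", "promyelocyte",
          "monoblast", "support", "support"]
predicted = ["myeloblast", "promyelocyte", "promyelocyte",
             "monoblast", "support", "support"]
matrix = confusion_matrix(actual, predicted)
print(matrix)
print(round(accuracy(matrix), 4))
```

Pooling the predictions of all folds before building the matrix gives one overall accuracy, whereas averaging the per-fold accuracies can differ slightly when folds have unequal sizes.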