Combination of Coarse-Grained Procedure and Fractal Dimension for Epileptic EEG Classification

Epilepsy is a neurological brain disorder caused by disturbed nerve cell activity and characterized by repeated seizures; it can be treated with medication, surgery, or a dietary plan. Electroencephalographic (EEG) signal processing detects and classifies these seizures, one type of brain abnormality, from the temporal and spectral content of the signal. The method proposed in this paper combines two feature extraction steps, the coarse-grained procedure and the fractal dimension, with the aim of obtaining a highly accurate procedure for classifying epileptic EEG signals into normal, interictal, and seizure classes. The highest classification accuracy, 99%, is obtained using the variance fractal dimension (VFD) and a quadratic support vector machine (SVM) with a scale number of 10, which indicates excellent performance of the predictive model in terms of error rate. In addition, a higher scale number does not necessarily yield a higher accuracy in this study.
Keywords: epilepsy, EEG classification, coarse-grained, fractal dimension, support vector machine
ISSN (print): 1978-1520, ISSN (online): 2460-7258. IJCCS Vol. 15, No. 4, October 2021: 427-438


INTRODUCTION
Humans can experience several types of brain disorders, such as dementia, stroke, epilepsy, Alzheimer's, and brain cancer. Among these, epilepsy is one of the most common: according to WHO, as many as 50 million people worldwide have epilepsy [1]. Epilepsy can affect anyone, from any background, and at any age. One of its characteristics is the presence of seizures that occur repeatedly in certain parts of the body or in the whole body, and that can also cause fainting and loss of bladder control. Several previous studies have classified epileptic EEG signals using various feature extraction methods, including the multilevel wavelet-based entropy (MWPE) method with the highest accuracy of 94.3% [2], the Katz fractal dimension with the highest accuracy of 98.7% [3], the Higuchi Fractal Dimension (HFD) with the highest accuracy of 97.3% [4], and the Multidistance Signal Level Difference with the highest accuracy of 97.7% [5]. However, because epileptic EEG signals are complex, researchers continue to look for better classification methods, including the one proposed in this study.
In addition to feature extraction methods, the classifiers used by previous researchers also vary, including support vector machine (SVM) [3], convolutional neural network [6], and deep learning [7]. Support Vector Machine is a classifier that is often used to detect and predict epilepsy [8] [9]. From studies [8] [9], it is known that SVM can have a sensitivity of more than 90% for predicting epilepsy.
In this study, two feature extraction steps were used, namely the coarse-grained procedure and the fractal dimension. The coarse-grained procedure divides the signal into several scales. After the signal is divided into several scales, the fractal dimension is extracted as the feature at each scale. The fractal dimension is crucial because it provides a measure of fractal complexity: it measures the self-similarity of the signal. The degree of self-similarity is expected to differ for each type of EEG signal, and combining the fractal dimension with the coarse-grained procedure is expected to make these differences more prominent at different scales.
Figure 1 shows the method used in this study. The EEG signal is passed through the coarse-grained procedure to obtain several new signals at different scales. For each signal, the fractal dimension is calculated to characterize the EEG signal. The fractal dimension values are then classified by an SVM as a normal, interictal, or seizure EEG signal. The following subsections explain the details of each process.

EEG Dataset
Many datasets are now publicly accessible, one of which is the EEG dataset from the University of Bonn [10]. Many previous studies [2], [3], [5] have used this dataset, and it is also used in this study. The dataset was selected because it contains a relatively large amount of data: 300 recordings in 3 classes, namely normal, epilepsy with seizures, and epilepsy without seizures. In addition, the data are free from artifacts and noise; the signals were recorded at a sampling frequency of 173.61 Hz and then filtered using a 40 Hz low-pass filter. Epilepsy data were derived from pharmacoresistant focal-onset epilepsy patients undergoing preoperative evaluation. The EEG data from these epilepsy patients were recorded by the Department of Neurology at the University of Bern. Examples of signal data from the normal, interictal, and seizure classes are shown in Figure 2.

Coarse-grained Procedure
Physiological signals such as EEG signals have multiple time scales, but conventional algorithms operate on a single time scale. Costa et al. [11] developed an algorithm for calculating the multiscale entropy (MSE) of complex time series, in which the signal is first rescaled by the coarse-grained procedure. In general, the coarse-grained procedure combines downsampling and smoothing [12]. It is defined as in Eq. 1 [13]:
$$y_j^{(\tau)} = \frac{1}{\tau} \sum_{i=(j-1)\tau+1}^{j\tau} x_i, \quad 1 \le j \le \frac{N}{\tau} \tag{1}$$
where $x = \{x_1, \dots, x_N\}$ is a 1-dimensional time series, $y^{(\tau)}$ is the consecutive coarse-grained time series, τ is the scale factor, and N is the length of the original time series. The scales used in this study are 1 to 20. For a scale factor of 1, the coarse-grained series is identical to the raw signal. If τ = 2, then $y_1^{(2)} = (x_1 + x_2)/2$, whereas if τ = 3, then $y_1^{(3)} = (x_1 + x_2 + x_3)/3$, and so on. Thus, the coarse-grained procedure decomposes the signal into different scales or levels. The procedure is illustrated graphically in Figure 3.
Figure 3 Coarse-grained procedure
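The averaging described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `coarse_grain` is hypothetical, and the sketch assumes that trailing samples which do not fill a complete window of length τ are discarded.

```python
def coarse_grain(x, tau):
    """Average consecutive, non-overlapping windows of length tau.

    Assumption: samples at the end of x that do not fill a whole
    window are dropped.
    """
    if tau < 1:
        raise ValueError("tau must be a positive integer")
    n_windows = len(x) // tau
    return [sum(x[j * tau:(j + 1) * tau]) / tau for j in range(n_windows)]


signal = [1.0, 3.0, 2.0, 6.0, 4.0, 8.0]
scale_1 = coarse_grain(signal, 1)  # identical to the raw signal
scale_2 = coarse_grain(signal, 2)  # [2.0, 4.0, 6.0]
scale_3 = coarse_grain(signal, 3)  # [2.0, 6.0]
```

Applying the function for τ = 1 to 20 would yield the 20 rescaled signals from which the fractal dimension features are computed.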

Fractal Dimension
Fractal dimension is a parameter used to quantify signal complexity. Its value represents the level of self-similarity, which is a repeating pattern that occurs at several different scales of the signal [14]. The fractal dimension increases as self-similar patterns appear more often in the signal. In addition, the higher the fractal dimension value, the more complex the signal being tested [15]. This ability to quantify complexity makes the fractal dimension well suited to modeling biological signals, such as EEG and ECG, which tend to have high complexity and irregularity. Fractal dimensions are also well suited to analyzing nonlinear behavior and states in chaotic systems such as EEG [16]. Several kinds of fractal dimensions are used in this study, including Box Counting [14] [20].

Box-Counting Method
The box-counting method is one of the methods used to analyze fractals [21]. It uses boxes as the measuring unit: the boxes cover the figure or curve whose fractal dimension is sought. The value of r determines the box size, and the number of boxes needed to cover the figure or curve is denoted by N(r), so the total number of boxes depends on the box size. Mathematically, the box-counting method can be written as in Eq. 2:
$$D_B = \lim_{r \to 0} \frac{\log N(r)}{\log (1/r)} \tag{2}$$
From Eq. 2, the box-counting dimension is obtained from the ratio of $\log N(r)$ to $\log(1/r)$ as r approaches 0. In other words, as the box size approaches 0, the boxes cover the figure or curve ever more closely. In simple terms, the box-counting method counts the boxes needed to cover the curve at several box sizes. Repeating the count for many box sizes requires a long calculation time. Even so, the box-counting method has the advantage that it can be applied to both simple and complex fractals, whether natural or artificial [22].
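A box-counting estimate for a sampled signal might be sketched as follows. The details here are assumptions made for illustration, not taken from the paper: the curve is normalized to the unit square, the grid sizes are arbitrary powers of two, each segment between samples is traced densely so that crossed boxes are counted, and the limit r → 0 is approximated by the least-squares slope of log N(r) against log(1/r). The function names are hypothetical.

```python
import math


def _slope(xs, ys):
    """Least-squares slope of ys against xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    den = sum((a - mx) ** 2 for a in xs)
    return num / den


def box_counting_fd(y, grid_sizes=(4, 8, 16, 32, 64)):
    """Estimate the box-counting dimension of the curve traced by y."""
    n = len(y)
    if n < 2:
        raise ValueError("need at least two samples")
    y_min, y_max = min(y), max(y)
    span = (y_max - y_min) or 1.0  # flat signal: avoid division by zero
    xs = [i / (n - 1) for i in range(n)]
    ys = [(v - y_min) / span for v in y]
    log_inv_r, log_count = [], []
    for m in grid_sizes:  # m boxes per side, so the box size is r = 1 / m
        boxes = set()
        for i in range(n - 1):
            dx, dy = xs[i + 1] - xs[i], ys[i + 1] - ys[i]
            # Trace the segment with points spaced at most half a box apart.
            steps = max(1, math.ceil(2 * m * max(abs(dx), abs(dy))))
            for k in range(steps + 1):
                px = xs[i] + dx * k / steps
                py = ys[i] + dy * k / steps
                boxes.add((min(int(px * m), m - 1), min(int(py * m), m - 1)))
        log_inv_r.append(math.log(m))
        log_count.append(math.log(len(boxes)))
    return _slope(log_inv_r, log_count)


line_fd = box_counting_fd([float(i) for i in range(101)])
zigzag_fd = box_counting_fd([float(i % 2) for i in range(257)])
```

A straight line should give an estimate close to 1, while a rapidly alternating signal that fills the plane at these grid sizes should give an estimate close to 2. The nested loops also illustrate why the method is comparatively slow.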

Katz Method
The Katz method calculates the fractal dimension from the curve length and the planar extent of a waveform [23]. For a signal of length N, the Katz fractal dimension (KFD) is defined as in Eq. 3 [18]:
$$KFD = \frac{\log(N)}{\log(N) + \log(d/L)} \tag{3}$$
where L is the total curve length and d is the curve diameter, i.e., the maximum distance between the initial point and any other point of the curve. The value of L can be defined using Eq. 4:
$$L = \sum_{i=1}^{N-1} dist(i, i+1) \tag{4}$$
where $dist(i, i+1)$ is the distance between two sequential points, which can be expressed as in Eq. 5:
$$dist(i, i+1) = \sqrt{(x_i - x_{i+1})^2 + (y_i - y_{i+1})^2} \tag{5}$$
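A minimal sketch of the Katz calculation is given below. It assumes the commonly used form KFD = log(N) / (log(N) + log(d/L)), with N taken as the signal length as stated above, and it assumes that successive samples are one unit apart on the x-axis. The function name `katz_fd` is hypothetical.

```python
import math


def katz_fd(y):
    """Katz fractal dimension of a uniformly sampled signal y.

    Assumptions: unit spacing between samples on the x-axis, and N in
    the formula is the number of samples.
    """
    n = len(y)
    if n < 2:
        raise ValueError("need at least two samples")
    # Total curve length: sum of distances between sequential points.
    total_length = sum(math.hypot(1.0, y[i + 1] - y[i]) for i in range(n - 1))
    # Curve diameter: farthest point from the initial point.
    diameter = max(math.hypot(i, y[i] - y[0]) for i in range(1, n))
    return math.log(n) / (math.log(n) + math.log(diameter / total_length))


line_kfd = katz_fd([2.0 * i for i in range(50)])
zigzag_kfd = katz_fd([float(i % 2) for i in range(100)])
```

For a straight line the curve length equals the diameter, so the estimate is 1. For an oscillating signal the length exceeds the diameter, so the estimate rises above 1.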

Sevcik Method
The Sevcik method (SFD) obtains the fractal dimension from a set of N sample values of a waveform [24], [25]. The method is derived from the Hausdorff dimension ($D_h$) [25]. The equation for calculating the fractal dimension using the Sevcik method is shown in Eq. 6 [15]:
$$SFD = 1 + \frac{\ln(L)}{\ln(2(N-1))} \tag{6}$$
In Eq. 6, L is the total length as expressed in Eq. 4, and N is the number of samples. A variation of the Sevcik method normalizes the x-axis and y-axis before the SFD is calculated. Normalizing the x-axis and y-axis ensures that the topology of the metric space does not change under linear transformation, so that all axes are treated equally [25]. Eq. 7 and Eq. 8 give the normalization on the x-axis and y-axis, respectively:
$$x_i^{*} = \frac{x_i}{x_{max}} \tag{7}$$
$$y_i^{*} = \frac{y_i - y_{min}}{y_{max} - y_{min}} \tag{8}$$
where $x_i$ and $y_i$ are the initial values on the x-axis and y-axis, $x_{max}$ is the maximum of $x_i$, and $y_{min}$ and $y_{max}$ are the minimum and maximum of $y_i$.
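The normalized variant might be sketched as follows, assuming the commonly cited form SFD = 1 + ln(L) / ln(2(N − 1)), where L is the length of the curve after both axes have been mapped to the unit square. The sample index is used as the x value, which is an assumption of this sketch, and the function name `sevcik_fd` is hypothetical.

```python
import math


def sevcik_fd(y):
    """Sevcik fractal dimension of y after normalizing both axes to [0, 1]."""
    n = len(y)
    y_min, y_max = min(y), max(y)
    span = y_max - y_min
    if n < 2 or span == 0:
        raise ValueError("need at least two samples and a non-constant signal")
    xs = [i / (n - 1) for i in range(n)]  # x_i / x_max with x_i = i
    ys = [(v - y_min) / span for v in y]  # (y_i - y_min) / (y_max - y_min)
    # Length of the normalized curve: sum of sequential point distances.
    length = sum(
        math.hypot(xs[i + 1] - xs[i], ys[i + 1] - ys[i]) for i in range(n - 1)
    )
    return 1.0 + math.log(length) / math.log(2 * (n - 1))


line_sfd = sevcik_fd([float(i) for i in range(101)])
zigzag_sfd = sevcik_fd([float(i % 2) for i in range(101)])
```

With a finite number of samples, a straight line gives a value slightly above 1, because the normalized diagonal has length √2; the estimate approaches 1 as N grows. A rapidly oscillating signal gives a value closer to 2.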

Variance Method
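The variance fractal dimension (VFD) method calculates the fractal dimension of a signal s(t) using the Hurst exponent (H), which is derived from the properties of fractional Brownian motion (fBm), as expressed in Eq. 9:
$$H = \lim_{\Delta t \to 0} \frac{1}{2} \frac{\log\left(\mathrm{Var}\left[(\Delta s)_{\Delta t}\right]\right)}{\log(\Delta t)} \tag{9}$$
where H is the signal smoothness, $(\Delta s)_{\Delta t} = s(t_2) - s(t_1)$, and $\Delta t = t_2 - t_1$. The VFD can be determined as in Eq. 10:
$$VFD = E + 1 - H \tag{10}$$
where E is the Euclidean dimension. Because the Euclidean dimension of a one-dimensional signal is 1, the VFD equation simplifies to Eq. 11:
$$VFD = 2 - H \tag{11}$$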
The value of ∆t in the VFD can differ depending on the purpose. If the VFD is used to separate signal from noise, the value of ∆t is 1. If it is used to separate several data components, the value of ∆t is greater than 1.
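One practical way to estimate the variance fractal dimension is sketched below. It rests on assumptions that are not stated in the paper: the limit in the Hurst exponent is approximated by the least-squares slope of log Var(Δs) against log Δt over a few lags (H is half that slope), and the lag set (1, 2, 4, 8) is an arbitrary choice. The function name `variance_fd` is hypothetical, and the test signals are synthetic.

```python
import math
import random
from statistics import pvariance


def variance_fd(s, lags=(1, 2, 4, 8)):
    """Estimate VFD = 2 - H from the variance of increments at several lags."""
    log_dt, log_var = [], []
    for dt in lags:
        increments = [s[i + dt] - s[i] for i in range(len(s) - dt)]
        var = pvariance(increments)
        if var <= 0:
            raise ValueError("increments have zero variance at lag %d" % dt)
        log_dt.append(math.log(dt))
        log_var.append(math.log(var))
    # Least-squares slope of log-variance against log-lag.
    mx = sum(log_dt) / len(log_dt)
    my = sum(log_var) / len(log_var)
    slope = sum((a - mx) * (b - my) for a, b in zip(log_dt, log_var)) / sum(
        (a - mx) ** 2 for a in log_dt
    )
    hurst = 0.5 * slope
    return 2.0 - hurst  # Euclidean dimension E = 1 for a 1-D signal


rng = random.Random(0)  # fixed seed so the synthetic signals are repeatable
noise = [rng.gauss(0.0, 1.0) for _ in range(5000)]
walk = [0.0]
for step in noise:
    walk.append(walk[-1] + step)

walk_vfd = variance_fd(walk)
noise_vfd = variance_fd(noise)
```

For an ideal random walk the theoretical Hurst exponent is 0.5, so the estimate should lie near 1.5. For white noise the increment variance does not grow with the lag, so the estimate should lie near 2.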

Support Vector Machine
Support Vector Machine is one of the supervised learning methods in machine learning. It works by creating an input-output mapping based on a set of labeled training data, and it can be used for both classification and regression. Basically, the Support Vector Machine separates data by finding the best hyperplane, i.e., the one with the maximum margin. This method can be used in high-dimensional spaces. There are several types of classifiers in the Support Vector Machine, including Linear, Quadratic, Cubic, Fine Gaussian, Medium Gaussian, and Coarse Gaussian.

K-Fold Cross-Validation
One way to validate a machine learning model is to use K-Fold Cross-Validation. K-Fold Cross-Validation divides the data set into several parts, or folds, depending on the value of K. Each fold is then used as the test set exactly once [26]. This study uses K = 5, so the data set is divided into five parts. In the first iteration, the first part is used to test the model, and the remaining four parts are used to train the model. In the second iteration, the second part is used as the test set and parts 1, 3, 4, and 5 as the training set. This process is repeated until every part has been used as a test set.
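The fold rotation described above can be sketched as an index split. This is an illustrative sketch only: the function name `k_fold_indices` is hypothetical, and the sketch splits the samples in their original order, whereas in practice the data would usually be shuffled or stratified by class first.

```python
def k_fold_indices(n_samples, k=5):
    """Return a list of (train_indices, test_indices) pairs, one per fold.

    Assumption: samples are split in their original order, without
    shuffling or stratification.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds


folds = k_fold_indices(300, k=5)  # 300 recordings, as in the dataset used here
fold_sizes = [(len(train), len(test)) for train, test in folds]
```

With 300 recordings and K = 5, each iteration trains on 240 recordings and tests on the remaining 60, and the reported accuracy is typically the average over the five test folds.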

RESULTS AND DISCUSSION
Epilepsy EEG signals in this study were divided into three classes, namely normal (O), interictal (N), and seizure (S). The coarse-grained process changes the signal variance but not the signal's shape. This change is caused by differences in the scale of the coarse-grained process. Signal variance and scale (τ) are inversely related: the larger the scale (τ), the smaller the signal variance, and vice versa. After the coarse-grained procedure is performed, the fractal dimension is calculated. Figure 1-7 shows graphs of the fractal dimension values for the five kinds of fractal dimensions at scales τ = 1-20. The fractal dimension values of the BCFD and HFD methods range from 1 to 2, but the values exceed this range for KFD, SFD, and VFD. This is partly due to the limitations of KFD, SFD, and VFD in estimating the fractal dimension. However, this is not a problem in this study, because the fractal dimension is used only as a feature.
For HFD, KFD, SFD, and VFD, the graphs of fractal dimension value against scale follow a similar pattern: the fractal dimension value increases up to a specific scale and then levels off. In contrast, for BCFD the fractal dimension value tends to be high when the scale is small and decreases as the scale increases. This BCFD behavior resembles the BCFD results obtained when classifying lung sounds [27].
The fractal dimension value of the seizure class (S) tends to be higher than that of the normal class (O) and the interictal class (N). Meanwhile, the normal class (O) has the lowest value for several fractal dimensions, namely BCFD and KFD. For SFD and VFD, the normal class (O) initially has the lowest value, but beyond a scale of 10 it rises and becomes higher than the interictal class (N). These results show that although the seizure (S), normal (O), and interictal (N) classes have similar patterns, they have different fractal dimension features. This difference is what allows a classifier, namely SVM, to separate the three classes.
The accuracy obtained using five kinds of fractal dimensions, namely VFD, SFD, BCFD, KFD, and HFD, with six kinds of SVM classifiers, namely Linear, Quadratic, Cubic, Fine Gaussian, Medium Gaussian, and Coarse Gaussian SVM, is shown in Table 1 and Table 2. The highest accuracy is 99%, obtained using VFD with a scale of 10 and a Quadratic SVM classifier. In contrast, the lowest accuracy is 60.7%, obtained using BCFD with a scale of 20 and the Fine Gaussian SVM classifier. Across the various classifiers and scales, VFD gives the best accuracy, because almost all of its accuracies exceed 98%. Only the Medium Gaussian SVM and Coarse Gaussian SVM classifiers with a scale of 10 have accuracies below 98%, namely 94.7% and 87.7%, respectively. Meanwhile, BCFD gives the worst accuracy, between 60.7% and 74.3%.
When each fractal dimension method is examined separately, the methods do not achieve their best accuracy at the same scale and classifier. In BCFD, the best accuracy occurs on a scale of 10 with the Fine Gaussian SVM classifier, with a percentage of 74.30%. On HFD, the best accuracy is on a scale of 10 with the Quadratic SVM and Cubic SVM classifiers, with a percentage of 95.70%. On KFD, the best accuracy is on a scale of 20 with the Medium Gaussian SVM classifier, with a percentage of 88.30%. In SFD, the best accuracy is 98.30%, on a scale of 20 with the Linear SVM classifier. Finally, on VFD, the best accuracy is 99.00%, obtained on a scale of 10 with the Quadratic SVM classifier. These results for the combination of coarse-grained procedure and fractal dimensions show that a larger scale does not always yield a higher accuracy: the highest accuracy is obtained on a scale of 10 (using VFD and Quadratic SVM), not on a scale of 20.
Coarse-grained procedure and FD   10   99%
Table 3 shows a comparison with previous studies that used fractal dimensions and SVM as a classifier. KFD calculated on the Alpha, Beta, Delta, Theta, and Gamma sub-bands of EEG signals yielded the highest accuracy of 98.7% [3]. Meanwhile, HFD calculated at various time-interval resolutions produced an accuracy of 98% [4]. In the study by Silalahi et al., MSLD was used to split the EEG signal and FD was used as a feature, resulting in 99% accuracy. The difference from this research lies in the signal splitting process, which uses MSLD, while the FD calculation is the same. The accuracy of both is the same and is achieved with the same number of features. The proposed method therefore achieves high accuracy compared to previous studies.
The weakness of the proposed method is that the scale of the coarse-grained procedure is determined by trial and error. However, this can be overcome by looking at the resulting characteristics, as shown in Figure 5 - Figure 9: at scales > 10, the differences between data classes become smaller. The advantages of this method are its simple computation and the possibility of performing feature selection to increase accuracy.
The use of feature subset selection to select features at a specific scale gives the possibility of increasing accuracy. The use of this method on other biomedical signals is interesting for further research.

CONCLUSION
This study proposes classifying epileptic EEG signals using a coarse-grained procedure and fractal dimensions. The coarse-grained procedure produces several signals at different scales, and the fractal dimension is then calculated for each of them. The coarse-grained approach breaks the signal into various scales through averaging, which makes it possible to observe how the signal changes between its fine and coarse forms. Meanwhile, the fractal dimension quantifies the complexity of the EEG signal. Classification using SVM produces the highest accuracy of 99% using VFD. A larger scale, which yields a larger number of features, does not result in higher accuracy: the highest accuracy is achieved when using 10 scales, not 20 scales. The proposed method produces a reasonably high accuracy compared to other methods that use fractal dimensions and SVM. The proposed method has the potential for use with other biomedical signals.