Microarray Data Classification Using Discrete Wavelet Transform and Extreme Learning Machine
Khadijah Khadijah(1*), Sri Hartati(2)
(1) 
(2) Jurusan Ilmu Komputer dan Elektronika, FMIPA, UGM, Yogyakarta
(*) Corresponding Author
Abstract
Microarray data are used as an alternative for cancer diagnosis because diagnosis based on morphological appearance is difficult: the morphological differences between different types of cancer are subtle. This research aims to build a classifier for microarray data. Classification begins with dimensionality reduction of the microarray data using the discrete wavelet transform (DWT): each sample is decomposed to a given level, and the approximation coefficients at that level are taken as the sample's features. These features then serve as input to the classifier. The classification method is an extreme learning machine (ELM) applied to a radial basis function network (RBFN). The datasets are multiclass microarray data, namely GCM (16,063 genes, 14 classes) and Subtypes-Leukemia (12,600 genes, 7 classes).
Testing was carried out by randomly splitting the data into training and test sets ten times, with the same proportions each time. The classifier for the GCM dataset does not yet perform well: accuracy is about 75% ± 6.25%, and minimum sensitivity is still low at 15% ± 19.95%, which shows that sensitivity is uneven across classes and that some classes still have low sensitivity. The classifier for the Subtypes-Leukemia dataset, which has fewer classes than GCM, performs reasonably well, with accuracy of 87.68% ± 2.88% and minimum sensitivity of 51.90% ± 20.29%.
Keywords: microarray, gene expression, DWT, ELM, RBFN
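The DWT-based reduction described above can be illustrated with a minimal sketch. It assumes the Haar wavelet and pads odd-length signals by repeating the last value; the abstract names neither the wavelet family nor the boundary handling, so both choices, the function name `haar_approximation`, and the toy sample are illustrative only.

```python
import math


def haar_approximation(signal, level):
    """Return Haar approximation coefficients after `level` decompositions.

    Assumptions (not taken from the paper): Haar wavelet, and odd-length
    inputs are padded by repeating the last value.
    """
    coeffs = [float(x) for x in signal]
    for _ in range(level):
        if len(coeffs) < 2:
            break  # nothing left to decompose
        if len(coeffs) % 2 == 1:
            coeffs.append(coeffs[-1])  # pad to an even length
        # Low-pass Haar filter: scaled sum of each non-overlapping pair.
        coeffs = [
            (coeffs[i] + coeffs[i + 1]) / math.sqrt(2.0)
            for i in range(0, len(coeffs), 2)
        ]
    return coeffs


# Toy "expression profile" with 8 values, reduced to 2 features at level 2.
sample = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
features = haar_approximation(sample, level=2)
```

Each level roughly halves the number of values, so the decomposition level controls how many features reach the classifier. Under this padding rule, a 16,063-value sample would shrink to 1,004 coefficients after four levels; four is only an example, since the abstract does not state the level used.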
Abstract
Microarray data are used as an alternative in cancer diagnosis because diagnosis based on morphological structure is difficult: different classes of cancer usually show only subtle morphological differences. The aim of this research is to build a microarray data classifier. Classification starts by reducing the dimensionality of the microarray data with DWT: each sample is decomposed to a certain level, and the approximation coefficients at that level are used as features for the classifier. The classifier is an ELM implemented on an RBFN. The datasets are GCM (16,063 genes, 14 classes) and Subtypes-Leukemia (12,600 genes, 7 classes).
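As a rough illustration of an ELM built on an RBF network, the sketch below draws the hidden-node centers and widths at random and solves only the output weights with a pseudo-inverse. It assumes NumPy, Gaussian kernels, centers sampled inside the bounding box of the training data, and an arbitrary width range; none of these settings, nor the function names and toy data, come from the paper.

```python
import numpy as np


def rbf_hidden_layer(X, centers, widths):
    """Gaussian RBF activations, one column per hidden node."""
    # Squared Euclidean distance between every sample and every center.
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / widths[None, :] ** 2)


def train_elm_rbf(X, y, n_hidden, n_classes, rng):
    """Fit an ELM-style RBF network; only the output weights are solved for."""
    # Assumption: centers are drawn uniformly inside the data's bounding box.
    centers = rng.uniform(X.min(axis=0), X.max(axis=0),
                          size=(n_hidden, X.shape[1]))
    # Assumption: widths come from an arbitrary illustrative range.
    widths = rng.uniform(0.5, 2.0, size=n_hidden)
    H = rbf_hidden_layer(X, centers, widths)
    # One-vs-rest targets: +1 for the true class, -1 elsewhere.
    T = -np.ones((X.shape[0], n_classes))
    T[np.arange(X.shape[0]), y] = 1.0
    # Least-squares output weights via the Moore-Penrose pseudo-inverse.
    beta = np.linalg.pinv(H) @ T
    return centers, widths, beta


def predict_elm_rbf(X, centers, widths, beta):
    """Predict the class whose output node has the largest value."""
    return np.argmax(rbf_hidden_layer(X, centers, widths) @ beta, axis=1)


# Toy two-class data standing in for DWT features (not real microarray data).
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(0.0, 0.3, size=(10, 2)),
                     rng.normal(5.0, 0.3, size=(10, 2))])
y_train = np.array([0] * 10 + [1] * 10)
model = train_elm_rbf(X_train, y_train, n_hidden=15, n_classes=2, rng=rng)
y_pred = predict_elm_rbf(X_train, *model)
train_accuracy = float((y_pred == y_train).mean())
```

Because the hidden layer is never tuned, training reduces to one linear solve, which is the main appeal of ELM when many random train/test splits have to be evaluated.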
Testing is done by randomly dividing the data into training and testing sets ten times, with the same proportion of training and testing data each time. The classifier does not perform well on the GCM dataset, with accuracy of 75% ± 6.25% and mean minimum sensitivity of 15% ± 19.95%. The low minimum sensitivity indicates that a few classes have low sensitivity. The classifier for the Subtypes-Leukemia dataset gives better results: accuracy of 87.68% ± 2.88% and mean minimum sensitivity of 51.90% ± 20.29%.
Keywords: microarray, gene expression, DWT, ELM, RBFN
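The two reported metrics can be made concrete with a small sketch. It computes accuracy and minimum sensitivity (the lowest per-class recall) for one run, then summarizes several runs as mean and standard deviation. Reading the "±" values in the abstract as standard deviations over the ten splits is an assumption, and the labels and per-run values below are toy inputs, not results from the paper.

```python
import statistics


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_sensitivity(y_true, y_pred):
    """Recall for each class: correctly predicted samples / samples in class."""
    result = {}
    for c in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == c)
        correct = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        result[c] = correct / total
    return result


def minimum_sensitivity(y_true, y_pred):
    """Sensitivity of the worst-recognized class."""
    return min(per_class_sensitivity(y_true, y_pred).values())


def mean_and_std(values):
    """Summarize one metric over repeated splits (assumed: sample std)."""
    return statistics.mean(values), statistics.stdev(values)


# Toy labels for a single run with three classes.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 1, 1, 2, 0]
run_accuracy = accuracy(y_true, y_pred)
run_min_sensitivity = minimum_sensitivity(y_true, y_pred)

# Toy per-run values, standing in for the ten random splits.
mean_ms, std_ms = mean_and_std([0.5, 1.0])
```

In the toy run, accuracy is 0.8 while minimum sensitivity is only 0.5, because the smallest class is recognized half the time. This is the kind of gap the abstract points to for the GCM dataset, where overall accuracy hides poorly recognized classes.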
DOI: https://doi.org/10.22146/ijccs.6638
Copyright (c) 2015 IJCCS - Indonesian Journal of Computing and Cybernetics Systems
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.