SMOTE-SVM for Handling Imbalanced Data in Obesity Classification

https://doi.org/10.22146/ijccs.103994

Muhammad Kunta Biddinika(1*), Herman Yuliansyah(2), Dewi Soyusiawaty(3), Farhan Radhiansyah Razak(4)

(1) Master Program of Informatics, Universitas Ahmad Dahlan, Yogyakarta, Indonesia
(2) Department of Informatics, Universitas Ahmad Dahlan, Yogyakarta, Indonesia
(3) Department of Informatics, Universitas Ahmad Dahlan, Yogyakarta, Indonesia
(4) Master Program of Informatics, Universitas Ahmad Dahlan, Yogyakarta, Indonesia
(*) Corresponding Author

Abstract


Obesity is a significant health issue associated with various chronic diseases, making its early classification critical for effective intervention. This study investigates the performance of Support Vector Machine (SVM) models with Radial Basis Function (RBF) and Linear kernels on imbalanced obesity datasets. To address the class imbalance, the Synthetic Minority Over-sampling Technique (SMOTE) and Random Undersampling (RUS) were applied. The results show that balancing techniques significantly enhance classification performance, with the Linear-kernel model achieving the highest accuracy, 96.54%, when the data were balanced using SMOTE. However, limitations include reduced recall for minority classes and a risk of overfitting. These findings underscore the importance of balancing techniques in health data classification and offer insights for further optimizing model performance. The study highlights the need for more advanced data balancing strategies to improve predictive accuracy and equity across all classes.
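
As a rough illustration of the two balancing techniques named in the abstract, the sketch below implements their core ideas in plain Python: SMOTE creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours, while RUS discards majority samples at random. This is not the authors' implementation. The function names, the toy data, and the choice of k are assumptions made for illustration only, and no SVM is trained here.

```python
import math
import random


def smote(minority, n_new, k=5, rng=None):
    """Return n_new synthetic points, each placed on the line segment
    between a minority sample and one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbours of `base`, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def random_undersample(majority, n_keep, rng=None):
    """Keep n_keep majority samples chosen uniformly at random."""
    rng = rng or random.Random(0)
    return rng.sample(majority, n_keep)


# Toy two-feature data (made-up values, not the study's dataset).
majority = [(float(i), float(i % 7)) for i in range(20)]
minority = [(1.0, 10.0), (2.0, 11.0), (3.0, 12.0), (2.5, 10.5)]

# SMOTE: grow the minority class up to the majority size.
oversampled = minority + smote(minority, n_new=len(majority) - len(minority), k=3)

# RUS: shrink the majority class down to the minority size.
undersampled = random_undersample(majority, n_keep=len(minority))

print(len(oversampled), len(undersampled))
```

In practice, the imbalanced-learn library provides tested versions of both techniques (SMOTE and RandomUnderSampler). Resampling is normally applied to the training split only, so that synthetic points do not leak into the evaluation data.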


Keywords


Obesity; SMOTE; RUS; RBF; Linear







Copyright (c) 2025 IJCCS (Indonesian Journal of Computing and Cybernetics Systems)

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.



Copyright of:
IJCCS (Indonesian Journal of Computing and Cybernetics Systems)
ISSN 1978-1520 (print); ISSN 2460-7258 (online)
A scientific journal publishing results in Computing and Cybernetics Systems.
A publication of IndoCEISS.
Gedung S1 Ruang 416 FMIPA UGM, Sekip Utara, Yogyakarta 55281
Fax: +62274 555133
email: ijccs.mipa@ugm.ac.id | http://jurnal.ugm.ac.id/ijccs


