Optimizing Clustering Models Using Principle Component Analysis for Car Customers

https://doi.org/10.22146/ijccs.94744

Agnes Riska Savira(1*)

(1) University of Buana Perjuangan Karawang, Indonesia
(*) Corresponding Author

Abstract


In a competitive business environment, companies use customer data strategically to achieve their goals, which requires a thorough understanding of customer traits, behaviors, and needs. Customer segmentation, an important strategy, groups individuals by shared characteristics. The K-Means algorithm is widely used for grouping customer data because it is easy to implement in machine learning workflows. However, high-dimensional data poses challenges and calls for dimensionality reduction. Principal Component Analysis (PCA) is an effective method for reducing data dimensionality while minimizing information loss, and previous research reports that PCA improves analysis and clustering efficiency. This research integrates PCA into K-Means clustering to analyze customer segments at a car company. The resulting segments can help the company attract new customers, implement targeted marketing, understand customer-company relationships, and increase expected profitability. After normalization, PCA retains 75% of the variance with 3 principal components, and K-Means is then applied to the reduced data. Evaluation with the Elbow method and the Silhouette Score identified eight as the optimal number of clusters. The post-PCA K-Means model with this cluster count achieves a Silhouette Score of 0.7789.
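The pipeline summarized in the abstract (normalization, PCA to 3 components, K-Means, then Elbow and Silhouette evaluation) can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: it assumes scikit-learn, uses synthetic data from `make_blobs` in place of the car-customer dataset, and assumes `StandardScaler` as the normalization step, since the abstract does not name one. The candidate range of cluster counts (2 to 10) and all variable names are likewise assumptions.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the customer table (assumption: the real data is not used here).
X, _ = make_blobs(n_samples=600, n_features=10, centers=5, random_state=42)

# 1. Normalize the features (assumption: z-score standardization).
X_scaled = StandardScaler().fit_transform(X)

# 2. Reduce to 3 principal components, as in the abstract.
pca = PCA(n_components=3, random_state=42)
X_pca = pca.fit_transform(X_scaled)
explained = pca.explained_variance_ratio_.sum()

# 3. Fit K-Means for each candidate k, recording inertia (for the Elbow
#    plot) and the mean Silhouette Score.
candidate_ks = range(2, 11)
inertias = {}
silhouettes = {}
for k in candidate_ks:
    model = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X_pca)
    inertias[k] = model.inertia_
    silhouettes[k] = silhouette_score(X_pca, model.labels_)

# 4. Pick the k with the highest Silhouette Score; the Elbow curve
#    (inertia versus k) is normally inspected visually alongside it.
best_k = max(silhouettes, key=silhouettes.get)

print(f"Variance retained by 3 components: {explained:.2%}")
print(f"Best k by Silhouette Score: {best_k} ({silhouettes[best_k]:.4f})")
```

On real customer data the retained variance and the selected number of clusters will differ from this synthetic run; the figures reported in the abstract (75% variance, eight clusters, Silhouette Score 0.7789) come from the author's dataset, not from this sketch.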

 


Keywords


K-Means, PCA, Customer Segmentation, Machine Learning











Copyright (c) 2024 IJCCS (Indonesian Journal of Computing and Cybernetics Systems)

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.



Copyright of :
IJCCS (Indonesian Journal of Computing and Cybernetics Systems)
ISSN 1978-1520 (print); ISSN 2460-7258 (online)
is a scientific journal of research results in Computing
and Cybernetics Systems.
A publication of IndoCEISS.
Gedung S1 Ruang 416 FMIPA UGM, Sekip Utara, Yogyakarta 55281
Fax: +62274 555133
email:ijccs.mipa@ugm.ac.id | http://jurnal.ugm.ac.id/ijccs


