Maria Arista Ulfa(1*), Selo Sulistyo(2), Muslikhin Hidayat(3)

(1) Universitas Gadjah Mada
(2) Universitas Gadjah Mada
(3) Universitas Gadjah Mada
(*) Corresponding Author


Business has shifted toward online models. Many business processes are now carried out over the internet, commonly known as e-commerce, through platforms known as marketplaces. One of the best-known and most popular marketplaces in Indonesia is Shopee. The high volume of online shopping on marketplaces pushes sellers to understand the online market. However, sellers, especially new sellers who are just entering the digital realm, often struggle to choose which products to sell because they lack information about which products are in demand.

This study uses clustering analysis to find information about product demand, grouping products from those that attract the most public interest to those that attract the least. The data are product data from six categories on the Shopee marketplace, collected using web scraping. Clustering uses the K-Means method; the number of clusters K and the optimal center points are determined by computing the Sum of Squared Errors (SSE) and reading the elbow graph. The results show that the optimal K differs by category: K=4 for women’s clothing, men’s clothing, and electronics, and K=3 for Muslim fashion, care & beauty, and household appliances. Validation with the Davies-Bouldin Index gives values of 0.391, 0.438, 0.414, 0.357, 0.387, and 0.377 for the six categories. Since lower Davies-Bouldin values indicate more compact and better-separated clusters, the cluster structure and the level of information formed in each category by the K-Means method are quite good.
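The pipeline described above (K-Means, SSE for the elbow graph, Davies-Bouldin validation) can be illustrated with a minimal sketch. It is not the authors' implementation. The toy 2-D points stand in for scraped product features, the function names are hypothetical, and the farthest-first initialisation is an assumption chosen only to make the run deterministic.

```python
import math

def init_centroids(points, k):
    # Assumption: farthest-first initialisation (not taken from the paper).
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, iters=100):
    """Plain Lloyd's K-Means; returns (centroids, labels)."""
    centroids = init_centroids(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep the centre of an empty cluster
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels

def sse(points, centroids, labels):
    # Sum of squared distances from each point to its assigned centroid.
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))

def davies_bouldin(points, centroids, labels):
    # Assumes every cluster is non-empty and all centroids are distinct.
    k = len(centroids)
    scatter = []
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        scatter.append(sum(math.dist(p, centroids[c]) for p in members) / len(members))
    worst = [max((scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
                 for j in range(k) if j != i)
             for i in range(k)]
    return sum(worst) / k

# Hypothetical toy data: three well-separated groups of four points each.
points = [(1, 1), (1, 2), (2, 1), (2, 2),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (1, 10), (1, 11), (2, 10), (2, 11)]

# SSE per candidate K; plotting these values against K gives the elbow graph.
sse_by_k = {}
for k in range(1, 6):
    centroids, labels = kmeans(points, k)
    sse_by_k[k] = sse(points, centroids, labels)

# Validate the chosen K with the Davies-Bouldin Index (lower is better).
centroids, labels = kmeans(points, 3)
dbi = davies_bouldin(points, centroids, labels)  # about 0.157 on this toy data
```

On real product data the features would normally be scaled before clustering, and several initialisations would be compared, since K-Means can converge to a local optimum.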


Shopee Product; Web Scraping; K-Means Clustering




Copyright (c) 2021 ASEAN Journal of Systems Engineering

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

ASEAN Journal of Systems Engineering (AJSE) 
P-ISSN: 2338-2309 || E-ISSN: 2338-2295
Master in Systems Engineering
Faculty of Engineering
Universitas Gadjah Mada
Jl. Teknika Utara No.3, Barek, Yogyakarta, Indonesia 55281 