Social-Child-Case Document Clustering based on Topic Modeling using Latent Dirichlet Allocation
Nur Annisa Tresnasari(1*), Teguh Bharata Adji(2), Adhistya Erna Permanasari(3)
(1) Department of Electrical Engineering & Information Technology, UGM, Yogyakarta
(2) Department of Electrical Engineering & Information Technology, UGM, Yogyakarta
(3) Department of Electrical Engineering & Information Technology, UGM, Yogyakarta
(*) Corresponding Author
DOI: https://doi.org/10.22146/ijccs.54507
Copyright (c) 2020 IJCCS (Indonesian Journal of Computing and Cybernetics Systems)
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.