Text Summarization in Multi Document Using Genetic Algorithm

https://doi.org/10.22146/ijccs.66026

Nirwana Hendrastuty(1*), Azhari SN(2)

(1) Gadjah Mada University
(2) Gadjah Mada University
(*) Corresponding Author

Abstract


Automatic text summarization is a representation of a document that contains the essence or main focus of the document. Text summarization is automatically performed using the extraction method. The extraction method summarizes by copying the text that is considered the most important or most informative from the source text into a summary [1]. Documents can be divided into two types, namely single documents and multi documents. Multi document is input that comes from many documents from one or more sources that have more than one main idea.

This study aims to summarize the text using a Genetic Algorithm by paying attention to the extraction of text features on each chromosome. The feature extraction used is sentence position, positive keywords, negative keywords, similarity between sentences, sentences containing entity words, sentences containing numbers, sentence length, connections between sentences, the number of connections between sentences. The number of chromosomes used is half of the number of public complaints. The data used is data on public complaints against the DIY government from February 2018 to July 2020. The data is obtained from the e-lapor DIY website. From the test results, the average value of Precision 1, Recall is 0.71, and f-measure value is 0.79.


Keywords


Automatic Text Sumarization, Feature Extraction , DIY government, Genetic Algorithm.

Full Text:

PDF



DOI: https://doi.org/10.22146/ijccs.66026

Article Metrics

Abstract views : 2388 | views : 2634

Refbacks

  • There are currently no refbacks.




Copyright (c) 2021 IJCCS (Indonesian Journal of Computing and Cybernetics Systems)

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.



Copyright of :
IJCCS (Indonesian Journal of Computing and Cybernetics Systems)
ISSN 1978-1520 (print); ISSN 2460-7258 (online)
is a scientific journal the results of Computing
and Cybernetics Systems
A publication of IndoCEISS.
Gedung S1 Ruang 416 FMIPA UGM, Sekip Utara, Yogyakarta 55281
Fax: +62274 555133
email:ijccs.mipa@ugm.ac.id | http://jurnal.ugm.ac.id/ijccs



View My Stats1
View My Stats2