Topic Modeling of Online Media News Titles during COVID-19 Emergency Response in Indonesia Using the Latent Dirichlet Allocation (LDA) Algorithm

M Didik R Wahyudi, Agung Fatwanto, Usfita Kiftiyani, M. Galih Wonoseto

Abstract


Online news portals have the advantage of speed in conveying information about events in society. One way to tell what a story is about is from its title: the headline introduces the reader to the news content that follows. The main topics or trends under discussion can therefore be identified from headlines, but doing so requires a fast and efficient method. Topic modeling is one such method, and it helps users quickly understand recent issues. One of the algorithms for topic modeling is Latent Dirichlet Allocation (LDA). This research proceeded through data collection, preprocessing, n-gram formation, dictionary representation, weighting, topic model validation, topic model formation, and presentation of the topic modeling results. LDA topic modeling of news headlines taken from www.detik.com over 8 months (March-October 2020) during the COVID-19 pandemic showed that the best number of topics for each month was 3, dominated by news topics about corona cases, positive corona, positive COVID, and COVID-19, with an accuracy of 0.824 (82.4%). The resulting precision and recall values are identical, which is ideal for an information retrieval system.
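The early stages named in the abstract (preprocessing, n-gram formation, dictionary representation, and weighting) can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the sample headlines, the stopword list, the `min_count` threshold, and all function names are hypothetical, and the weighting shown is a simple term count (bag of words), which is only one possible weighting scheme.

```python
from collections import Counter

# Hypothetical headlines and stopword list, for illustration only.
HEADLINES = [
    "Kasus positif corona di Jakarta bertambah",
    "Kasus positif corona di Surabaya menurun",
    "Pemerintah umumkan kasus positif COVID-19 hari ini",
]
STOPWORDS = {"di", "hari", "ini"}


def preprocess(title):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [tok for tok in title.lower().split() if tok not in STOPWORDS]


def add_bigrams(tokens_per_doc, min_count=2):
    """Append bigrams that occur at least min_count times across the corpus."""
    counts = Counter(
        (a, b) for tokens in tokens_per_doc for a, b in zip(tokens, tokens[1:])
    )
    frequent = {pair for pair, n in counts.items() if n >= min_count}
    return [
        tokens
        + [f"{a}_{b}" for a, b in zip(tokens, tokens[1:]) if (a, b) in frequent]
        for tokens in tokens_per_doc
    ]


def build_dictionary(tokens_per_doc):
    """Map each distinct token to an integer id, in sorted order."""
    vocab = sorted({tok for tokens in tokens_per_doc for tok in tokens})
    return {tok: idx for idx, tok in enumerate(vocab)}


def to_bow(tokens, dictionary):
    """Represent one document as sorted (token_id, count) pairs."""
    counts = Counter(dictionary[tok] for tok in tokens)
    return sorted(counts.items())


docs = add_bigrams([preprocess(title) for title in HEADLINES])
dictionary = build_dictionary(docs)
corpus = [to_bow(doc, dictionary) for doc in docs]
```

The resulting `corpus`, a list of (token id, count) pairs per headline, is the kind of input an LDA implementation typically consumes. Stemming is omitted here to keep the sketch free of third-party dependencies.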

Keywords


Text Mining; Media Analytics; Topic Modeling; LDA; COVID-19 news



DOI: http://dx.doi.org/10.35671/telematika.v14i2.1225


Telematika
ISSN: 2442-4528 (online) | ISSN: 1979-925X (print)
Published by: Universitas Amikom Purwokerto
Jl. Let. Jend. POL SUMARTO Watumas, Purwonegoro - Purwokerto, Indonesia


This work is licensed under a Creative Commons Attribution 4.0 International License.