Optimizing Clustering of Indonesian Text Data Using Particle Swarm Optimization Algorithm: A Case Study of the Quran Translation

M Didik R Wahyudi, Agung Fatwanto

Abstract


The Quran, regarded by Muslims as their holy book, contains scientific and historical facts affirming Islam's truth, beauty, and influence on human life. Consequently, the text of the Quran and its translations are valuable sources for text mining research, particularly for studying the interrelationship of its verses. Clustering is one approach to grouping objects algorithmically, with K-Means being a prominent example. However, K-Means results are often suboptimal because the initial centroids are selected at random. To address this, the study proposes using the Particle Swarm Optimization (PSO) algorithm to select the centroids. In the hybrid PSO variant, the K-Means algorithm is first executed once; that run ends either upon reaching the maximum iteration limit or when the average shift in the centroid vectors falls below 0.0001. Evaluation of the clustering results from the three models (K-Means, K-Means with PSO, and K-Means with hybrid PSO) indicates that the K-Means algorithm produced the lowest Sum of Squared Error (SSE) value, 1032.19, while the hybrid PSO algorithm generated the highest Silhouette value, 0.258, and the lowest quantization error, 0.00947. Further evaluation using a confusion matrix showed an accuracy of 81.7% for K-Means, 82.5% for K-Means with PSO, and 91.1% for K-Means with hybrid PSO, the highest of the three models.
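The procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain Euclidean distance on small numeric vectors rather than TF-IDF document vectors, uses the quantization error (averaged over non-empty clusters) as the PSO fitness, and picks illustrative values for the swarm size, inertia weight `w`, and acceleration constants `c1` and `c2`. All function names are hypothetical. Only the 0.0001 threshold and the idea of seeding the swarm with one K-Means run come from the abstract.

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points]

def sse(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    labels = assign(points, centroids)
    return sum(dist(p, centroids[j]) ** 2 for p, j in zip(points, labels))

def quantization_error(points, centroids):
    """Mean, over non-empty clusters, of the average point-to-centroid distance."""
    labels = assign(points, centroids)
    per_cluster = {}
    for p, j in zip(points, labels):
        per_cluster.setdefault(j, []).append(dist(p, centroids[j]))
    return sum(sum(d) / len(d) for d in per_cluster.values()) / len(per_cluster)

def kmeans(points, centroids, max_iter=100, tol=1e-4):
    """K-Means that stops at max_iter or when the mean centroid shift is below tol."""
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # An empty cluster keeps its previous centroid (an assumption of this sketch).
            updated.append([sum(c) / len(members) for c in zip(*members)]
                           if members else list(old))
        shift = sum(dist(a, b) for a, b in zip(centroids, updated)) / len(centroids)
        centroids = updated
        if shift < tol:
            break
    return centroids

def hybrid_pso(points, k, n_particles=10, iters=50, w=0.72, c1=1.49, c2=1.49, seed=0):
    """PSO over sets of k centroids; particle 0 is seeded with one K-Means run."""
    rng = random.Random(seed)
    swarm = [[list(p) for p in rng.sample(points, k)] for _ in range(n_particles)]
    swarm[0] = kmeans(points, swarm[0])
    dim = len(points[0])
    vel = [[[0.0] * dim for _ in range(k)] for _ in range(n_particles)]
    pbest = [[list(c) for c in particle] for particle in swarm]
    pbest_fit = [quantization_error(points, particle) for particle in swarm]
    g = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = [list(c) for c in pbest[g]], pbest_fit[g]
    for _ in range(iters):
        for i, particle in enumerate(swarm):
            for j in range(k):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    # Standard velocity update: inertia + cognitive + social terms.
                    vel[i][j][d] = (w * vel[i][j][d]
                                    + c1 * r1 * (pbest[i][j][d] - particle[j][d])
                                    + c2 * r2 * (gbest[j][d] - particle[j][d]))
                    particle[j][d] += vel[i][j][d]
            fit = quantization_error(points, particle)
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = [list(c) for c in particle], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = [list(c) for c in particle], fit
    return gbest

# Toy data: two well-separated groups of 2-D points.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centroids = hybrid_pso(points, k=2)
print(sse(points, centroids), quantization_error(points, centroids))
```

Because the swarm's global best is initialized from a set that includes the K-Means seed, the centroids returned by this sketch never have a higher quantization error than that seed. Their SSE may still be higher, which would be consistent with the abstract reporting the lowest SSE for plain K-Means.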

Keywords


Text Mining; K-Means; Particle Swarm Optimization; Quran Mining; Quran Translation



DOI: http://dx.doi.org/10.35671/telematika.v17i1.2724


Telematika
ISSN: 2442-4528 (online) | ISSN: 1979-925X (print)
Published by : Universitas Amikom Purwokerto
Jl. Let. Jend. POL SUMARTO Watumas, Purwonegoro - Purwokerto, Indonesia


This work is licensed under a Creative Commons Attribution 4.0 International License.