Optimizing Clustering of Indonesian Text Data Using Particle Swarm Optimization Algorithm: A Case Study of the Quran Translation

M Didik R Wahyudi, Agung Fatwanto


The Quran, regarded by Muslims as their holy book, contains scientific and historical facts affirming Islam's truth, beauty, and influence on human life. Consequently, the Quran text and its translations are valuable sources for text mining research, particularly for studying the interrelationship of its verses. Clustering is one approach to grouping such objects algorithmically, with K-Means being a prominent example. However, K-Means results are often suboptimal because the initial centroids are selected at random. To address this, the study proposes using the Particle Swarm Optimization (PSO) algorithm to select the centroids. Three models are compared: K-Means, K-Means with PSO, and K-Means with hybrid PSO. The hybrid PSO algorithm begins with a single iteration of the K-Means algorithm and terminates either upon reaching the maximum iteration limit or when the average shift of the centroid (center-of-mass) vectors falls below 0.0001. Evaluation of the clustering results from the three models indicates that the K-Means algorithm produced the lowest Sum of Squared Error (SSE) value, 1032.19, while the hybrid PSO algorithm produced the highest Silhouette value, 0.258, and the lowest quantization value, 0.00947. Further evaluation using a confusion matrix showed an accuracy of 81.7% for K-Means, 82.5% for K-Means with PSO, and 91.1% for K-Means with hybrid PSO, the highest among the three clustering models.
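The stopping rule and the SSE measure mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy two-dimensional points, and the hand-picked initial centroids are assumptions made for illustration. In the study, the initial centroids would come from PSO and the points would be vectors derived from the translated verses.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points]

def update(points, labels, centroids):
    """Recompute each centroid as the mean of its members.
    An empty cluster keeps its previous centroid."""
    new = []
    for k, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:
            new.append(list(old))
            continue
        new.append([sum(m[d] for m in members) / len(members)
                    for d in range(len(old))])
    return new

def kmeans(points, init_centroids, max_iter=100, tol=1e-4):
    """K-Means that stops at max_iter or when the average centroid
    shift falls below tol (0.0001, the threshold quoted in the abstract)."""
    centroids = [list(c) for c in init_centroids]
    for _ in range(max_iter):
        labels = assign(points, centroids)
        new_centroids = update(points, labels, centroids)
        avg_shift = sum(math.sqrt(sq_dist(a, b))
                        for a, b in zip(centroids, new_centroids)) / len(centroids)
        centroids = new_centroids
        if avg_shift < tol:
            break
    return centroids, assign(points, centroids)

def sse(points, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))

# Toy data (hypothetical): two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
# Hypothetical starting centroids, standing in for PSO-selected ones.
init = [(0, 0), (10, 10)]

centroids, labels = kmeans(points, init)
total_sse = sse(points, centroids, labels)
print(labels, round(total_sse, 4))
```

A lower SSE indicates tighter clusters, which is why the choice of initial centroids matters: different starting points can lead K-Means to different local optima.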


Text Mining; K-Means; Particle Swarm Optimization; Quran Mining; Quran Translation

