IMPUTATION OF MISSING ATTRIBUTE VALUES IN THE HEPATITIS DATASET BASED ON THE RIPPER ALGORITHM

Tri Astuti, Yuliyanti Yuliyanti

Abstract


Hepatitis is a liver disease caused by a hepatitis virus. It is now a global health problem, including in Indonesia. Chronic hepatitis can lead to cirrhosis and liver cancer, so early diagnosis is needed. Hepatitis is now commonly diagnosed with computer-aided methods that use a hepatitis dataset. The University of California, Irvine (UCI) Machine Learning Repository provides a publicly accessible hepatitis dataset, but the dataset contains many missing values. Missing values in the dataset may affect the quality of the analysis results, so they need to be handled. Machine-learning-based imputation is one way to handle missing values. The aim of this research is to develop a missing-value imputation method for the hepatitis dataset using a machine learning algorithm based on RIPPER. The results show that the RIPPER-based imputation method achieves 87.50% accuracy on the hepatitis dataset. The developed method is expected to help clinicians and practitioners diagnose hepatitis by providing an imputed hepatitis dataset.


Keywords


Hepatitis; missing values; imputation





DOI: http://dx.doi.org/10.35671/telematika.v8i2.395


Telematika

ISSN 2442-4528 (online) | ISSN 1979-925X (print)
Published by : Universitas Amikom Purwokerto
Jl. Let. Jend. POL SUMARTO Watumas, Purwonegoro - Purwokerto Telp (0281) 623321 Fax (0281) 621662
Email: telematika@amikompurwokerto.ac.id

This work is licensed under a Creative Commons Attribution 4.0 International License.