Diabetes prediction using feature selection and classification

Khyati K. Gandhi; Prof. Nilesh B. Prajapati

Authors

Khyati K. Gandhi, PG Student, CE Department, BVM Engg. College, Vallabh Vidhyanagar, kvmehta108@gmail.com
Prof. Nilesh B. Prajapati, IT Department, BVM Engg. College, Vallabh Vidhyanagar, nilesh.prajapati@bvmengineering.ac.in

Keywords:

Data mining, Feature selection, F-score, SVM classifier, K-means clustering.

Abstract

Medical data mining is becoming increasingly important in healthcare. The diversity of
medical data collected and stored for diagnosis and prognosis, together with the wide availability of
data mining techniques to process these data, places medical data mining in a unique position to
improve patient care. Medical data are high dimensional in nature. They contain irrelevant and
redundant features that reduce prediction accuracy, so data pre-processing is required to prepare the
data for the mining task. Feature selection has been an active and fruitful field of research and
development for decades in statistical machine learning and data mining. It is effective in enhancing
learning efficiency, increasing predictive accuracy, and reducing the complexity of learned results.
Feature selection is a pre-processing technique that selects an optimal subset of features from the
full feature set. In this work, the F-score method and K-means clustering are used for feature
selection. The performance of the SVM classifier is empirically evaluated on the reduced feature
subset of the Pima Indians Diabetes dataset, one of the standard datasets available at the UCI Machine
Learning Repository and widely used to test the prediction accuracy of data mining algorithms in
diabetes data classification.
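
The abstract does not state which F-score formulation is used. As an illustration only, the sketch below assumes the common two-class definition: the squared distances of the positive-class and negative-class means from the overall mean, divided by the sum of the two within-class sample variances. The function names, the feature names, and the toy values are hypothetical and are not taken from the Pima Indians Diabetes dataset or from the authors' implementation.

```python
from statistics import mean, variance


def f_score(values, labels):
    """F-score of one feature for binary labels (1 = positive, 0 = negative).

    Assumed definition: between-class separation of the class means,
    divided by the sum of the within-class sample variances.
    Each class needs at least two samples, otherwise variance() raises.
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    overall = mean(values)
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)  # sample variances (n - 1 denominator)
    return between / within if within > 0 else 0.0


def rank_features(rows, labels, names):
    """Return (name, F-score) pairs sorted from most to least discriminative."""
    scores = {}
    for j, name in enumerate(names):
        column = [row[j] for row in rows]
        scores[name] = f_score(column, labels)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data (made up for illustration): two features, six samples.
rows = [
    [1.0, 5.0],
    [2.0, 3.0],
    [3.0, 4.0],
    [7.0, 4.0],
    [8.0, 5.0],
    [9.0, 3.0],
]
labels = [0, 0, 0, 1, 1, 1]
names = ["feature_a", "feature_b"]

ranking = rank_features(rows, labels, names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy example, feature_a separates the two classes and receives the higher score, while feature_b has identical class means and scores zero. A feature subset could then be formed by keeping the features whose scores exceed some threshold. How that threshold is chosen, and how K-means clustering enters the selection, is not specified in the abstract.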
