FEATURE SELECTION USING EXTRA TREES CLASSIFIER FOR PARKINSON’S DISEASE CLASSIFICATION

Authors:

Gauri Sabherwal, Amandeep Kaur, Uday Malhotra

DOI NO:

https://doi.org/10.26782/jmcms.spl.11/2024.05.00010

Keywords:

Parkinson's Disease, CSF, Feature Selection, Extra Tree Classifier, Machine Learning, Random Forest, Logistic Regression

Abstract

Parkinson's disease (PD) is a chronic, permanent, and life-threatening condition. Neuroprotective treatments for PD rely on early detection. Recent studies have demonstrated that clinical data, cerebrospinal fluid (CSF) based proteomes, and gene mutations are important biomarkers for accurate and early detection of PD. This study investigates heterogeneous data comprising CSF-based clinical data, CSF-based proteomic analysis data, and mutation information for the genes glucosidase beta acid (GBA) and leucine-rich repeat kinase 2 (LRRK2) to classify subjects as PD-affected or Healthy Control (HC). The dataset contains 1103 subjects (569 PD-affected and 534 HC). An Automated Machine Learning (AutoML) framework based on PyCaret is used. The study proposes an Extra Tree Classifier (ETC) as a feature selection mechanism to select the features that significantly affect PD classification. The selected features are then used to train Random Forest (RF), Logistic Regression (LR), and Decision Tree (DT) classifiers. Accuracy, sensitivity, specificity, area under the receiver operating characteristic curve (AUC-ROC), and the confusion matrix are used to evaluate classifier performance. RF performed best, with an accuracy of 96.12%, a sensitivity of 95.59%, and a specificity of 95.34%, while LR achieved the highest AUC value of 98.33. RF also made the highest number of correct predictions, 316 out of 331.
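The two-stage idea in the abstract (tree-based feature selection, then a downstream classifier scored on sensitivity, specificity, and AUC-ROC) can be illustrated with a minimal sketch. This is not the authors' pipeline: it uses scikit-learn directly rather than PyCaret, runs on synthetic data from `make_classification` instead of the CSF cohort, and assumes a 30% hold-out split, a mean-importance selection threshold, and 200 trees per ensemble. The function name `run_sketch` is hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split


def run_sketch(seed=0):
    # Synthetic stand-in for the real cohort (assumption): 1103 samples, 40 features.
    X, y = make_classification(
        n_samples=1103, n_features=40, n_informative=8, random_state=seed
    )
    # Assumed 70/30 stratified split; the abstract does not state the split used.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )

    # Stage 1: fit Extra Trees on the training data and keep the features
    # whose importance is at least the mean importance (assumed threshold).
    selector = SelectFromModel(
        ExtraTreesClassifier(n_estimators=200, random_state=seed),
        threshold="mean",
    )
    selector.fit(X_train, y_train)
    X_train_sel = selector.transform(X_train)
    X_test_sel = selector.transform(X_test)

    # Stage 2: train a Random Forest on the selected features only.
    clf = RandomForestClassifier(n_estimators=200, random_state=seed)
    clf.fit(X_train_sel, y_train)
    y_pred = clf.predict(X_test_sel)
    y_score = clf.predict_proba(X_test_sel)[:, 1]

    # Derive the reported metrics from the confusion matrix.
    tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
    return {
        "n_selected": X_train_sel.shape[1],
        "n_test": int(tn + fp + fn + tp),
        "accuracy": (tp + tn) / (tn + fp + fn + tp),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "auc": roc_auc_score(y_test, y_score),
    }


if __name__ == "__main__":
    for name, value in run_sketch().items():
        print(name, value)
```

Fitting the selector on the training split only keeps the held-out samples from influencing which features are chosen. The numbers this sketch prints come from synthetic data and are not comparable to the results reported above.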

