An Efficient and Effective Model to Handle Missing Data in Classification

Co-authors

Ayatollahi, Seyyed Mohammad Taghi
Mehrabani-Zeinabad, Kamran
Doostfatemeh, Marziyeh

Source

BioMed Research International

Issue

Vol. 2020, no. 2020 (31 December 2020), pp. 1-11, 11 p.

Publisher

Hindawi Publishing Corporation

Publication date

2020-11-25

Country of publication

Egypt

Number of pages

11

Main subjects

Medicine

Abstract EN

Missing data is one of the most important causes of reduced classification accuracy.

Many real datasets suffer from missing values, especially in medical sciences.

Imputation is a common way to deal with incomplete datasets.

There are various imputation methods that can be applied, and the choice of the best method depends on the dataset conditions such as sample size, missing percent, and missing mechanism.

Therefore, the better solution is to classify incomplete datasets without imputation and without any loss of information.

The structure of the “Bayesian additive regression trees” (BART) model is improved with the “Missingness Incorporated in Attributes” approach to solve its inefficiency in handling the missingness problem.

Implementation of MIA-within-BART is named “BART.m”.
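
To make the idea concrete, the following is a minimal sketch of how "Missingness Incorporated in Attributes" widens the set of split rules for a single tree node, assuming the usual three-option description of MIA: missing values sent left, missing values sent right, or a split on missingness itself. The names `mia_route`, `gini`, and `best_mia_split` are hypothetical and are not taken from BART.m. BART proposes split rules stochastically inside its sampler; the greedy Gini search here is a stand-in used only to show the enlarged rule set.

```python
import math
from collections import Counter

# The three MIA rule types (assumed from the common description of MIA).
RULES = ("missing_left", "missing_right", "missing_vs_observed")


def is_missing(value):
    """Treat None and NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def mia_route(value, threshold, rule):
    """Return True if the observation goes to the left child under `rule`."""
    missing = is_missing(value)
    if rule == "missing_left":
        # Observed values <= threshold and all missing values go left.
        return missing or value <= threshold
    if rule == "missing_right":
        # Only observed values <= threshold go left; missing values go right.
        return (not missing) and value <= threshold
    if rule == "missing_vs_observed":
        # Missing values go left, every observed value goes right.
        return missing
    raise ValueError(f"unknown rule: {rule}")


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_mia_split(xs, ys, thresholds):
    """Greedy search over MIA rules (illustration only, not BART's sampler)."""
    best = None
    for rule in RULES:
        # The missing-vs-observed rule needs no threshold.
        candidates = [None] if rule == "missing_vs_observed" else thresholds
        for t in candidates:
            left = [y for x, y in zip(xs, ys) if mia_route(x, t, rule)]
            right = [y for x, y in zip(xs, ys) if not mia_route(x, t, rule)]
            if not left or not right:
                continue  # skip degenerate splits
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
            if best is None or score < best[0]:
                best = (score, rule, t)
    return best


# Toy data in which missingness itself predicts the class.
xs = [1.0, 2.0, None, float("nan"), 3.0, None]
ys = [0, 0, 1, 1, 0, 1]
print(best_mia_split(xs, ys, thresholds=[1.5, 2.5]))
```

Because missing values are routed by the split rule itself, no imputation step is needed before fitting or prediction, and a variable whose missingness carries signal can be used directly.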

As the abilities of BART.m have not been investigated in the classification of incomplete datasets, this simulation-based study aimed to provide such a resource.

The results indicate that BART.m can be used even for datasets with a missing percent of 90 and, more importantly, that it identifies irrelevant variables and removes them on its own.

BART.m outperforms common models for classification with incomplete data in terms of accuracy and computational time.

Based on the revealed properties, BART.m is a high-accuracy model for the classification of incomplete datasets that avoids any assumptions and preprocessing steps.

American Psychological Association (APA) citation style

Mehrabani-Zeinabad, Kamran & Doostfatemeh, Marziyeh & Ayatollahi, Seyyed Mohammad Taghi. 2020. An Efficient and Effective Model to Handle Missing Data in Classification. BioMed Research International, Vol. 2020, no. 2020, pp. 1-11.
https://search.emarefa.net/detail/BIM-1137664

Modern Language Association (MLA) citation style

Mehrabani-Zeinabad, Kamran…[et al.]. An Efficient and Effective Model to Handle Missing Data in Classification. BioMed Research International No. 2020 (2020), pp. 1-11.
https://search.emarefa.net/detail/BIM-1137664

American Medical Association (AMA) citation style

Mehrabani-Zeinabad, Kamran & Doostfatemeh, Marziyeh & Ayatollahi, Seyyed Mohammad Taghi. An Efficient and Effective Model to Handle Missing Data in Classification. BioMed Research International. 2020. Vol. 2020, no. 2020, pp. 1-11.
https://search.emarefa.net/detail/BIM-1137664

Data type

Articles

Text language

English

Notes

Includes bibliographical references

Record number

BIM-1137664