The Prediction of Diatom Abundance by Comparison of Various Machine Learning Methods
Co-authors
Shin, Yuna
Lee, Heesuk
Lee, Young-Joo
Seo, Dae Keun
Jeong, Bomi
Hong, Seoksu
Kim, Jaehoon
Kim, Taekgeun
Lee, Jae-Kyeong
Heo, Tae-Young
Source
Mathematical Problems in Engineering
Issue
Vol. 2019, no. 2019 (31 December 2019), pp. 1-13, 13 p.
Publisher
Hindawi Publishing Corporation
Publication date
2019-05-27
Country of publication
Egypt
Number of pages
13
Main subjects
Abstract (EN)
This study adopts two approaches to analyze the occurrence of algae at Haman Weir on the Nakdong River: one is a traditional statistical method (logistic regression), and the other is a set of machine learning techniques (kNN, ANN, RF, Bagging, Boosting, and SVM).
To compare the performance of the models, this study measured accuracy, specificity, sensitivity, and AUC, which are representative model evaluation measures.
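As a minimal sketch of the threshold-based measures named above, the following Python computes accuracy, sensitivity, and specificity from binary labels. The function names and the toy labels are illustrative assumptions, not data or code from the study, and the sketch assumes both classes are present so that no denominator is zero.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def classification_metrics(y_true, y_pred):
    """Return accuracy, sensitivity (true positive rate), and specificity (true negative rate)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical toy labels: 1 = algae occurrence, 0 = no occurrence.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

All three measures depend on the hard class labels, and therefore on whatever cut-off turned model scores into labels.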
The ROC curve is created by plotting sensitivity against (1-specificity).
The AUC, the area under the ROC curve, summarizes sensitivity and specificity in a single value.
This measure has two advantages over other evaluation tools.
One is that it is scale-invariant.
That is, the AUC measures how well the model ranks its predictions rather than their absolute values.
The other is that the AUC is classification-threshold-invariant.
The AUC does not depend on any single threshold because the curve is plotted from the sensitivity and (1-specificity) values obtained across all thresholds.
Because of these two advantages, we chose AUC as the final model evaluation tool.
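The two properties described above can be illustrated with a small sketch. It builds the ROC curve by sweeping every observed score as a threshold, integrates it with the trapezoidal rule, and then checks that a monotone rescaling of the scores leaves the AUC unchanged. The function names and toy scores are assumptions for illustration only, and the sketch assumes both classes are present.

```python
import math


def roc_points(y_true, scores):
    """Return (1 - specificity, sensitivity) pairs, one per candidate threshold."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    # Each distinct score serves as a threshold: predict positive when score >= threshold.
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(y_true, scores):
    """Area under the ROC curve by the trapezoidal rule."""
    pts = roc_points(y_true, scores)
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


# Hypothetical toy data: labels and model scores.
y_true = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]

auc_raw = auc(y_true, scores)
# A monotone transform changes the scale of the scores but not their ranking.
auc_log = auc(y_true, [math.log(s) for s in scores])
print(auc_raw, auc_log)
```

No threshold is chosen anywhere in `auc`, which is the threshold invariance; the equal values for the raw and log-transformed scores show the scale invariance.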
Variable selection was also conducted using the Boruta algorithm.
In addition, we sought to identify the better model by comparing models built with and without variable selection.
In the analysis, the Boruta algorithm suggested PO4-P, DO, BOD, NH3-N, Susp, pH, TOC, Temp, TN, and TP as significant explanatory variables.
Models with and without these selected variables were then compared.
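The core idea behind Boruta is to compare each real variable against shuffled "shadow" copies that carry no information about the target. The sketch below is a deliberately simplified illustration of that idea, not the Boruta algorithm as used in the study: it swaps in absolute Pearson correlation for random forest importance and a fixed hit-rate cut-off for Boruta's statistical test. All names, parameters, and the synthetic data are hypothetical.

```python
import random


def abs_correlation(xs, ys):
    """Absolute Pearson correlation, used here as a stand-in importance score."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov) / (vx * vy) ** 0.5


def shadow_feature_selection(features, target, n_rounds=50, min_hit_rate=0.9, seed=0):
    """Keep features that beat the best shuffled shadow copy in most rounds."""
    rng = random.Random(seed)
    hits = {name: 0 for name in features}
    for _ in range(n_rounds):
        shadow_scores = []
        for values in features.values():
            shadow = list(values)
            rng.shuffle(shadow)  # shuffling breaks any link with the target
            shadow_scores.append(abs_correlation(shadow, target))
        best_shadow = max(shadow_scores)
        for name, values in features.items():
            if abs_correlation(values, target) > best_shadow:
                hits[name] += 1
    return [name for name, h in hits.items() if h / n_rounds >= min_hit_rate]


# Synthetic data (assumption): "signal" drives the target, "noise" does not.
rng = random.Random(1)
n = 200
signal = [rng.random() for _ in range(n)]
noise = [rng.random() for _ in range(n)]
target = [1 if s > 0.5 else 0 for s in signal]

selected = shadow_feature_selection({"signal": signal, "noise": noise}, target)
print(selected)
```

In practice a dedicated Boruta implementation would be used instead of this sketch; the point here is only to show why a variable must outperform its randomized counterparts to be retained.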
Among the models without variable selection, RF showed the highest accuracy and ANN showed the highest AUC.
In conclusion, ANN with variable selection showed the best performance among all models, with and without variable selection.
American Psychological Association (APA) citation style
Shin, Yuna & Lee, Heesuk & Lee, Young-Joo & Seo, Dae Keun & Jeong, Bomi & Hong, Seoksu…[et al.]. 2019. The Prediction of Diatom Abundance by Comparison of Various Machine Learning Methods. Mathematical Problems in Engineering, Vol. 2019, no. 2019, pp. 1-13.
https://search.emarefa.net/detail/BIM-1196190
Modern Language Association (MLA) citation style
Shin, Yuna…[et al.]. The Prediction of Diatom Abundance by Comparison of Various Machine Learning Methods. Mathematical Problems in Engineering No. 2019 (2019), pp. 1-13.
https://search.emarefa.net/detail/BIM-1196190
American Medical Association (AMA) citation style
Shin, Yuna & Lee, Heesuk & Lee, Young-Joo & Seo, Dae Keun & Jeong, Bomi & Hong, Seoksu…[et al.]. The Prediction of Diatom Abundance by Comparison of Various Machine Learning Methods. Mathematical Problems in Engineering. 2019. Vol. 2019, no. 2019, pp. 1-13.
https://search.emarefa.net/detail/BIM-1196190
Data type
Articles
Text language
English
Notes
Includes bibliographical references
Record number
BIM-1196190