Prediction of O-glycosylation site using pre-trained language model and machine learning
Co-authors
Rushdi, Muhammad
Salim, Abd al-Badi M.
al-Khulani, al-Hasan
Jad, Wala H.
Source
International Journal of Intelligent Computing and Information Sciences
Issue
Vol. 23, no. 1 (31 March 2023), pp. 41-52, 12 p.
Publisher
Ain Shams University, Faculty of Computer and Information Sciences
Publication date
2023-03-31
Country of publication
Egypt
Number of pages
12
Main subjects
Information Technology and Computer Science
Topics
Abstract (EN)
O-glycosylation is a common type of protein post-translational modification (PTM) that is linked to several diseases and plays significant roles in many biological processes.
Identification of O-glycosylation sites is important for understanding the mechanism of the O-glycosylation process.
However, identifying PTM sites with laboratory experiments is time-consuming and costly.
Thus, computational and artificial intelligence methods are becoming essential for predicting O-glycosylation sites.
In this paper, we propose a new model to improve O-glycosylation site prediction using a transformer-based protein language model and machine learning.
The dataset was collected and prepared from a recent data source, the O-glycoprotein repository (OGP).
The TAPE (Tasks Assessing Protein Embeddings) protein language model was used to extract features from the peptide sequences using the embedding strategy.
Then, feature selection was performed using a linear support vector machine (SVM) to select informative features.
The XGBoost ensemble-based machine learning method was used for classification and prediction.
The proposed model achieved 0.7761 accuracy, 0.7391 sensitivity, 0.8130 specificity, 0.8295 AUC, and 0.5537 MCC, results that compare favorably with traditional machine learning methods.
On an independent dataset, the proposed method performed better than the latest available methods for predicting O-glycosylation sites.
American Psychological Association (APA) citation style
al-Khulani, al-Hasan & Jad, Wala H. & Rushdi, Muhammad & Salim, Abd al-Badi M. 2023. Prediction of O-glycosylation site using pre-trained language model and machine learning. International Journal of Intelligent Computing and Information Sciences, Vol. 23, no. 1, pp. 41-52.
https://search.emarefa.net/detail/BIM-1460750
Modern Language Association (MLA) citation style
Jad, Wala H.…[et al.]. Prediction of O-glycosylation site using pre-trained language model and machine learning. International Journal of Intelligent Computing and Information Sciences Vol. 23, no. 1 (Mar. 2023), pp. 41-52.
https://search.emarefa.net/detail/BIM-1460750
American Medical Association (AMA) citation style
al-Khulani, al-Hasan & Jad, Wala H. & Rushdi, Muhammad & Salim, Abd al-Badi M. Prediction of O-glycosylation site using pre-trained language model and machine learning. International Journal of Intelligent Computing and Information Sciences. 2023. Vol. 23, no. 1, pp. 41-52.
https://search.emarefa.net/detail/BIM-1460750
Data type
Articles
Text language
English
Notes
Includes bibliographical references: p. 50-52
Record number
BIM-1460750