A new vector representation of short texts for classification

Co-authors

Liu, Bo
Li, Yangyang

Source

The International Arab Journal of Information Technology

Issue

Vol. 17, no. 2 (31 March 2020), pp. 241-249, 9 p.

Publisher

Zarqa University, Deanship of Scientific Research

Publication date

2020-03-31

Country of publication

Jordan

Number of pages

9

Main subjects

Engineering and Technology Sciences (Multidisciplinary)

Abstract (EN)

The short length and sparsity of short texts, together with synonyms and homonyms, are the main obstacles to short-text classification.

In recent years, research on short-text classification has focused on expanding short texts but has done little to guarantee the validity of the expanded words.

This study proposes a new method to weaken these effects without external knowledge.

The proposed method analyses short texts using a topic model based on Latent Dirichlet Allocation (LDA), represents each short text with a vector space model, and presents a new method to adjust the vector of each short text.

In the experiments, two open short-text data sets composed of Google News and web search snippets are utilised to evaluate the classification performance and prove the effectiveness of our method.
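The pipeline outlined in the abstract (an LDA topic model, a vector space representation, and an adjustment of each short-text vector) can be illustrated with a minimal sketch. The abstract does not state the paper's adjustment formula, so the `adjust_vector` function below is a hypothetical stand-in, not the authors' method: it blends each term-frequency vector with the word probabilities of the topics that the text's words point to. The function names, the hyperparameters (`alpha`, `beta`, `mix`), and the toy documents are all assumptions made for illustration.

```python
import random
from collections import Counter


def train_lda(docs, n_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling.

    Returns (vocab, phi), where phi[k][w] estimates P(word w | topic k).
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics

    def bump(d, k, w, delta):
        doc_topic[d][k] += delta
        topic_word[k][w] += delta
        topic_total[k] += delta

    # Random initial topic assignment for every token.
    assign = []
    for d, doc in enumerate(docs):
        assign.append([rng.randrange(n_topics) for _ in doc])
        for word, k in zip(doc, assign[d]):
            bump(d, k, index[word], 1)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                w = index[word]
                bump(d, assign[d][i], w, -1)  # remove the token's current topic
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                assign[d][i] = rng.choices(range(n_topics), weights=weights)[0]
                bump(d, assign[d][i], w, 1)

    phi = [
        [(topic_word[k][w] + beta) / (topic_total[k] + n_words * beta)
         for w in range(n_words)]
        for k in range(n_topics)
    ]
    return vocab, phi


def tf_vector(doc, vocab):
    """Vector space model: relative term frequencies (doc must be non-empty)."""
    counts = Counter(doc)
    return [counts[w] / len(doc) for w in vocab]


def adjust_vector(vec, phi, mix=0.5):
    """HYPOTHETICAL adjustment (not the paper's formula).

    Infers topic weights from the words present, assuming a uniform topic
    prior, then blends the raw vector with the expected word distribution.
    """
    n_topics, n_words = len(phi), len(vec)
    theta = [0.0] * n_topics
    for w in range(n_words):
        if vec[w] > 0:
            column = sum(phi[k][w] for k in range(n_topics))
            for k in range(n_topics):
                theta[k] += vec[w] * phi[k][w] / column
    smooth = [sum(theta[k] * phi[k][w] for k in range(n_topics))
              for w in range(n_words)]
    return [(1 - mix) * vec[w] + mix * smooth[w] for w in range(n_words)]


# Toy corpus (invented for illustration only).
docs = [
    ["stock", "market", "bank"],
    ["bank", "loan", "market"],
    ["football", "match", "goal"],
    ["goal", "team", "match"],
]
vocab, phi = train_lda(docs, n_topics=2)
raw = tf_vector(["bank", "market"], vocab)
adjusted = adjust_vector(raw, phi)
```

In this sketch the adjusted vector gives non-zero weight to words that never occur in the short text but share a topic with its words, which is one way a representation could soften sparsity and synonymy without external knowledge. The adjusted vectors could then be passed to any standard classifier.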

American Psychological Association (APA) citation style

Li, Yangyang & Liu, Bo. 2020. A new vector representation of short texts for classification. The International Arab Journal of Information Technology, Vol. 17, no. 2, pp. 241-249.
https://search.emarefa.net/detail/BIM-954659

Modern Language Association (MLA) citation style

Li, Yangyang & Liu, Bo. A new vector representation of short texts for classification. The International Arab Journal of Information Technology Vol. 17, no. 2 (Mar. 2020), pp. 241-249.
https://search.emarefa.net/detail/BIM-954659

American Medical Association (AMA) citation style

Li, Yangyang & Liu, Bo. A new vector representation of short texts for classification. The International Arab Journal of Information Technology. 2020. Vol. 17, no. 2, pp. 241-249.
https://search.emarefa.net/detail/BIM-954659

Data type

Articles

Text language

English

Notes

Includes bibliographical references: p. 248-249

Record number

BIM-954659