Designing Punjabi poetry classifiers using machine learning and different textual features

Co-authors

Kaur, Jasleen
Saini, Jatinderkumar

Source

The International Arab Journal of Information Technology

Issue

Vol. 17, No. 1 (31 January 2020), pp. 38-44, 7 p.

Publisher

Zarqa University, Deanship of Scientific Research

Publication Date

2020-01-31

Country of Publication

Jordan

Number of Pages

7

Main Subjects

Information Technology and Computer Science

Topics

Abstract (EN)

Analysis of poetic text is very challenging from a computational linguistics perspective.

Computational analysis of the literary arts, especially poetry, is a very difficult classification task.

For a library recommendation system, poems can be classified by various criteria such as poet, time period, sentiment, and subject matter.

In this work, a content-based Punjabi poetry classifier was developed using the Weka toolset.

Four different categories were manually populated with 2034 poems: Nature and Festival (NAFE), Linguistic and Patriotic (LIPA), Relation and Romantic (RORE), and Philosophy and Spiritual (PHSP), consisting of 505, 399, 529, and 601 poems, respectively.

These poems were passed through several pre-processing sub-phases: tokenization, noise removal, stop word removal, and special symbol removal.
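The pre-processing sub-phases can be illustrated with a minimal Python sketch. It is not the authors' Weka pipeline: the stop word list, the function name `preprocess`, and the choice to treat every character outside the Gurmukhi Unicode block as noise are assumptions made only for this example.

```python
import re

# Hypothetical stop word list, for illustration only; the list used
# in the paper is not given in this record.
STOP_WORDS = frozenset({"ਦਾ", "ਦੇ", "ਦੀ", "ਹੈ", "ਤੇ"})

# Gurmukhi digits occupy U+0A66..U+0A6F inside the Gurmukhi block.
GURMUKHI_DIGITS = re.compile(r"[\u0A66-\u0A6F]")
# Anything that is neither in the Gurmukhi block (U+0A00..U+0A7F) nor
# whitespace: Latin text, ASCII digits, punctuation, the danda, etc.
# An explicit range is used because Python's \w does not match the
# combining vowel signs that Gurmukhi words contain.
NON_GURMUKHI = re.compile(r"[^\u0A00-\u0A7F\s]")


def preprocess(poem, stop_words=STOP_WORDS):
    """Return the cleaned token list for one poem."""
    # Noise removal (assumed here to mean digits and foreign characters).
    text = GURMUKHI_DIGITS.sub(" ", poem)
    # Special symbol removal: punctuation and other non-Gurmukhi symbols.
    text = NON_GURMUKHI.sub(" ", text)
    # Tokenization on whitespace.
    tokens = text.split()
    # Stop word removal.
    return [t for t in tokens if t not in stop_words]
```

Calling `preprocess` on each poem would yield the token lists that a later weighting step consumes.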

The 31,938 extracted tokens were weighted using the Term Frequency (TF) and Term Frequency-Inverse Document Frequency (TF-IDF) weighting schemes.
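As a rough illustration of the two weighting schemes, the sketch below computes raw TF and a textbook TF-IDF, with idf = log(N / df), over tokenized documents. The exact variant applied by Weka in the paper is not stated in this record, so the formula and the function names `tf_weights` and `tfidf_weights` are assumptions.

```python
import math
from collections import Counter


def tf_weights(tokens):
    """Raw term frequency: how often each token occurs in one document."""
    return Counter(tokens)


def tfidf_weights(docs):
    """TF-IDF weights per document, using tf * log(N / df).

    `docs` is a list of token lists. N is the number of documents and
    df is the number of documents that contain the term.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        # Count each term once per document.
        doc_freq.update(set(doc))
    return [
        {
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in tf_weights(doc).items()
        }
        for doc in docs
    ]
```

Under this form, a token that appears in every document receives weight zero, which is why TF-IDF tends to suppress words that are common across all categories.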

Based on poetry elements, three different textual features (lexical, syntactic, and semantic) were used to develop classifiers with different machine learning algorithms.

Naive Bayes (NB), Support Vector Machine (SVM), Hyper pipes, and K-nearest neighbour algorithms were tested with these textual features.

The results revealed that semantic features performed better than lexical and syntactic features.

The best-performing algorithm was SVM, and the highest accuracy (76.02%) was achieved by incorporating semantic information associated with words.

American Psychological Association (APA) citation style

Kaur, Jasleen & Saini, Jatinderkumar. 2020. Designing Punjabi poetry classifiers using machine learning and different textual features. The International Arab Journal of Information Technology, Vol. 17, no. 1, pp. 38-44.
https://search.emarefa.net/detail/BIM-955148

Modern Language Association (MLA) citation style

Kaur, Jasleen & Saini, Jatinderkumar. Designing Punjabi poetry classifiers using machine learning and different textual features. The International Arab Journal of Information Technology, Vol. 17, no. 1 (Jan. 2020), pp. 38-44.
https://search.emarefa.net/detail/BIM-955148

American Medical Association (AMA) citation style

Kaur, Jasleen & Saini, Jatinderkumar. Designing Punjabi poetry classifiers using machine learning and different textual features. The International Arab Journal of Information Technology. 2020. Vol. 17, no. 1, pp. 38-44.
https://search.emarefa.net/detail/BIM-955148

Data Type

Articles

Text Language

English

Notes

Includes bibliographical references: p. 43-44

Record ID

BIM-955148