Optimizing support vector machine classification based on semantic-text knowledge enrichment

Joint Authors

Diab, Shadi
Hamaydih, Nasim

Source

Palestinian Journal of Technology and Applied Sciences

Issue

Vol. 2019, Issue 2 (31 Jan. 2019), pp.142-151, 10 p.

Publisher

al-Quds Open University Deanship of Scientific Research and Graduate Studies

Publication Date

2019-01-31

Country of Publication

Palestine (West Bank)

No. of Pages

10

Main Subjects

Information Technology and Computer Science

Abstract EN

In this research, we enhance the performance of the Support Vector Machine (SVM) in text classification by applying semantic-knowledge enrichment.

We propose a semantic-knowledge enrichment scheme that injects new concepts into the original content of the text documents.

A pre-processing technique is proposed that cleans the text and extracts features, from which semantic concepts are generated using the WordNet database and the open-source Natural Language Toolkit (NLTK).
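As a minimal sketch of what injecting concepts into a document might look like, the Python below appends concepts returned by a lookup function to the original tokens. The function names `enrich` and `wordnet_hypernyms`, the choice of first-sense hypernyms as the injected concepts, and the toy dictionary are all assumptions made for illustration; the abstract does not specify which WordNet relations the authors used.

```python
from typing import Callable, Iterable, List


def wordnet_hypernyms(word: str) -> List[str]:
    """Hypernym lemma names for the first WordNet sense of `word`.

    Assumption: first-sense hypernyms stand in for "semantic concepts".
    Requires NLTK with the WordNet corpus (nltk.download("wordnet")).
    """
    from nltk.corpus import wordnet as wn  # lazy import: only needed here

    synsets = wn.synsets(word)
    if not synsets:
        return []
    names: List[str] = []
    for hyper in synsets[0].hypernyms():
        names.extend(lemma.name().lower() for lemma in hyper.lemmas())
    return names


def enrich(tokens: Iterable[str], lookup: Callable[[str], List[str]]) -> List[str]:
    """Return the original tokens followed by newly injected concepts."""
    tokens = list(tokens)
    seen = set(tokens)
    enriched = list(tokens)
    for tok in tokens:
        for concept in lookup(tok):
            if concept not in seen:  # inject each new concept only once
                seen.add(concept)
                enriched.append(concept)
    return enriched


# Hypothetical toy lookup so the sketch runs without the WordNet corpus.
toy = {"dog": ["canine", "domestic_animal"], "cat": ["feline"]}
print(enrich(["dog", "chases", "cat"], lambda w: toy.get(w, [])))
```

With the WordNet corpus installed, `enrich(tokens, wordnet_hypernyms)` would draw the concepts from WordNet instead of the toy dictionary.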

Additionally, the online variational Bayes algorithm combined with the Latent Dirichlet Allocation (LDA) model is used as a dimensionality reduction technique to generate abstract concepts from the raw text.
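One way to experiment with LDA fitted by online variational Bayes is scikit-learn's `LatentDirichletAllocation` with `learning_method="online"`. The sketch below is an assumption about tooling, not the authors' implementation: the library choice, the four toy documents, and the two-topic setting are all hypothetical. It reduces each document from a bag-of-words vector to a short vector of topic proportions.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy corpus; the abstract does not describe the real datasets.
docs = [
    "the striker scored a late goal in the match",
    "the team won the league after a tense match",
    "the central bank raised interest rates again",
    "markets fell as the bank cut its growth forecast",
]

# LDA works on raw term counts, not TF-IDF weights.
counts = CountVectorizer(stop_words="english").fit_transform(docs)

lda = LatentDirichletAllocation(
    n_components=2,            # assumed number of topics
    learning_method="online",  # online variational Bayes updates
    random_state=0,
)

# Each row is a document expressed as proportions over the 2 topics.
doc_topics = lda.fit_transform(counts)
print(doc_topics.shape)  # (4, 2)
```

The reduced matrix `doc_topics` has one column per topic, so its width is fixed by `n_components` regardless of vocabulary size.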

In our experiment, we describe how the data are cleaned and transformed and how the feature vectors are weighted in a multi-dimensional space, as a step toward measuring the performance of SVM before and after applying the proposed approach to two different datasets.

K-fold cross-validation is used to validate the proposed approach.
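A k-fold evaluation of an SVM text classifier could be sketched as follows. Everything specific here is an assumption: scikit-learn as the library, TF-IDF as the feature weighting, a linear kernel, k = 4, stratified folds, and the toy texts and labels. The abstract does not state these choices.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical two-class toy data (4 documents per class).
texts = [
    "goal scored in the final match",
    "the team won the cup match",
    "striker injured before the match",
    "coach praises the team defence",
    "bank raises interest rates",
    "stock markets fall sharply",
    "bank cuts growth forecast",
    "investors sell bank shares",
]
labels = ["sport"] * 4 + ["finance"] * 4

# The vectorizer sits inside the pipeline so it is refit on each training
# fold and never sees the held-out documents.
model = make_pipeline(TfidfVectorizer(), LinearSVC())

folds = StratifiedKFold(n_splits=4, shuffle=True, random_state=0)
scores = cross_val_score(model, texts, labels, cv=folds, scoring="accuracy")
print(len(scores), scores.mean())
```

Running the same procedure on the original and on the enriched documents would give comparable before-and-after accuracy estimates.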

Moreover, a confusion matrix is used to measure accuracy and the macro-averages of precision, recall, and F1.
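For reference, accuracy and macro-averaged precision, recall, and F1 can be computed from a confusion matrix as in the sketch below. The function name `macro_metrics` and the example matrix are hypothetical. Macro F1 is taken here as the unweighted mean of the per-class F1 scores, which is one common convention and may differ from the one the authors used.

```python
from typing import List, Tuple


def macro_metrics(cm: List[List[int]]) -> Tuple[float, float, float, float]:
    """Return (accuracy, macro precision, macro recall, macro F1).

    cm[i][j] counts documents whose true class is i and predicted class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    # Macro-averaging weights every class equally, whatever its size.
    return (correct / total, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n)


# Hypothetical 2-class confusion matrix (rows: true, columns: predicted).
cm = [[8, 2],
      [1, 9]]
acc, prec, rec, f1 = macro_metrics(cm)
print(f"{acc:.3f} {prec:.3f} {rec:.3f} {f1:.3f}")
```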

The evaluation showed that accuracy improved from 94% to 98.3% for dataset-1 and from 88% to 93% for dataset-2.

Moreover, the training time of the classifier, measured in seconds, was reduced to 32% and 17% for dataset-1 and dataset-2, respectively, compared with the training time on the original data before applying the proposed enrichment scheme.

American Psychological Association (APA)

Diab, Shadi & Hamaydih, Nasim. 2019. Optimizing support vector machine classification based on semantic-text knowledge enrichment. Palestinian Journal of Technology and Applied Sciences, Vol. 2019, no. 2, pp. 142-151.
https://search.emarefa.net/detail/BIM-860576

Modern Language Association (MLA)

Diab, Shadi & Hamaydih, Nasim. Optimizing support vector machine classification based on semantic-text knowledge enrichment. Palestinian Journal of Technology and Applied Sciences No. 2 (Jan. 2019), pp. 142-151.
https://search.emarefa.net/detail/BIM-860576

American Medical Association (AMA)

Diab, Shadi & Hamaydih, Nasim. Optimizing support vector machine classification based on semantic-text knowledge enrichment. Palestinian Journal of Technology and Applied Sciences. 2019. Vol. 2019, no. 2, pp. 142-151.
https://search.emarefa.net/detail/BIM-860576

Data Type

Journal Articles

Language

English

Notes

Text in English ; abstracts in English and Arabic.

Record ID

BIM-860576