Investigating the efficiency of WordNet as background knowledge for document clustering

Joint Authors

Nafi, Rami
al-agah, Iyad

Source

Journal of Engineering Research and Technology

Issue

Vol. 2, Issue 2 (30 Jun. 2015), pp.152-158, 7 p.

Publisher

The Islamic University-Gaza Deanship of Research and Graduate Affairs

Publication Date

2015-06-30

Country of Publication

Palestine (Gaza Strip)

No. of Pages

7

Main Subjects

Economics & Business Administration

Abstract EN

Traditional techniques of document clustering do not consider the semantic relationships between words when assigning documents to clusters.

For instance, if two documents discuss the same topic using different words, these techniques may assign them to different clusters.
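The vocabulary-mismatch problem described above can be illustrated with a minimal sketch. It assumes a plain bag-of-words representation and cosine similarity, which are common choices but are not confirmed by this record as the authors' exact setup; the two sample documents and the function names are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase, split on whitespace, and count term frequencies."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency vectors."""
    # Only terms present in both documents contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical documents: same topic, no shared surface words.
doc_a = bag_of_words("physician treats illness")
doc_b = bag_of_words("doctor cures disease")

similarity = cosine_similarity(doc_a, doc_b)
print(similarity)  # 0.0, because the documents share no terms
```

Because the two vectors have no terms in common, a clustering algorithm that relies on this similarity has no evidence that the documents belong together.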

Many efforts have approached this problem by enriching the document’s representation with background knowledge from WordNet.
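One way such enrichment is commonly done is to append concept labels to each document's term vector. The sketch below is illustrative only: `TOY_CONCEPTS` is a hypothetical hand-written lookup standing in for WordNet, and the "add concepts alongside terms" strategy is an assumption, not necessarily one of the strategies evaluated in the article.

```python
import math
from collections import Counter

# Hypothetical toy lookup standing in for WordNet: each word maps to a
# shared concept label. A real system would query WordNet synsets instead.
TOY_CONCEPTS = {
    "physician": "concept:doctor",
    "doctor": "concept:doctor",
    "illness": "concept:disease",
    "disease": "concept:disease",
}


def enriched_bag(text):
    """Count terms, then add one count per token for its concept label."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    for tok in tokens:
        concept = TOY_CONCEPTS.get(tok)
        if concept is not None:
            counts[concept] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


plain_a = Counter("physician treats illness".split())
plain_b = Counter("doctor cures disease".split())
rich_a = enriched_bag("physician treats illness")
rich_b = enriched_bag("doctor cures disease")

print(cosine(plain_a, plain_b))  # no overlap without enrichment
print(cosine(rich_a, rich_b))    # shared concept labels create overlap
```

In this toy case the enriched vectors overlap on the two concept labels, so their similarity rises above zero. Whether this helps on real data depends on the dataset and on how concepts are selected.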

These efforts, however, have often produced conflicting results: while some studies claimed that WordNet could improve clustering performance through its ability to capture and estimate similarities between words, others claimed that it provided little or no enhancement to the obtained clusters.

This work aims to resolve the contradiction between these two camps experimentally, to explain why WordNet is useful in some cases but not in others, and to identify the factors that influence its use for document clustering.

We conducted a set of experiments in which WordNet was used for document clustering under various settings, including different datasets, different ways of incorporating semantics into the document’s representation, and different similarity measures.

Results showed that different experimental settings may yield different clusters: for example, the influence of WordNet’s semantic features varies according to the dataset used.

Results also revealed that WordNet-based similarity measures do not seem to improve clustering, and that no single measure guaranteed the best clustering results.

American Psychological Association (APA)

al-agah, Iyad & Nafi, Rami. 2015. Investigating the efficiency of WordNet as background knowledge for document clustering. Journal of Engineering Research and Technology, Vol. 2, no. 2, pp. 152-158.
https://search.emarefa.net/detail/BIM-717483

Modern Language Association (MLA)

al-agah, Iyad & Nafi, Rami. Investigating the efficiency of WordNet as background knowledge for document clustering. Journal of Engineering Research and Technology Vol. 2, no. 2 (Jun. 2015), pp. 152-158.
https://search.emarefa.net/detail/BIM-717483

American Medical Association (AMA)

al-agah, Iyad & Nafi, Rami. Investigating the efficiency of WordNet as background knowledge for document clustering. Journal of Engineering Research and Technology. 2015. Vol. 2, no. 2, pp. 152-158.
https://search.emarefa.net/detail/BIM-717483

Data Type

Journal Articles

Language

English

Notes

Includes bibliographical references: p. 157-158

Record ID

BIM-717483