Preceding document clustering by graph mining based maximal frequent termsets preservation

Joint Authors

Amjad, Muhammad
Shah, Sayyid Abd al-Baqi

Source

The International Arab Journal of Information Technology

Issue

Vol. 16, Issue 3 (31 May 2019), pp. 364-370, 7 p.

Publisher

Zarqa University

Publication Date

2019-05-31

Country of Publication

Jordan

No. of Pages

7

Main Subjects

Information Technology and Computer Science

Abstract EN

This paper presents an approach to cluster documents.

It introduces a novel graph mining based algorithm to find frequent termsets present in a document set.

The document set is initially mapped onto a bipartite graph.
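As an illustration of this mapping, the following minimal sketch builds a bipartite graph with documents on one side and terms on the other, linking a document to every term it contains. The whitespace tokenizer, the function name build_bipartite_graph, and the sample documents are assumptions made for this example and are not taken from the paper.

```python
from collections import defaultdict


def build_bipartite_graph(docs):
    """Return (doc_to_terms, term_to_docs) adjacency maps.

    An edge (document, term) exists when the term occurs in the document.
    Tokenization by lowercased whitespace split is an assumption of this sketch.
    """
    doc_to_terms = {}
    term_to_docs = defaultdict(set)
    for doc_id, text in docs.items():
        terms = set(text.lower().split())
        doc_to_terms[doc_id] = terms
        for term in terms:
            term_to_docs[term].add(doc_id)
    return doc_to_terms, dict(term_to_docs)


# Hypothetical sample documents, used only to exercise the function.
docs = {
    "d1": "graph mining finds frequent termsets",
    "d2": "frequent termsets guide document clustering",
    "d3": "graph clustering of documents",
}
doc_to_terms, term_to_docs = build_bipartite_graph(docs)
```

Keeping both adjacency maps makes it cheap to read off the support of a term, which is simply the number of documents attached to it.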

Based on the results of our algorithm, the document set is modified to reduce its dimensionality.
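The abstract does not describe the graph mining algorithm itself, so the sketch below swaps in a brute-force enumeration purely to illustrate the idea of preserving maximal frequent termsets: it finds termsets that occur in at least min_support documents, keeps the maximal ones, and drops every other term from each document. The function names, the min_support and max_size parameters, and the sample data are all assumptions of this example.

```python
from itertools import combinations


def frequent_termsets(doc_terms, min_support, max_size=3):
    """Brute-force search for termsets contained in >= min_support documents.

    This is NOT the paper's graph mining algorithm; it is a stand-in.
    Only termsets of up to max_size terms are enumerated.
    """
    vocab = sorted(set().union(*doc_terms.values()))
    frequent = []
    for size in range(1, max_size + 1):
        for combo in combinations(vocab, size):
            support = sum(1 for terms in doc_terms.values() if set(combo) <= terms)
            if support >= min_support:
                frequent.append(frozenset(combo))
    return frequent


def maximal_only(termsets):
    """Keep termsets that are not a proper subset of another listed termset."""
    return [s for s in termsets if not any(s < other for other in termsets)]


def reduce_documents(doc_terms, maximal):
    """Drop every term that belongs to no maximal frequent termset."""
    keep = set().union(*maximal) if maximal else set()
    return {doc_id: terms & keep for doc_id, terms in doc_terms.items()}


# Hypothetical sample data: each document is a set of terms.
doc_terms = {
    "d1": {"graph", "mining", "frequent", "termsets"},
    "d2": {"frequent", "termsets", "clustering"},
    "d3": {"graph", "clustering"},
}
maximal = maximal_only(frequent_termsets(doc_terms, min_support=2))
reduced = reduce_documents(doc_terms, maximal)
```

In this toy data the term "mining" appears in only one document, so it belongs to no frequent termset and is removed, which shrinks the vocabulary from five terms to four.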

Then, the Bisecting K-means algorithm is executed over the modified document set to obtain a set of very meaningful clusters.
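For readers unfamiliar with Bisecting K-means, here is a minimal pure-Python sketch of the general technique: start with one cluster holding all vectors and repeatedly split the cluster with the largest squared error into two using 2-means. The deterministic farthest-point initialization, the choice of squared error as the splitting criterion, and the sample vectors are assumptions of this example; the paper's exact configuration is not given in the abstract.

```python
def _dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def sse(points):
    """Sum of squared distances from each point to the cluster mean."""
    center = _mean(points)
    return sum(_dist2(p, center) for p in points)


def two_means(points, iters=20):
    """Split points into two groups with 2-means (deterministic init)."""
    centroid = _mean(points)
    first = max(points, key=lambda p: _dist2(p, centroid))
    second = max(points, key=lambda p: _dist2(p, first))
    centers = [first, second]
    split = (list(points), [])
    for _ in range(iters):
        left, right = [], []
        for p in points:
            nearer_first = _dist2(p, centers[0]) <= _dist2(p, centers[1])
            (left if nearer_first else right).append(p)
        if not left or not right:
            break  # keep the last valid split
        split = (left, right)
        centers = [_mean(left), _mean(right)]
    return split


def bisecting_kmeans(points, k):
    """Repeatedly bisect the cluster with the largest SSE until k clusters exist."""
    clusters = [list(points)]
    while len(clusters) < k:
        candidates = [c for c in clusters if len(c) > 1]
        if not candidates:
            break
        worst = max(candidates, key=sse)
        left, right = two_means(worst)
        if not right:
            break  # all points identical; nothing left to split
        clusters.remove(worst)
        clusters.extend([left, right])
    return clusters


# Hypothetical 2-D document vectors forming three obvious groups.
points = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9), (5.0, 5.0), (5.1, 4.9)]
clusters = bisecting_kmeans(points, k=3)
```

Because each step only ever runs 2-means on one cluster, the method tends to be less sensitive to initialization than running K-means once with the full k.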

The proposed approach, Clustering preceded by Graph Mining based Maximal Frequent Termsets Preservation (CGFTP), is shown to produce better-quality clusters than some classical document clustering algorithms.

It has also been shown that the produced clusters are easily interpretable.

The quality of clusters has been measured in terms of their F-measure.
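The abstract does not state which F-measure variant is used. The sketch below assumes the definition commonly applied in document clustering: for each true class, take the best F1 score over all clusters (precision and recall computed from the class/cluster overlap), then average these best scores weighted by class size. The function name and the sample labels are hypothetical.

```python
from collections import Counter


def clustering_f_measure(labels, clusters):
    """Class-size-weighted best-match F-measure of a clustering.

    labels[i] is the true class of item i, clusters[i] its assigned cluster.
    This definition is an assumption; the abstract does not specify one.
    """
    n = len(labels)
    class_sizes = Counter(labels)
    cluster_sizes = Counter(clusters)
    overlap = Counter(zip(labels, clusters))
    total = 0.0
    for cls, n_i in class_sizes.items():
        best = 0.0
        for clu, n_j in cluster_sizes.items():
            n_ij = overlap[(cls, clu)]
            if n_ij == 0:
                continue
            precision = n_ij / n_j
            recall = n_ij / n_i
            best = max(best, 2 * precision * recall / (precision + recall))
        total += (n_i / n) * best
    return total


# Hypothetical example: six items, two true classes, two clusters.
labels = ["a", "a", "a", "b", "b", "b"]
clusters = [0, 0, 1, 1, 1, 1]
score = clustering_f_measure(labels, clusters)
perfect = clustering_f_measure(labels, [0, 0, 0, 1, 1, 1])
```

A score of 1.0 means every class is reproduced exactly by some cluster; misplaced items lower the precision or recall of the best-matching cluster and therefore the overall score.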

American Psychological Association (APA)

Shah, Sayyid Abd al-Baqi & Amjad, Muhammad. 2019. Preceding document clustering by graph mining based maximal frequent termsets preservation. The International Arab Journal of Information Technology, Vol. 16, no. 3, pp. 364-370.
https://search.emarefa.net/detail/BIM-894778

Modern Language Association (MLA)

Shah, Sayyid Abd al-Baqi & Amjad, Muhammad. Preceding document clustering by graph mining based maximal frequent termsets preservation. The International Arab Journal of Information Technology Vol. 16, no. 3 (May 2019), pp. 364-370.
https://search.emarefa.net/detail/BIM-894778

American Medical Association (AMA)

Shah, Sayyid Abd al-Baqi & Amjad, Muhammad. Preceding document clustering by graph mining based maximal frequent termsets preservation. The International Arab Journal of Information Technology. 2019. Vol. 16, no. 3, pp. 364-370.
https://search.emarefa.net/detail/BIM-894778

Data Type

Journal Articles

Language

English

Notes

Includes bibliographical references: p. 369-370

Record ID

BIM-894778