Improved VSM based candidate retrieval model for detecting external textual plagiarism

Other Title(s)

نموذج استرجاع محسن مستند على VSM لاكتشاف الاستلال النصي الخارجي

Joint Authors

Muhammad, Muhannad Taha
Ibrahim, Abd Allah Adil
Kazim, Nisrin Jawad

Source

Iraqi Journal of Science

Issue

Vol. 60, Issue 10 (31 Oct. 2019), pp.2257-2268, 12 p.

Publisher

University of Baghdad College of Science

Publication Date

2019-10-31

Country of Publication

Iraq

No. of Pages

12

Main Subjects

Information Technology and Computer Science

Abstract EN

Plagiarism has grown rapidly alongside the explosive growth of the Internet, where a massive volume of easily accessible information makes plagiarism, the act of taking someone else's work (ideas or even words) and presenting it as one's own, easy to commit.

To ensure originality, plagiarism detection has become essential in many fields, so that those who intend to plagiarize must instead invest considerable effort in producing work based on their own research.

This paper proposes a model for the candidate retrieval phase in order to improve the detection of textual plagiarism.

The proposed candidate retrieval model adopts the vector space model (VSM) as its retrieval model. It represents documents as vectors of average term weights and uses these vectors as retrieval queries, instead of representing documents as vectors of term weights.
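The abstract does not say how the average term weights are computed, so the following is only a minimal sketch under stated assumptions: the suspicious document is split into chunks, each chunk is weighted with a smoothed TF-IDF scheme, the per-term average across chunks forms the query vector, and source documents are ranked by cosine similarity. The function names, the whitespace tokenizer, the IDF smoothing, and the `top_k` parameter are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: simple lowercase whitespace tokenization
    return text.lower().split()

def idf_table(corpus_tokens):
    # Smoothed inverse document frequency over the source collection
    n = len(corpus_tokens)
    df = Counter()
    for tokens in corpus_tokens:
        df.update(set(tokens))
    return {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}

def tfidf(tokens, idf):
    # Sparse TF-IDF vector; terms unseen in the source collection are dropped
    tf = Counter(tokens)
    return {t: c * idf[t] for t, c in tf.items() if t in idf}

def average_vector(vectors):
    # Per-term mean weight across the given vectors (assumed reading of
    # "vectors consisting of average term weights")
    total = Counter()
    for v in vectors:
        for t, w in v.items():
            total[t] += w
    return {t: w / len(vectors) for t, w in total.items()}

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_candidates(suspicious_chunks, sources, top_k=1):
    # sources: mapping of source-document name to its text
    source_tokens = {name: tokenize(text) for name, text in sources.items()}
    idf = idf_table(list(source_tokens.values()))
    source_vecs = {name: tfidf(toks, idf) for name, toks in source_tokens.items()}
    query = average_vector([tfidf(tokenize(c), idf) for c in suspicious_chunks])
    ranked = sorted(source_vecs,
                    key=lambda name: cosine(query, source_vecs[name]),
                    reverse=True)
    return ranked[:top_k]
```

Calling `retrieve_candidates` with the chunks of a suspicious document and a dictionary of source texts returns the names of the highest-ranked sources, which would then be passed to the detailed comparison phase.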

Detailed comparison forms the second phase, in which fuzzy semantic-based string similarity is applied.

Experiments were conducted using PAN-PC-10 as the dataset for evaluating the proposed system.

Because the problem statement in this paper is restricted to detecting extrinsic plagiarism in English documents, the experiments were performed only on the portion of the dataset dedicated to extrinsic detection and only on English-language documents.

To evaluate the performance of the proposed candidate retrieval model, Precision, Recall, and F-measure were used as evaluation metrics.
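For a candidate retrieval step, these three metrics are commonly computed at the document level by comparing the set of retrieved source documents with the set of true sources. The sketch below assumes that set-based formulation and the balanced F-measure (harmonic mean); the function name and the document identifiers in the usage note are hypothetical, and the paper may average the scores differently.

```python
def retrieval_scores(retrieved, relevant):
    """Document-level precision, recall and balanced F-measure."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # true sources that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure
```

For instance, retrieving four candidates of which two are among three true sources, as in `retrieval_scores(["s1", "s2", "s3", "s4"], ["s1", "s2", "s5"])`, gives a precision of 2/4 and a recall of 2/3.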

The overall performance of the proposed system was assessed using the five standard PAN measures: Precision, Recall, F-measure, Granularity, and Plagdet.
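In the PAN evaluation framework, the overall score (plagdet) combines the other measures: it divides the balanced F-measure by the base-2 logarithm of one plus the granularity, so that reporting one plagiarism case as several fragments is penalized. The sketch below illustrates only that combination step. It takes precision and recall as given numbers (PAN computes them at the character level), and the function names are illustrative.

```python
import math

def granularity(detections_per_detected_case):
    # Average number of detections reported for each detected case (>= 1)
    counts = list(detections_per_detected_case)
    return sum(counts) / len(counts) if counts else 1.0

def plagdet(precision, recall, gran):
    # plagdet = F-measure / log2(1 + granularity)
    if precision + recall == 0:
        return 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return f_measure / math.log2(1 + gran)
```

With a granularity of 1 (each case detected exactly once), the denominator is `log2(2) = 1`, so plagdet equals the F-measure. Any fragmentation of detections lowers the score.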

The experimental results show that the proposed candidate retrieval model has a positive impact on the overall performance of the system and that the system outperforms other state-of-the-art methods.

The results also show that the proposed model detected about 80% of the plagiarism cases and that about 90% of its detections were correct.

The proposed model can detect literal plagiarism as well as cases that contain paraphrasing.

The performance comparison shows that the proposed system is comparable to, or outperforms, the baseline systems in terms of the five evaluation metrics.

American Psychological Association (APA)

Muhammad, Muhannad Taha & Kazim, Nisrin Jawad & Ibrahim, Abd Allah Adil. 2019. Improved VSM based candidate retrieval model for detecting external textual plagiarism. Iraqi Journal of Science, Vol. 60, no. 10, pp.2257-2268.
https://search.emarefa.net/detail/BIM-969521

Modern Language Association (MLA)

Muhammad, Muhannad Taha…[et al.]. Improved VSM based candidate retrieval model for detecting external textual plagiarism. Iraqi Journal of Science Vol. 60, no. 10 (2019), pp.2257-2268.
https://search.emarefa.net/detail/BIM-969521

American Medical Association (AMA)

Muhammad, Muhannad Taha & Kazim, Nisrin Jawad & Ibrahim, Abd Allah Adil. Improved VSM based candidate retrieval model for detecting external textual plagiarism. Iraqi Journal of Science. 2019. Vol. 60, no. 10, pp.2257-2268.
https://search.emarefa.net/detail/BIM-969521

Data Type

Journal Articles

Language

English

Notes

Includes bibliographical references: p. 2268

Record ID

BIM-969521