WASTK: A Weighted Abstract Syntax Tree Kernel Method for Source Code Plagiarism Detection

Joint Authors

Xu, Yanyan
Fu, Deqiang
Yu, Haoran
Yang, Boyang

Source

Scientific Programming

Issue

Vol. 2017, Issue 2017 (31 Dec. 2017), pp.1-8, 8 p.

Publisher

Hindawi Publishing Corporation

Publication Date

2017-02-13

Country of Publication

Egypt

No. of Pages

8

Main Subjects

Mathematics

Abstract EN

In this paper, we introduce a source code plagiarism detection method, named WASTK (Weighted Abstract Syntax Tree Kernel), for computer science education.

Unlike other plagiarism detection methods, WASTK takes into account aspects other than the similarity between programs.

WASTK first converts the source code of a program into an abstract syntax tree and then obtains the similarity by calculating the tree kernel of two abstract syntax trees.
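
As an illustration only, the sketch below shows one simple way to compare two programs through their abstract syntax trees. It is not the kernel defined in the paper: it assumes Python source parsed with the standard `ast` module, and it uses a basic subtree-counting kernel normalised to the range 0 to 1. The function names `subtree_signatures`, `subtree_kernel`, and `similarity` are hypothetical.

```python
import ast
import math
from collections import Counter


def subtree_signatures(source: str) -> Counter:
    """Count every AST subtree, keyed by a string built from node types only.

    Identifier names and literal values are ignored, so renaming variables
    does not change the signatures (an assumption of this sketch).
    """
    counts = Counter()

    def visit(node: ast.AST) -> str:
        children = [visit(child) for child in ast.iter_child_nodes(node)]
        signature = type(node).__name__ + "(" + ",".join(children) + ")"
        counts[signature] += 1
        return signature

    visit(ast.parse(source))
    return counts


def subtree_kernel(a: Counter, b: Counter) -> int:
    """Number of matching subtree pairs between the two trees."""
    return sum(count * b[sig] for sig, count in a.items())


def similarity(src_a: str, src_b: str) -> float:
    """Kernel value normalised by the self-kernels of both trees."""
    a, b = subtree_signatures(src_a), subtree_signatures(src_b)
    return subtree_kernel(a, b) / math.sqrt(subtree_kernel(a, a) * subtree_kernel(b, b))
```

Under these assumptions, two programs that differ only in identifier names have identical signature counts and therefore the maximum similarity, while structurally different programs score lower.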

To avoid misjudgment caused by trivial code snippets or frameworks given by instructors, an idea similar to TF-IDF (Term Frequency-Inverse Document Frequency) in the field of information retrieval is applied.

Each node in an abstract syntax tree is assigned a weight by TF-IDF.
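
The weighting idea can be sketched as follows. This is a simplified illustration rather than the weighting scheme of the paper: it assumes each submission has already been reduced to a `Counter` of structural features (for example, subtree signatures), and it weights each feature by its inverse document frequency across all submissions. The names `idf_weights`, `weighted_kernel`, and `weighted_similarity` are hypothetical.

```python
import math
from collections import Counter


def idf_weights(corpus: list) -> dict:
    """Inverse document frequency of each feature over all submissions.

    A feature present in every submission (such as an instructor-provided
    framework) gets weight log(n / n) = 0.
    """
    n = len(corpus)
    document_frequency = Counter()
    for counts in corpus:
        document_frequency.update(counts.keys())
    return {sig: math.log(n / df) for sig, df in document_frequency.items()}


def weighted_kernel(a: Counter, b: Counter, weights: dict) -> float:
    """Sum of weighted matching feature pairs; unknown features weigh 0."""
    return sum(weights.get(sig, 0.0) * count * b[sig] for sig, count in a.items())


def weighted_similarity(a: Counter, b: Counter, weights: dict) -> float:
    """Weighted kernel normalised by both self-kernels (0.0 if undefined)."""
    norm = math.sqrt(weighted_kernel(a, a, weights) * weighted_kernel(b, b, weights))
    return weighted_kernel(a, b, weights) / norm if norm > 0 else 0.0
```

With this kind of weighting, code that every student shares contributes nothing to the score, so two submissions look similar only if they share features that are rare in the rest of the class.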

WASTK is evaluated on different datasets and, as a result, performs much better than other popular methods like Sim and JPlag.

American Psychological Association (APA)

Fu, Deqiang & Xu, Yanyan & Yu, Haoran & Yang, Boyang. 2017. WASTK: A Weighted Abstract Syntax Tree Kernel Method for Source Code Plagiarism Detection. Scientific Programming, Vol. 2017, no. 2017, pp.1-8.
https://search.emarefa.net/detail/BIM-1203470

Modern Language Association (MLA)

Fu, Deqiang…[et al.]. WASTK: A Weighted Abstract Syntax Tree Kernel Method for Source Code Plagiarism Detection. Scientific Programming No. 2017 (2017), pp.1-8.
https://search.emarefa.net/detail/BIM-1203470

American Medical Association (AMA)

Fu, Deqiang & Xu, Yanyan & Yu, Haoran & Yang, Boyang. WASTK: A Weighted Abstract Syntax Tree Kernel Method for Source Code Plagiarism Detection. Scientific Programming. 2017. Vol. 2017, no. 2017, pp.1-8.
https://search.emarefa.net/detail/BIM-1203470

Data Type

Journal Articles

Language

English

Notes

Includes bibliographical references

Record ID

BIM-1203470