An improved algorithm for data preprocessing in mining crime data set

Author

al-Janabi, Kazim Burayhi Sawadi

Source

Journal of Kufa for Mathematics and Computer

Issue

Vol. 1, Issue 4 (31 Dec. 2011), pp.81-87, 7 p.

Publisher

University of Kufa Faculty of Mathematics and Computers Science

Publication Date

2011-12-31

Country of Publication

Iraq

No. of Pages

7

Main Subjects

Mathematics

Abstract EN

This paper presents an improved data preprocessing algorithm that handles missing values and smooths outliers in real-world data sets.

Previous work in this field relies mainly on replacing missing values with the overall average, the class average, the most common value, or similar techniques, while outliers were generally removed from the data set.
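To make the baseline techniques named above concrete, the following is a minimal Python sketch of mean, class-mean, and most-common-value replacement. The function names and the use of None to mark a missing value are assumptions made for illustration; they are not taken from the paper.

```python
from collections import Counter
from statistics import mean


def impute_mean(values):
    """Replace each None with the mean of all observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def impute_class_mean(values, labels):
    """Replace each None with the mean of observed values in the same class.

    Falls back to the overall mean when a class has no observed values.
    """
    observed = [v for v in values if v is not None]
    overall = mean(observed)
    by_class = {}
    for v, c in zip(values, labels):
        if v is not None:
            by_class.setdefault(c, []).append(v)
    class_fill = {c: mean(vs) for c, vs in by_class.items()}
    return [
        class_fill.get(c, overall) if v is None else v
        for v, c in zip(values, labels)
    ]


def impute_mode(values):
    """Replace each None with the most common observed value."""
    observed = [v for v in values if v is not None]
    fill = Counter(observed).most_common(1)[0][0]
    return [fill if v is None else v for v in values]
```

All three fill a gap from a single summary statistic, so they ignore how similar the incomplete record is to the records that supply the replacement value.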

Crime and criminal data sets have their own special characteristics and benchmarks: missing values and outliers carry different meanings than in other fields, so they need to be processed differently.

The algorithm relies mainly on clustering techniques to group objects according to their similarities and dissimilarities; outliers are then smoothed, and missing values are processed, according to the clusters they belong to.
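The abstract does not give the exact smoothing or replacement rules, so the following Python sketch shows only one plausible reading of the cluster-based idea. It assumes cluster labels have already been produced by a separate clustering step, fills a missing value with its cluster median, and clips values lying more than k median absolute deviations from the cluster median. The function name, the parameter k, and the choice of median and MAD are all assumptions for illustration, not the paper's method.

```python
from statistics import median


def preprocess_by_cluster(values, clusters, k=3.0):
    """Fill missing values and smooth outliers separately within each cluster.

    values   -- numeric attribute, with None marking a missing value
    clusters -- cluster id of each record (assumed to come from a prior
                clustering step)
    k        -- assumed cutoff, in median absolute deviations (MAD)
    """
    # Collect the observed values of each cluster.
    groups = {}
    for v, c in zip(values, clusters):
        if v is not None:
            groups.setdefault(c, []).append(v)

    # Per-cluster centre (median) and spread (MAD).
    stats = {}
    for c, vs in groups.items():
        centre = median(vs)
        mad = median(abs(v - centre) for v in vs)
        stats[c] = (centre, mad)

    result = []
    for v, c in zip(values, clusters):
        if c not in stats:
            # No observed value in this cluster: nothing to infer from.
            result.append(v)
            continue
        centre, mad = stats[c]
        if v is None:
            result.append(centre)  # fill from the record's own cluster
        elif mad == 0:
            result.append(v)  # no measurable spread: leave unchanged
        else:
            low, high = centre - k * mad, centre + k * mad
            result.append(min(max(v, low), high))  # clip outliers to the band
    return result
```

Unlike a global average, the replacement value here comes only from records judged similar to the incomplete one, and an extreme value is pulled toward its cluster instead of being deleted.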

WEKA is used as a tool to find different clusters of the criminals.

American Psychological Association (APA)

al-Janabi, Kazim Burayhi Sawadi. 2011. An improved algorithm for data preprocessing in mining crime data set. Journal of Kufa for Mathematics and Computer, Vol. 1, no. 4, pp.81-87.
https://search.emarefa.net/detail/BIM-307860

Modern Language Association (MLA)

al-Janabi, Kazim Burayhi Sawadi. An improved algorithm for data preprocessing in mining crime data set. Journal of Kufa for Mathematics and Computer Vol. 1, no. 4 (Dec. 2011), pp.81-87.
https://search.emarefa.net/detail/BIM-307860

American Medical Association (AMA)

al-Janabi, Kazim Burayhi Sawadi. An improved algorithm for data preprocessing in mining crime data set. Journal of Kufa for Mathematics and Computer. 2011. Vol. 1, no. 4, pp.81-87.
https://search.emarefa.net/detail/BIM-307860

Data Type

Journal Articles

Language

English

Notes

Includes bibliographical references : p. 86-87

Record ID

BIM-307860