MapReduce Based Personalized Locality Sensitive Hashing for Similarity Joins on Large Scale Data

Joint Authors

Wang, Jingjing
Lin, Chen

Source

Computational Intelligence and Neuroscience

Issue

Vol. 2015, Issue 2015 (31 Dec. 2015), pp.1-13, 13 p.

Publisher

Hindawi Publishing Corporation

Publication Date

2015-04-30

Country of Publication

Egypt

No. of Pages

13

Main Subjects

Biology

Abstract EN

Locality Sensitive Hashing (LSH) has been proposed as an efficient technique for similarity joins for high dimensional data.

The efficiency and approximation rate of LSH depend on the number of generated false positive instances and false negative instances.

In many domains, reducing the number of false positives is crucial.

Furthermore, in some application scenarios, balancing false positives and false negatives is favored.

To address these problems, in this paper we propose Personalized Locality Sensitive Hashing (PLSH), where a new banding scheme is embedded to tailor the number of false positives, false negatives, and the sum of both.

PLSH is implemented in parallel using MapReduce framework to deal with similarity joins on large scale data.

Experimental studies on real and simulated data verify the efficiency and effectiveness of our proposed PLSH technique, compared with state-of-the-art methods.
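
As background to the abstract's mention of a banding scheme, the sketch below shows the standard MinHash LSH banding formula, in which a signature of b × r hash values is split into b bands of r rows and a pair becomes a candidate if it agrees on every row of at least one band. This is the textbook scheme, not the PLSH scheme, whose details this record does not give; the function name and the parameter values are illustrative assumptions.

```python
def candidate_probability(s: float, r: int, b: int) -> float:
    """Probability that a pair with Jaccard similarity s becomes a candidate
    under standard LSH banding with b bands of r rows each.

    Assumption: textbook MinHash banding, not the paper's PLSH scheme.
    """
    if not 0.0 <= s <= 1.0:
        raise ValueError("similarity s must lie in [0, 1]")
    if r < 1 or b < 1:
        raise ValueError("r and b must be positive integers")
    # s ** r: the pair agrees on all r rows of a single band.
    # (1 - s ** r) ** b: the pair disagrees somewhere in every one of b bands.
    return 1.0 - (1.0 - s ** r) ** b


if __name__ == "__main__":
    # Illustrative parameters only: 20 bands of 5 rows (100 hash values).
    for s in (0.2, 0.5, 0.8):
        print(f"s={s:.1f}  P(candidate)={candidate_probability(s, r=5, b=20):.4f}")
```

Under this formula, low-similarity pairs that still become candidates are the false positives, and high-similarity pairs that never share a band are the false negatives; changing r and b shifts the balance between the two.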

American Psychological Association (APA)

Wang, Jingjing & Lin, Chen. 2015. MapReduce Based Personalized Locality Sensitive Hashing for Similarity Joins on Large Scale Data. Computational Intelligence and Neuroscience, Vol. 2015, no. 2015, pp.1-13.
https://search.emarefa.net/detail/BIM-1057674

Modern Language Association (MLA)

Wang, Jingjing & Lin, Chen. MapReduce Based Personalized Locality Sensitive Hashing for Similarity Joins on Large Scale Data. Computational Intelligence and Neuroscience No. 2015 (2015), pp.1-13.
https://search.emarefa.net/detail/BIM-1057674

American Medical Association (AMA)

Wang, Jingjing & Lin, Chen. MapReduce Based Personalized Locality Sensitive Hashing for Similarity Joins on Large Scale Data. Computational Intelligence and Neuroscience. 2015. Vol. 2015, no. 2015, pp.1-13.
https://search.emarefa.net/detail/BIM-1057674

Data Type

Journal Articles

Language

English

Notes

Includes bibliographical references

Record ID

BIM-1057674