Efficient locality-sensitive hashing over high-dimensional data streams

Chengcheng Yang, Dong Deng, Shuo Shang, Ling Shao

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Approximate Nearest Neighbor (ANN) search in high-dimensional space is a fundamental task in many applications. Locality-Sensitive Hashing (LSH) is a well-known methodology to solve the ANN problem with theoretical guarantees and empirical performance. We observe that existing LSH-based approaches target at the problem of designing search optimized indexes, which require a number of separate indexes and high index maintenance overhead, and hence impractical for high-dimensional streaming data processing. In this paper, we present PDA-LSH, a novel and practical disk-based LSH index that can offer efficient support for both updates and searches. Experiments on real-world datasets show that our proposal outperforms the state-of-the-art schemes by up to 10× on update performance and up to 2× on search performance.
Original languageEnglish (US)
Title of host publication2020 IEEE 36th International Conference on Data Engineering (ICDE)
PublisherIEEE
Pages1986-1989
Number of pages4
ISBN (Print)9781728129037
DOIs
StatePublished - May 27 2020

Fingerprint Dive into the research topics of 'Efficient locality-sensitive hashing over high-dimensional data streams'. Together they form a unique fingerprint.

Cite this