Accurately Estimating User Cardinalities and Detecting Super Spreaders over Time

Peng Jia, Pinghui Wang, Yuchao Zhang, Xiangliang Zhang, Jing Tao, Jianwei Ding, Xiaohong Guan, Don Towsley

Research output: Contribution to journalArticlepeer-review

Abstract

Online monitoring user cardinalities in graph streams is fundamental for many applications such as anomaly detection. These graph streams may contain edge duplicates and have a large number of user-item pairs, which makes it infeasible to exactly compute user cardinalities due to limited computational and memory resources. Existing methods are designed to approximately estimate user cardinalities, but their accuracy highly depends on complex parameters and they cannot provide anytime-available estimation. To address these problems, we develop novel bit/register sharing algorithms, which use a bit/register array to build a compact sketch of all users' connected items. Our algorithms exploit the dynamic properties of the bit/register arrays (e.g., the fraction of zero bits in the bit array) to significantly improve the estimation accuracy, and have low time complexity O(1) to update the estimations for a new user-item pair. In addition, our algorithms are simple and easy to use, without requirements to tune any parameter. Furthermore, we extend our methods to detect super spreaders with large cardinalities in real-time. We evaluate the performance of our methods on real-world datasets. The experimental results demonstrate that our methods are several times more accurate and faster than state-of-the-art methods using the same amount of memory.
Original languageEnglish (US)
Pages (from-to)1-1
Number of pages1
JournalIEEE Transactions on Knowledge and Data Engineering
DOIs
StatePublished - 2020

Fingerprint Dive into the research topics of 'Accurately Estimating User Cardinalities and Detecting Super Spreaders over Time'. Together they form a unique fingerprint.

Cite this