Practical characterization of large networks using neighborhood information

Pinghui Wang, Junzhou Zhao, Bruno Ribeiro, John C. S. Lui, Don Towsley, Xiaohong Guan

Research output: Contribution to journalArticlepeer-review

1 Scopus citations

Abstract

Characterizing large complex networks such as online social networks through node querying is a challenging task. Network service providers often impose severe constraints on the query rate, hence limiting the sample size to a small fraction of the total network of interest. Various ad hoc subgraph sampling methods have been proposed, but many of them give biased estimates and no theoretical basis on the accuracy. In this work, we focus on developing sampling methods for large networks where querying a node also reveals partial structural information about its neighbors. Our methods are optimized for NoSQL graph databases (if the database can be accessed directly), or utilize Web APIs available on most major large networks for graph sampling. We show that our sampling method has provable convergence guarantees on being an unbiased estimator, and it is more accurate than state-of-the-art methods. We also explore methods to uncover shortest paths between a subset of nodes and detect high degree nodes by sampling only a small fraction of the network of interest. Our results demonstrate that utilizing neighborhood information yields methods that are two orders of magnitude faster than state-of-the-art methods.
Original languageEnglish (US)
Pages (from-to)701-728
Number of pages28
JournalKnowledge and Information Systems
Volume58
Issue number3
DOIs
StatePublished - Feb 14 2018

Fingerprint Dive into the research topics of 'Practical characterization of large networks using neighborhood information'. Together they form a unique fingerprint.

Cite this