Social activity data sets are increasing in number and volume. Finding community structure in such data is valuable in many applications. For example, understand ing the community structure of social networks may reduce the spread of epidemics or boost advertising revenue; discovering partitions in tra c networks can help to optimize routing and to reduce congestion; finding a group of users with common interests can allow a system to recommend useful items. Among many aspects, qual ity of inference and e ciency in finding community structures in such data sets are of paramount concern. In this thesis, we propose several approaches to improve com munity detection in these aspects.
The first approach utilizes the concept of Kcores to reduce the size of the problem. The Kcore of a graph is the largest subgraph within which each node has at least K connections. We propose a framework that accelerates community detection. It first applies a traditional algorithm that is relatively slow to the Kcore, and then uses a fast heuristic to infer community labels for the remaining nodes.
The second approach is to scale the algorithm to multiprocessor systems. We de vise a scalable community detection algorithm for large networks based on stochastic block models. It is an alternating iterative algorithm using a maximum likelihood ap proach. Compared with traditional inference algorithms for stochastic block models, our algorithm can scale to large networks and run on multiprocessor systems. The
time complexity is linear in the number of edges of the input network.
The third approach is to improve the quality. We propose a framework for non negative matrix factorization that allows the imposition of linear or approximately linear constraints on each factor. An example of the applications is to find community
structures in bipartite networks, which is useful in recommender systems.
Our algorithms are compared with the results in recent papers and their quality
and e ciency are verified by experiments.
Date of Award  May 19 2015 

Original language  English 

Awarding Institution   Computer, Electrical and Mathematical Science and Engineering


Supervisor  David Keyes (Supervisor) 

 Community Detection
 Scalability
 Parallel Computing
 Constrained NMF
 Social Networks