CANCELED. We apologize for any inconvenience.
Many real world problems (biological, social, web) can be effectively modeled as networks or graphs where nodes represent entities of interest and edges mimic the interactions or relationships among them. The study of such complex relationship networks, recently referred to as "network science", can provide insight into their structure, properties and emergent behavior. Of particular interest here are rigorous methods for uncovering and understanding important network structures and motifs (communities) at multiple topological and temporal scales. Achieving this objective is challenging due to the presence of noise (false or missing interactions), topological (scale-free)) properties and scalability. Given the importance of the graph clustering problem, a number of solutions ranging from hierarchical methods to spectral methods have been designed and developed. To this mix of graph clustering algorithms one can also include a recent entrant Markov Clustering (MCL) a graph clustering algorithm based on stochastic flow simulation, and the focus of this work.
MCL has several advantages over competing techniques including an inherent simplicity, empirically confirmed robustness to noise effects and relatively non-parametric nature. However, in spite of its popularity in the bioinformatics community MCL has drawn limited attention from the broader social network, web and data mining communities primarily due to its limited scalability. In this talk we present two orthogonal strategies to address this problem - the first relies on developing a new multi-level regularized variant of MCL called MLR-MCL. The second approach focuses on data reduction via graph sparsification, in a manner analogous to the use of sampling, for improving the scalability of such algorithms. Empirical results demonstrate both qualitative as well as quantitative improvements over existing approaches on a wide range of domains.
Toward Scalable Stochastic Flow Algorithms [Canceled]