Parallel Double Greedy Submodular Maximization

Many machine learning problems can be reduced to the maximization of submodular functions. Although well understood in the serial setting, the parallel maximization of submodular functions remains an open area of research, with recent results [1] addressing only monotone functions. The optimal algorithm for maximizing the more general class of non-monotone submodular functions was introduced by Buchbinder et al. [2] and follows a strongly serial double-greedy logic. In this work, we propose two methods to parallelize the double-greedy algorithm. The first, coordination-free approach emphasizes speed at the cost of a weaker approximation guarantee. The second, concurrency-control approach guarantees a tight 1/2-approximation, at the quantifiable cost of additional coordination and reduced parallelism. We thereby explore the tradeoff between guaranteed performance and objective optimality. We implement and evaluate both algorithms on multi-core hardware and billion-edge graphs, demonstrating the scalability and tradeoffs of each approach.
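The serial double-greedy algorithm of Buchbinder et al. that the abstract parallelizes can be sketched as follows. This is a minimal illustrative Python sketch, not the paper's implementation; the graph-cut objective, function names, and the fixed seed are assumptions for the example. The algorithm sweeps the ground set once, maintaining a lower set A and an upper set B, and for each element randomly commits to adding it to A or removing it from B in proportion to the (clipped) marginal gains:

```python
import random

def double_greedy(f, ground_set, seed=0):
    """Randomized double greedy (Buchbinder et al.): a 1/2-approximation
    in expectation for non-negative, non-monotone submodular f."""
    rng = random.Random(seed)
    A, B = set(), set(ground_set)       # invariant: A is a subset of B
    for e in ground_set:
        a = f(A | {e}) - f(A)           # marginal gain of adding e to A
        b = f(B - {e}) - f(B)           # marginal gain of removing e from B
        ap, bp = max(a, 0.0), max(b, 0.0)
        # Keep e with probability proportional to ap (keep on a 0/0 tie).
        p = 1.0 if ap + bp == 0 else ap / (ap + bp)
        if rng.random() < p:
            A.add(e)
        else:
            B.discard(e)
    return A                            # A == B once every element is decided

# Illustrative non-monotone submodular objective: the cut function of a 4-cycle.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
def cut(S):
    return sum((u in S) != (v in S) for u, v in edges)

result = double_greedy(cut, [0, 1, 2, 3], seed=0)
# with seed 0 this run returns {1, 3}, an optimal cut of value 4
```

In the parallel setting, the coordination-free variant lets threads run this loop concurrently against potentially stale copies of A and B, while the concurrency-control variant detects elements whose add/remove decision is uncertain under staleness and serializes only those, recovering the tight 1/2 guarantee.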

[1] András Frank et al. Submodular functions in graph theory, 1993, Discret. Math.

[2] Seunghak Lee et al. More Effective Distributed ML via a Stale Synchronous Parallel Parameter Server, 2013, NIPS.

[3] Rishabh K. Iyer et al. Fast Multi-stage Submodular Maximization, 2014, ICML.

[4] Hui Lin et al. A Class of Submodular Functions for Document Summarization, 2011, ACL.

[5] Alexander Schrijver. Combinatorial optimization. Polyhedra and efficiency, 2003.

[6] Marco Rosa et al. Layered label propagation: a multiresolution coordinate-free ordering for compressing social networks, 2010, WWW.

[7] Zoubin Ghahramani et al. Scaling the Indian Buffet Process via Submodular Maximization, 2013, ICML.

[8] Andreas Krause et al. A Utility-Theoretic Approach to Privacy in Online Services, 2010, J. Artif. Intell. Res.

[9] Takeo Kanade et al. Distributed cosegmentation via submodular optimization on anisotropic diffusion, 2011, International Conference on Computer Vision.

[10] Jan Vondrák et al. Fast algorithms for maximizing submodular functions, 2014, SODA.

[11] Alexander J. Smola et al. Scalable inference in latent variable models, 2012, WSDM.

[12] Sergei Vassilvitskii et al. Fast Greedy Algorithms in MapReduce and Streaming, 2015, ACM Trans. Parallel Comput.

[13] Michael I. Jordan et al. Optimistic Concurrency Control for Distributed Unsupervised Learning, 2013, NIPS.

[14] Patrick Valduriez et al. Principles of Distributed Database Systems, 1990.

[15] Sebastiano Vigna et al. UbiCrawler: a scalable fully distributed Web crawler, 2004, Softw. Pract. Exp.

[16] J. T. Robinson et al. On optimistic methods for concurrency control, 1979, TODS.

[17] M. L. Fisher et al. An analysis of approximations for maximizing submodular set functions—I, 1978, Math. Program.

[18] Éva Tardos et al. Maximizing the Spread of Influence through a Social Network, 2015, Theory Comput.

[19] Stephen J. Wright et al. Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent, 2011, NIPS.

[20] Sebastiano Vigna et al. The webgraph framework I: compression techniques, 2004, WWW.

[21] L. Shapley. Cores of convex games, 1971.

[22] Andreas Krause et al. Distributed Submodular Maximization: Identifying Representative Elements in Massive Data, 2013, NIPS.

[23] Joseph Naor et al. A Tight Linear Time (1/2)-Approximation for Unconstrained Submodular Maximization, 2012, IEEE Symposium on Foundations of Computer Science (FOCS).

[24] Ben Taskar et al. Near-Optimal MAP Inference for Determinantal Point Processes, 2012, NIPS.

[25] Aaron Q. Li et al. Parameter Server for Distributed Machine Learning, 2013.

[26] Andreas Krause et al. Submodularity and its applications in optimized information gathering, 2011, TIST.