Triangle Fixing Algorithms for the Metric Nearness Problem

Various problems in machine learning, databases, and statistics involve pairwise distances among a set of objects. It is often desirable for these distances to satisfy the properties of a metric, especially the triangle inequality. Applications where metric data is useful include clustering, classification, metric-based indexing, and approximation algorithms for various graph problems. This paper presents the Metric Nearness Problem: given a dissimilarity matrix, find the "nearest" matrix of distances that satisfies the triangle inequalities. For ℓ_p nearness measures, this paper develops efficient triangle fixing algorithms that compute globally optimal solutions by exploiting the inherent structure of the problem. Empirically, the algorithms have time and storage costs that are linear in the number of triangle constraints. The methods can also be easily parallelized for additional speed.
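To make the problem statement concrete, the following is a minimal sketch of triangle fixing for the ℓ_2 case only. It is not the paper's algorithm as published: it applies generic Dykstra-style cyclic projections onto the half-spaces m_ab <= m_ac + m_cb, keeping one correction term per constraint. The function names `max_triangle_violation` and `triangle_fix_l2`, the parameters `sweeps` and `tol`, and the stopping rule (stop when a full sweep barely changes the matrix) are all assumptions made for illustration. The input is assumed to be symmetric with a zero diagonal.

```python
import itertools
import numpy as np


def max_triangle_violation(M):
    """Largest value of M[i,j] - M[i,k] - M[k,j] over distinct i, j, k (0.0 if none is positive)."""
    n = M.shape[0]
    worst = 0.0
    for i, j, k in itertools.permutations(range(n), 3):
        worst = max(worst, M[i, j] - M[i, k] - M[k, j])
    return worst


def triangle_fix_l2(D, sweeps=200, tol=1e-9):
    """Sketch: approximate the l2-nearest matrix to D that obeys all triangle inequalities.

    Assumes D is symmetric with a zero diagonal. Each constraint
    M[a,b] <= M[a,c] + M[c,b] is a half-space with normal (1, -1, -1)
    on the three edges of the triangle, so a projection moves each edge
    by violation / 3.
    """
    M = np.array(D, dtype=float)
    n = M.shape[0]
    corrections = {}  # one non-negative correction term per constraint

    for _ in range(sweeps):
        previous = M.copy()
        for i, j, k in itertools.combinations(range(n), 3):
            # Each edge of the triangle takes a turn as the "long" side.
            for a, b, c in ((i, j, k), (i, k, j), (j, k, i)):
                z = corrections.get((a, b, c), 0.0)
                # Dykstra step 1: undo the correction applied last sweep.
                ab = M[a, b] + z
                ac = M[a, c] - z
                cb = M[c, b] - z
                # Dykstra step 2: project onto the half-space ab <= ac + cb.
                z = max(0.0, (ab - ac - cb) / 3.0)
                M[a, b] = M[b, a] = ab - z
                M[a, c] = M[c, a] = ac + z
                M[c, b] = M[b, c] = cb + z
                corrections[(a, b, c)] = z
        # Heuristic stopping rule (an assumption): the sweep changed almost nothing.
        if np.max(np.abs(M - previous)) <= tol:
            break
    return M
```

For example, the three-point input with distances d(0,1) = 1, d(1,2) = 1, d(0,2) = 5 violates one triangle inequality by 3; the sketch shortens the long side by 1 and lengthens the other two by 1 each, giving distances 2, 2, and 4. The loop visits every triangle in every sweep, so the work per sweep and the stored correction terms both grow with the number of triangle constraints.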
