On Optimal Data Compression in Multiterminal Statistical Inference

The multiterminal theory of statistical inference deals with the problem of estimating or testing the correlation of letters generated from two (or more) correlated information sources under the restriction of a certain transmission rate for each source. A typical example is two binary sources with joint probability <i>p</i>(<i>x</i>, <i>y</i>), where the correlation of <i>x</i> and <i>y</i> is to be tested or estimated. Given <i>n</i> i.i.d. observations <i>x</i><sup>n</sup> = <i>x</i><sub>1</sub> ... <i>x</i><sub>n</sub> and <i>y</i><sup>n</sup> = <i>y</i><sub>1</sub> ... <i>y</i><sub>n</sub>, only <i>k</i> = <i>rn</i> (0 < <i>r</i> < 1) bits from each source can be transmitted to a common destination. What is the optimal data compression for statistical inference? A simple idea is to send the first <i>k</i> letters of <i>x</i><sup>n</sup> and <i>y</i><sup>n</sup>. A simpler problem is the helper case, in which one searches for the optimal compression of <i>x</i><sup>n</sup> under the condition that the entire sequence <i>y</i><sup>n</sup> is transmitted. It is a long-standing problem whether there exists a better data compression scheme than this simple scheme of sending the first <i>k</i> letters. The present paper searches for the optimal data compression within the framework of linear-threshold encoding and shows that a better data compression scheme exists, depending on the value of the correlation. To this end, we evaluate the Fisher information attainable in the class of linear-threshold compression schemes. It is also proved that the simple scheme is optimal when <i>x</i> and <i>y</i> are independent or when their correlation is not too large.
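As a concrete numerical illustration (not the paper's construction), the sketch below assumes the symmetric binary model <i>P</i>(<i>x</i> = <i>y</i>) = (1 + θ)/2 with uniform marginals, for which the per-pair Fisher information about θ has the closed form 1/(1 − θ<sup>2</sup>). It verifies this value numerically and reports the total Fisher information retained by the simple scheme, which observes only the first <i>k</i> = <i>rn</i> complete pairs. The model, its parameterization, and all function names here are illustrative assumptions.

```python
import numpy as np

def pair_pmf(theta):
    """Joint pmf over (x, y) in {0, 1}^2 for the assumed symmetric model:
    P(x=y=0) = P(x=y=1) = (1+theta)/4 and P(x!=y) = (1-theta)/4 each."""
    agree, disagree = (1 + theta) / 4, (1 - theta) / 4
    return np.array([[agree, disagree], [disagree, agree]])

def fisher_info(theta, eps=1e-6):
    """Fisher information of one (x, y) pair about theta, via the identity
    I(theta) = sum_z (dp_z/dtheta)^2 / p_z, using a central difference."""
    p = pair_pmf(theta)
    dp = (pair_pmf(theta + eps) - pair_pmf(theta - eps)) / (2 * eps)
    return float(np.sum(dp ** 2 / p))

theta = 0.3
I = fisher_info(theta)
print(f"numerical I(theta) = {I:.6f}")                    # ~1.098901
print(f"closed form 1/(1 - theta^2) = {1 / (1 - theta**2):.6f}")

# Simple scheme: the decoder sees the first k = r*n letters of both x^n and
# y^n, i.e., k complete pairs, so its total Fisher information is r*n*I(theta).
n, r = 1000, 0.5
print(f"simple scheme (n={n}, r={r}): Fisher information = {r * n * I:.2f}")
```

The quantity <i>rn</i>·<i>I</i>(θ) serves as the baseline against which any rate-<i>r</i> encoding, including the linear-threshold schemes studied in the paper, can be compared under the same Fisher-information criterion.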