Scoring Conference Submissions by Alternate Maximization of Likelihood

We address the problem of combining the subjective, confidence-tagged opinions of experts who independently review a set of like items, such as submissions to a scientific conference. The conventional approach of confidence-weighted averaging is improved upon by augmenting a probabilistic error model with bias and trust parameters that characterize the subjectivity of the referees' quality and confidence judgments, respectively. The likelihood of the review data under this model is then optimized by alternate maximization (AM) with respect to item scores and referee parameters. Since conditionally optimal trust parameters cannot be calculated explicitly, we provide two iterative schemes for this purpose. In preliminary experiments the resulting generalized AM algorithm was found to be robust, efficient and effective. We are set to field-test it in the peer review process of the NIPS*98 conference.
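To make the alternating structure concrete, the following is a minimal sketch, not the paper's exact model: it assumes Gaussian review noise whose precision is the product of a referee trust parameter and the reported confidence, so that every conditional maximizer (scores, biases, trusts) happens to have a closed form. In the paper's full model the trust updates are not explicit and require the iterative schemes mentioned above. All names (alternate_maximization, the (referee, paper, score, confidence) tuple layout) are illustrative assumptions.

```python
import numpy as np

def alternate_maximization(reviews, n_papers, n_refs, n_iters=50):
    """Illustrative AM sketch.

    reviews: list of (ref_id, paper_id, score, confidence) tuples;
    every paper and referee is assumed to appear at least once.
    """
    reviews = np.asarray(reviews, dtype=float)
    ref = reviews[:, 0].astype(int)   # referee index of each review
    pap = reviews[:, 1].astype(int)   # paper index of each review
    r = reviews[:, 2]                 # raw review scores
    c = reviews[:, 3]                 # reported confidences (assumed > 0)

    s = np.zeros(n_papers)            # item (paper) quality scores
    b = np.zeros(n_refs)              # referee bias
    t = np.ones(n_refs)               # referee trust (noise precision scale)

    for _ in range(n_iters):
        # Score step: precision-weighted average of bias-corrected reviews.
        w = t[ref] * c
        s = (np.bincount(pap, w * (r - b[ref]), n_papers)
             / np.bincount(pap, w, n_papers))
        # Bias step: confidence-weighted mean residual per referee.
        b = (np.bincount(ref, c * (r - s[pap]), n_refs)
             / np.bincount(ref, c, n_refs))
        # Trust step: inverse of the confidence-weighted mean squared residual.
        resid2 = c * (r - s[pap] - b[ref]) ** 2
        t = (np.bincount(ref, np.ones_like(r), n_refs)
             / np.bincount(ref, resid2, n_refs))
    return s, b, t
```

Each step maximizes the same Gaussian likelihood over one block of parameters while holding the others fixed, so the likelihood is nondecreasing across iterations; this is the monotonicity property that makes AM attractive for the review-scoring problem.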