Unsupervised Learning with Non-Ignorable Missing Data

In this paper we explore unsupervised learning in the presence of non-ignorable missing data when the missing data mechanism is unknown. We discuss several classes of missing data mechanisms for categorical data and develop learning and inference methods for two specific models. Empirical results on synthetic data show that these algorithms can recover both the unknown selection model parameters and the underlying data model parameters to a high degree of accuracy. We also apply the algorithms to real data from the domain of collaborative filtering and report initial results.