A Simple and Efficient Method to Handle Sparse Preference Data Using Domination Graphs: An Application to YouTube

The phenomenal growth in the number of videos on YouTube gives users enormous potential to find content of interest to them. Unfortunately, as the repository grows, the task of discovering high-quality content becomes more daunting. To address this, YouTube occasionally asks users for feedback on videos. In one such event (the YouTube Comedy Slam), users were asked to rate which of two videos was funnier. This yielded sparse pairwise data, with each record indicating a participant's relative assessment of two videos. Given this data, several questions immediately arise: how do we make inferences for uncompared pairs, how do we overcome noisy and usually contradictory data, and how do we handle severely skewed, real-world sampling? To address these questions, we introduce the concept of a domination graph and demonstrate a simple and scalable method, based on the Adsorption algorithm, to efficiently propagate preferences through the graph. Before tackling the publicly available YouTube data, we extensively test our approach on synthetic data by attempting to recover a known underlying rank order of videos from sparse preference data created in a similar way.
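To make the idea of propagating preferences through a graph concrete, the following is a minimal sketch and not the method of the paper. It assumes that each comparison adds a weighted edge from the less-preferred video to the preferred one, and it uses a damped random walk in the style of PageRank as a stand-in for Adsorption. The function names `build_domination_graph` and `propagate_scores`, the edge direction, and the `damping` parameter are all assumptions made for illustration.

```python
from collections import defaultdict


def build_domination_graph(comparisons):
    """Build a weighted directed graph from (winner, loser) pairs.

    Assumption: an edge points from the dominated video to the video
    preferred over it, weighted by the number of such votes.
    """
    graph = defaultdict(lambda: defaultdict(int))
    nodes = set()
    for winner, loser in comparisons:
        graph[loser][winner] += 1
        nodes.update((winner, loser))
    return graph, sorted(nodes)


def propagate_scores(graph, nodes, damping=0.85, iterations=100):
    """Damped random walk (PageRank-style), used here in place of Adsorption."""
    n = len(nodes)
    score = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        nxt = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            out = graph.get(v, {})
            total = sum(out.values())
            if total == 0:
                # A video that never lost has no outgoing edge:
                # spread its mass uniformly so total score is conserved.
                share = damping * score[v] / n
                for u in nodes:
                    nxt[u] += share
            else:
                # Pass mass to each preferred video in proportion to votes.
                for u, weight in out.items():
                    nxt[u] += damping * score[v] * weight / total
        score = nxt
    return score


# Synthetic example: only adjacent pairs of a known order a > b > c > d
# are compared, so (a, c), (a, d) and (b, d) are never observed directly.
graph, nodes = build_domination_graph([("a", "b"), ("b", "c"), ("c", "d")])
scores = propagate_scores(graph, nodes)
ranking = sorted(nodes, key=scores.get, reverse=True)
print(ranking)  # ['a', 'b', 'c', 'd']
```

In this toy case the walk recovers the full order even though half of the pairs were never compared. The sketch does nothing special about contradictory votes or skewed sampling, which are the harder questions the abstract raises.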
