Tag-based algorithms can predict human ratings of which objects a picture shows

Collaborative tagging platforms allow users to describe resources with freely chosen keywords, so-called tags. The meaning of a tag, as well as the precise relation between a tag and the tagged resource, is left open to the user's interpretation. Human users usually have a fair chance of interpreting this relation, whereas machines do not. In this paper we study the characteristics of the problem of identifying descriptive tags, i.e. tags that relate to visible objects in a picture. We investigate the feasibility of using a tag-based algorithm, i.e. an algorithm that ignores the actual picture content, to tackle the problem. Having shown via an optimal algorithm that a well-performing tag-based algorithm is theoretically feasible, we describe the implementation and evaluation of a WordNet-based algorithm as a proof of concept. These two investigations lead to the conclusion that even relatively simple and fast tag-based algorithms can predict human ratings of which objects a picture shows. Finally, we discuss the inherent difficulty both humans and machines have when deciding whether a tag is descriptive or not. Based on a qualitative analysis, we distinguish between definitional disagreement, difference in knowledge, disambiguation and difference in perception as reasons for disagreement between raters.
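
To make the idea of a tag-based decision concrete, the following is a minimal sketch of one way such a check could look. It is not the algorithm evaluated in the paper: the abstract does not specify the decision rule, so the rule used here (a tag counts as descriptive if its hypernym chain reaches a physical-object root) is an assumption. The small table `TOY_HYPERNYMS`, the root name `PHYSICAL_ROOT`, and the functions `is_descriptive` and `descriptive_tags` are hypothetical stand-ins for a real WordNet lookup.

```python
# Toy hypernym table standing in for WordNet.
# Every entry is an illustrative assumption, not data from the paper.
TOY_HYPERNYMS = {
    "dog": "animal",
    "cat": "animal",
    "animal": "physical_object",
    "car": "vehicle",
    "vehicle": "physical_object",
    "tree": "plant",
    "plant": "physical_object",
    "holiday": "event",
    "event": "abstraction",
    "happiness": "feeling",
    "feeling": "abstraction",
}

# Hypothetical root under which all visible, physical things are filed.
PHYSICAL_ROOT = "physical_object"


def is_descriptive(tag: str) -> bool:
    """Return True if the tag's hypernym chain reaches PHYSICAL_ROOT.

    Only the tag text is used; the picture itself is never inspected,
    which is what makes this a tag-based (content-ignoring) check.
    """
    node = tag.strip().lower()
    seen = set()  # guards against accidental cycles in the table
    while node in TOY_HYPERNYMS and node not in seen:
        seen.add(node)
        node = TOY_HYPERNYMS[node]
    return node == PHYSICAL_ROOT


def descriptive_tags(tags):
    """Keep only the tags judged descriptive, preserving input order."""
    return [t for t in tags if is_descriptive(t)]


if __name__ == "__main__":
    print(descriptive_tags(["Dog", "holiday", "nikon", "tree"]))
```

Tags missing from the table (here `nikon`) are treated as non-descriptive, which is a simplification. A sketch like this also cannot handle ambiguous tags with more than one sense, which is one of the sources of disagreement the abstract names.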
