Personalized trip planning by integrating multimodal user-generated content

We address the problem of record linkage and semantic integration in the context of large collections of user-generated content. These datasets are often large because they contain the contributions of millions of Internet users. We present an approach based on approximate string matching between the metadata associated with such data. The discovered linkages are stored in an ontology for answering queries on the integrated data sources. We demonstrate this approach in Photo Odyssey, an interactive web application that integrates multimodal content from image hosting and travel websites to create a user interface with a graphical trip plan and personalization options. We discuss several practical challenges faced in building such an application: integrating and mining large-scale multimodal user-generated data, resolving semantic heterogeneity, and applying machine learning to match and rank items. Photo Odyssey operates in an online manner without using any previously stored knowledge base. We also describe methods to compute the relevance of images, remove bad data instances and duplicates, perform contextual filtering, and assign a category to uncatalogued images, which together enable an interactive application even on Big Data with real-world characteristics.
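To make the idea of linking records by approximate string matching on metadata concrete, the following is a minimal sketch only, not the method used in Photo Odyssey. The abstract does not name a similarity metric, so Python's standard-library `difflib.SequenceMatcher` is used here as a stand-in; the function names `similarity` and `link_records`, the 0.8 threshold, and the sample titles are all hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] after light normalization.

    Normalization (lowercasing, collapsing whitespace) is an assumption;
    real metadata may need more cleaning.
    """
    norm_a = " ".join(a.lower().split())
    norm_b = " ".join(b.lower().split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()


def link_records(photo_titles, place_names, threshold=0.8):
    """Link each photo title to its best-matching place name.

    A link is kept only when the best score reaches the (hypothetical)
    threshold; titles with no sufficiently close match are left unlinked.
    """
    links = {}
    for title in photo_titles:
        best = max(place_names, key=lambda name: similarity(title, name), default=None)
        if best is not None and similarity(title, best) >= threshold:
            links[title] = best
    return links


# Hypothetical metadata from an image host and a travel site.
photos = ["Eifel Tower", "my cat"]
places = ["Eiffel Tower", "Louvre Museum"]
links = link_records(photos, places)
```

In this sketch a misspelled title can still be linked to the intended place, while an unrelated title falls below the threshold and is dropped. The threshold trades precision against recall and would need tuning on real data.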
