QDaS: Quality driven data summarisation for effective storage management in Internet of Things

Abstract The proliferation of the Internet of Things (IoT) has enabled many interesting applications across several domains, including smart cities. However, the data accumulating from smart IoT devices poses significant storage challenges, while consumers still need relevant, high-quality services. In this paper, we propose QDaS, a novel domain-agnostic framework for effective data storage and management in IoT applications. The framework incorporates a novel data summarisation mechanism that uses an innovative data quality estimation technique. This technique computes the quality of data (based on their utility) without requiring feedback from users of the IoT data or domain knowledge of the data. We evaluate the effectiveness of the proposed QDaS framework using real-world datasets.
