Knowledge-infused and consistent Complex Event Processing over real-time and persistent streams

Emerging applications in the Internet of Things (IoT) and Cyber-Physical Systems (CPS) present novel challenges to Big Data platforms for performing online analytics. Ubiquitous sensors in IoT deployments generate data streams at high velocity that include information from a variety of domains and accumulate to large volumes on disk. Complex Event Processing (CEP) is recognized as an important real-time computing paradigm for analyzing continuous data streams. However, existing work on CEP is largely limited to relational query processing, exposing two distinctive gaps for query specification and execution: (1) infusing the relational query model with higher-level knowledge semantics, and (2) seamless query evaluation across temporal spaces that span past, present and future events. Closing these gaps allows accessible analytics over data streams having properties from different disciplines, and helps span the velocity (real-time) and volume (persistent) dimensions. In this article, we introduce a Knowledge-infused CEP (-CEP) framework that provides domain-aware knowledge query constructs along with temporal operators that allow end-to-end queries to span real-time and persistent streams. We translate this query model to efficient query execution over online and offline data streams, proposing several optimizations to mitigate the overheads introduced by evaluating semantic predicates and by accessing high-volume historic data streams. In particular, we also address temporal consistency issues that arise during fault recovery of query plans that span the boundary between real-time and persistent streams. The proposed -CEP query model and execution approaches are implemented in our prototype semantic CEP engine, SCEPter.
We validate our query model using domain-aware CEP queries from a real-world Smart Power Grid application, and experimentally analyze the benefits of our optimizations for executing these queries, using event streams from a campus-microgrid IoT deployment. Our results show that we are able to sustain a processing throughput of 3,000 events/sec for -CEP queries, a 30 improvement over the baseline and sufficient to support a Smart Township, and can resume consistent processing within 20 secs after stream outages as long as 2 hours.

A semantic CEP model is introduced to query across real-time and persistent streams. The model's analytic capability is illustrated using case studies from Smart Grid. Approaches to translate the model into scalable execution are discussed and evaluated.
