SVALI - Stream VALIdator
This work was funded by the European Commission FP7 and SSF.
Industrial equipment, such as wheel loaders, hydraulic motors, or simulators, generates data streams that engineers and monitoring systems analyze to detect and understand the equipment's behavior. The data volume in these streams is often very high, and advanced computations on the streams are often needed, for example to predict misbehavior in the near future. This requires substantial hardware resources and scalable stream processing.
Data Stream Management Systems (DSMSs) are general systems for processing continuous queries (CQs) that filter, analyze, and transform data streams. A simple CQ can be: "give me the sensor id and the power consumption whenever the power is greater than 100W". CQs are expressed at a high, user-oriented level using a data stream query language.
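To make the simple CQ above concrete, the following is a minimal Python sketch of the same filter over a stream of readings. It is not written in SVALI's query language; the names `Reading`, `sensor_id`, `power_w`, and `high_power` are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Iterator, NamedTuple, Tuple


class Reading(NamedTuple):
    sensor_id: str   # hypothetical field name
    power_w: float   # power consumption in watts


def high_power(stream: Iterable[Reading],
               threshold_w: float = 100.0) -> Iterator[Tuple[str, float]]:
    """Yield (sensor id, power) for each reading whose power exceeds the threshold."""
    for r in stream:
        if r.power_w > threshold_w:
            yield (r.sensor_id, r.power_w)


# A small finite list stands in for an unbounded data stream.
readings = [Reading("s1", 80.0), Reading("s2", 120.5), Reading("s1", 101.0)]
result = list(high_power(readings))
```

Because `high_power` is a generator, each result is produced as soon as the matching reading arrives, which mirrors the continuous nature of a CQ.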
SVALI (Stream VALIdator) extends the distributed DSMS SCSQ with a new stream window data type, user-defined incremental sliding window aggregate functions, non-delayed delivery of sliding window results, parallel stream validation functionality, and a client-server interface for CQs. To enable processing of CQs over very high-volume data streams from many different sources, SVALI has a distributed architecture where many SVALI nodes can be started on different compute nodes. The distributed architecture enables CQs to be processed in parallel, so that expensive computations do not cause unacceptable delays.
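The idea behind an incremental sliding window aggregate can be sketched in Python as follows. This is only an illustration under assumptions, not SVALI's implementation or interface: the window is count-based, the aggregate is a mean, and the name `sliding_mean` is hypothetical. The running total is updated by adding the arriving element and subtracting the expiring one, so the window is never rescanned.

```python
from collections import deque
from typing import Deque, Iterable, Iterator


def sliding_mean(stream: Iterable[float], size: int) -> Iterator[float]:
    """Yield the mean of the last `size` elements, maintained incrementally."""
    window: Deque[float] = deque()
    total = 0.0
    for x in stream:
        # Add the contribution of the arriving element.
        window.append(x)
        total += x
        # Remove the contribution of the element that slid out of the window.
        if len(window) > size:
            total -= window.popleft()
        # Emit a result as soon as the window is full, without further delay.
        if len(window) == size:
            yield total / size


means = list(sliding_mean([1.0, 2.0, 3.0, 4.0], size=2))
```

Each arriving element costs a constant amount of work regardless of the window size, which is the main reason to define aggregates incrementally.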
One approach to detecting abnormal equipment behavior is to develop a mathematical model of that behavior. For example, the following CQ detects abnormal behavior in some equipment: "given a power consumption model computing the theoretical expected power consumption at any point in time, give me the sensor id whenever the difference between the actual power consumption and the theoretical expected power is, on average, greater than 10W during 1 second."
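The model-based CQ above could be sketched in Python roughly as follows. Several details are assumptions made for the sketch and are not taken from SVALI: readings are `(timestamp, sensor id, power)` tuples, the difference is taken as an absolute value, the average is computed per sensor over the readings of the last second, and the names `validate` and `constant_model` are hypothetical.

```python
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Iterable, Iterator, Tuple

# (timestamp in seconds, sensor id, measured power in watts)
Reading = Tuple[float, str, float]


def validate(stream: Iterable[Reading],
             model: Callable[[float], float],
             max_avg_dev_w: float = 10.0,
             window_s: float = 1.0) -> Iterator[str]:
    """Yield a sensor id whenever its average deviation from the model
    over the last `window_s` seconds exceeds `max_avg_dev_w`."""
    windows: Dict[str, Deque[Tuple[float, float]]] = defaultdict(deque)
    sums: Dict[str, float] = defaultdict(float)
    for ts, sensor, power in stream:
        dev = abs(power - model(ts))  # assumption: absolute difference
        win = windows[sensor]
        win.append((ts, dev))
        sums[sensor] += dev
        # Evict readings that are no longer within the time window.
        while win[0][0] <= ts - window_s:
            _, old_dev = win.popleft()
            sums[sensor] -= old_dev
        if sums[sensor] / len(win) > max_avg_dev_w:
            yield sensor


def constant_model(ts: float) -> float:
    """Hypothetical model: the equipment is expected to draw 50 W at all times."""
    return 50.0


stream = [(0.0, "s1", 52.0), (0.5, "s1", 55.0),
          (1.2, "s1", 80.0), (1.4, "s1", 90.0)]
alerts = list(validate(stream, constant_model))
```

As a simplification, the sketch reports on every arriving reading, even before a full second of data has been seen. A stricter variant would wait until the window spans the whole interval.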
In other cases, when there is no such model predefined, a model can be learned based on observing sensor readings during training periods.
The SVALI system provides a general approach to defining the correct behavior of the equipment, either analytically or statistically. The following facilities enable validation of equipment based on data streams from sensors:
- Using SVALI's model-and-validate functions, users can define and install their own analytical models inside the DSMS to validate correct behavior of the data streams. The models are expressed as side-effect free functions (formulas) over streamed data values.
- For applications where no theoretical model can be easily defined, the learn-and-validate functions allow SVALI to learn a model from observed correct behavior while processing a stream, and then to use that learned model for subsequent stream validation.
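The learn-and-validate idea can be illustrated with a minimal Python sketch. The statistical model used here is an assumption chosen for simplicity, not necessarily what SVALI learns: the mean and standard deviation of the first `n_train` readings are taken as the correct behavior, and later readings are flagged when they deviate by more than `k` standard deviations. The name `learn_and_validate` and its parameters are hypothetical.

```python
import statistics
from typing import Iterable, Iterator, List, Tuple


def learn_and_validate(stream: Iterable[float],
                       n_train: int,
                       k: float = 3.0) -> Iterator[Tuple[int, float]]:
    """Learn mean and standard deviation from the first `n_train` readings
    (n_train >= 1), then yield (index, value) for later readings that
    deviate too much from the learned model."""
    training: List[float] = []
    mean = std = 0.0
    for i, x in enumerate(stream):
        if i < n_train:
            # Training period: the readings are assumed to show correct behavior.
            training.append(x)
            if i == n_train - 1:
                mean = statistics.mean(training)
                std = statistics.pstdev(training)
            continue
        # Validation period: compare each reading with the learned model.
        if abs(x - mean) > k * std:
            yield (i, x)


values = [10.0, 12.0, 10.0, 12.0, 11.0, 20.0, 10.5]
anomalies = list(learn_and_validate(values, n_train=4))
```

Learning and validation happen in a single pass over the stream, so no readings need to be stored once the training period has ended.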
C. Xu, D. Wedlund, M. Helgoson, and T. Risch: Model-based Validation of Streaming Data, The 7th ACM International Conference on Distributed Event-Based Systems, DEBS 2013, Arlington, Texas, USA, June 29 - July 3, 2013.
S. Badiozamany, L. Melander, T. Truong, C. Xu, and T. Risch: Grand Challenge: Implementation by Frequently Emitting Parallel Windows and User-Defined Aggregate Functions, Proc. The 7th ACM International Conference on Distributed Event-Based Systems, DEBS 2013, Arlington, Texas, USA, June 29 - July 3, 2013.
S. Badiozamany and T. Risch: Scalable ordered indexing of streaming data, Third International Workshop on Accelerating Data Management Systems Using Modern Processor and Storage Architectures, ADMS'12, Istanbul, Turkey, 2012.
The following persons have contributed to the implementation of the SVALI kernel: Sobhan Badiozamany, Lars Melander, Tore Risch, Thanh Truong, and Cheng Xu.
Tore Risch is responsible for the project, and Cheng Xu is the main technical contributor.