This topic describes the manner in which data is represented and processed in a Microsoft StreamInsight program. It is designed to give you familiarity with the basic concepts associated with complex event processing in StreamInsight. The topic begins by describing data structures and then describes the StreamInsight entities that act on or process the data.
StreamInsight works with three different types of data sequences:
pull - A pull data sequence is an ordered list of objects of the same type, implemented through the IEnumerable<> interface. A consumer of this sequence may access it iteratively using the LINQ-to-Objects framework. For more information, see LINQ to Objects.
push - A push data sequence is an ordered list of objects of the same type, implemented through the IObservable<> interface. The data source pushes the sequence to one or more data consumers that access the data using the Reactive Framework LINQ dialect. For more information, see Reactive Extensions.
temporal stream - A temporal stream is a sequence of events that has a unique set of characteristics. A temporal stream is implemented through the IStreamable<> interface and can be processed using StreamInsight LINQ.
A temporal stream is a specific type of data stream that is recognized by StreamInsight. It is a potentially unending sequence of data in which each event consists of a payload plus a time component that identifies the start and end time of the event. Examples include a stock ticker stream that provides the price of different stocks in an exchange as they change over time, or a temperature sensor stream that provides temperature values reported by the sensor over time.
A temporal stream has the following unique characteristics:
Each event in the stream has a timestamp.
The stream contains special events called Current Time Increment (CTI) events that indicate the completeness of the events in the stream up to that point in time.
Events in the stream observe the CTI model. That is, when a CTI appears in the temporal stream, no more events in the stream will have a timestamp earlier than the CTI timestamp.
The underlying data represented in a temporal stream is packaged into events. An event is the basic unit of data processed by StreamInsight. Each event consists of a header and a payload. An event header contains metadata that defines the event kind and one or more timestamps that define the time interval for the event. The payload is a .NET data structure that holds the data associated with the event. The fields defined in the payload are user-defined and their types are based on the .NET type system.
StreamInsight supports two event kinds: INSERT and CTI (current time increment). The INSERT event kind adds an event with its payload into the event stream, along with a start and end time for the event. The CTI event kind is a special punctuation event with a single field that provides a current timestamp. The CTI event indicates the completeness of the existing events in the stream up to that point in time, and it enables a query to accept and process events whose application timestamps do not correspond to their order of arrival into the query.
For more information about StreamInsight events, and how CTI events are used, see Event Structure.
A StreamInsight program creates and uses five basic types of entities: Source, Sink, Subject, Binding, and Process.
A Source is a data generator.
To support the composition of LINQ queries over sequences and temporal streams, StreamInsight exposes implementations of the IQueryable, IQbservable, and IQStreamable specializations of source types. These interfaces are defined in the System.Linq, System.Reactive.Linq, and Microsoft.ComplexEventProcessing.Linq namespaces respectively (for more information, see IQueryable, IQbservable, and IQStreamable).
A Sink is a data consumer. A sink may be an observer to an IObservable/IStreamable source or it may be an IStreamableSink to an IStreamable source.
Due to the nature of the IEnumerable contract, in which a consumer (enumerator) is obtained directly from the enumerable, there is no concept of an Enumerator sink.
A Subject is both a data producer and consumer. It also serves the purpose of sharing computation and state across multiple producers and consumers. A subject implements both the IObservable and IObserver interfaces, enabling it to both subscribe to observable sources and accept observer subscriptions.
Subjects do not support the IStreamable interface directly.
A Binding is an executable composition over sources, sinks, or subjects. It is the logic that is used to connect sources to sinks.
A Process is a named execution of a binding.