W. Roy Schulte
The opinions expressed here are my own and may not represent the position of my employer or any other company.
Several labels are used to describe software products that process event streams and compute complex events, notably including:
• Event Stream Processing (ESP) platforms
• Event Processing Platforms (EPPs)
• Distributed Stream Computing Platforms (DSCPs)
• Complex Event Processing (CEP) platforms or engines
These labels are not quite synonyms – there are subtle differences in their respective definitions. However, there is a huge overlap, so a product may meet the definition of two, three or even all four of the labels.
For most purposes, it makes sense to lump all of these into a single, combined product category. They all process event streams and therefore can be called “event stream processing” (ESP) platforms. However, it is sometimes useful to understand why the other labels were created, and the differences that exist, or once existed, between the categories.
Here is an explanation of the terminology, starting from the core definitions in the glossary elsewhere on the www.complexevents.com web site.
An event object (for example, a message) is a report of anything that happens (an event).
An event stream is a linearly ordered sequence of events (event objects arranged in some order, typically by time).
Examples of sources of event streams include telecommunication networks, smart electric grids, Twitter, Facebook, financial markets, news feed providers, web navigation tracking tools, weather and earthquake sensors, set top boxes, national intelligence sources, oil wells and oil drilling equipment, and sensors in moving vehicles and other machines in the Internet of Things (IoT).
Event stream processing (ESP) is computing on inputs that are event streams.
An ESP platform is a software system that performs ESP. EPPs, DSCPs and CEP platforms are all ESP platforms because their input is event streams.
A complex event is an event that summarizes, represents, or denotes a set of other events.
A complex event is more abstract and generally more useful for decision-making purposes than the unprocessed, base events that may be used to generate that complex event.
CEP is defined as computing that performs operations on complex events, including reading, creating, transforming, abstracting, or discarding them.
Most – but not all – ESP involves CEP. Very simple ESP applications, such as loading event streams into databases, may filter, syntactically transform and store each event individually and statelessly. That may not involve CEP. However, most ESP applications, including some whose only purpose is to load event streams into databases, generate complex events. For example, “There were 23 tweets about citizen data science in the last 15 minutes” is a rudimentary aggregate complex event that is generated by counting the tweets that contain those words received during the time window. More-sophisticated complex events are derived by correlating events from multiple streams and detecting patterns. For example, “Event A occurs and Event B occurs but Event C does not occur within a ten minute window” specifies a complex event.
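The tweet-counting example above can be sketched in a few lines of ordinary Python. This is a minimal illustration of windowed aggregation, not any product’s API; the function name and event shape are invented for the example.

```python
from datetime import datetime, timedelta

def count_in_window(events, keywords, window=timedelta(minutes=15), now=None):
    """Emit a rudimentary aggregate complex event: the number of events in
    the last `window` whose text mentions all of `keywords`.
    `events` is an iterable of (timestamp, text) pairs."""
    now = now or datetime.utcnow()
    cutoff = now - window
    matches = [
        (ts, text) for ts, text in events
        if ts >= cutoff and all(k in text.lower() for k in keywords)
    ]
    # The returned dict is itself a complex event: it summarizes,
    # and is more abstract than, the base events it was computed from.
    return {"topic": " ".join(keywords), "count": len(matches), "window_end": now}
```

The point of the sketch is that the output is a new, more abstract event derived from many base events, which is exactly what distinguishes CEP from stateless per-event filtering.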
EPPs are ESP platforms that provide high-level programming models, including expressive event processing languages (EPLs) and built-in functions for event filtering, correlation and abstraction. EPLs with built-in analytical functions reduce the time and effort required to develop and modify CEP applications, particularly those that involve detecting complicated patterns in one or more event streams. Most of the established ESP products are EPPs, including EsperTech’s Esper, IBM’s Operational Decision Management (ODM), Software AG’s Apama, TIBCO’s BusinessEvents and StreamBase, and numerous other products.
Confusingly, some DBMS-like EPPs in the early days of CEP were called “event stream processors” (ESPs) to distinguish them from rule- or scripting-language-based EPPs. ESP-EPPs applied declarative operators to groups of events in moving time windows, and often were based on SQL-like EPLs. By contrast, rule- and scripting-language-based EPPs applied logic to individual events in ways that enabled them to more easily express the logic to detect convoluted temporal or spatial patterns. Over time, many of the ESP-EPPs added pattern operators and rule- and scripting-based EPPs added stream operators, blurring the distinctions. The early use of the term “ESP” in ESP-EPPs is narrower than the general definition of ESP described above.
DSCPs are ESP platforms that provide explicit support for distributing computation across multiple nodes in a cluster, and adding more nodes when the volume of work ramps up.
This design is reminiscent of Hadoop’s MapReduce architecture, but differs because a DSCP runs continuously in near-real-time, whereas traditional Hadoop runs in batch on request. Examples of popular DSCPs include Apache Samza, Spark Streaming, and Storm. Open source DSCPs are typically programmed in Java, Scala or another general-purpose programming language rather than with an EPL.
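The core scale-out mechanism a DSCP automates can be sketched as hash-partitioning a stream by a key, so that events sharing a key always reach the same worker and per-key state stays local (Storm calls this a “fields grouping”). This is an illustrative sketch, not any DSCP’s actual API; the function name and event shape are invented.

```python
import hashlib
from collections import defaultdict

def route(events, num_workers, key):
    """Distribute events across worker partitions by hashing one field.
    Adding workers spreads the load; events with the same key value
    always land in the same partition."""
    partitions = defaultdict(list)
    for event in events:
        # md5 gives a stable hash across processes, unlike Python's hash().
        h = int(hashlib.md5(str(event[key]).encode()).hexdigest(), 16)
        partitions[h % num_workers].append(event)
    return partitions
```

In a real DSCP the partitions are separate processes or nodes and the routing happens continuously, but the contract is the same: the framework, not the application, decides where each event runs.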
CEP platforms or engines are software systems that facilitate the implementation of CEP.
All ESP platforms, including all DSCPs and EPPs, have features that facilitate CEP so they are all CEP platforms. Admittedly, basic DSCPs that are programmed in Java or Scala don’t give the developer as much help as an EPP does in terms of high-level programming models and built-in functions. Nevertheless, most applications on DSCPs involve CEP. For example, the original version of Storm was developed at BackType in 2011 to read tweet streams to help enterprises understand their impact on social media. As explained above, even a simple count of tweets on a topic is a complex event. With enough effort, an adroit developer can build very intricate CEP applications on Storm using Java or other programming languages. It is getting easier to build sophisticated CEP applications on Storm because of the availability of Trident (for handling state), Impetus’ StreamAnalytix, and other value-adding software facilities.
On the other hand, it was fairly difficult to scale the early EPPs up to handle high throughput workloads because they lacked built-in, DSCP-like support for parallel processing. Some developers built EPP configurations that supported 100,000s of events per second by manually partitioning the work across multiple engines running on several multi-core processors in a cluster, but this required extra effort. It is getting easier to build scalable CEP applications on EPPs as vendors add better clustering and DSCP-like parallel processing capabilities to their products.
DSCPs and EPPs face the same inherent challenges and laws of physics when scaling up. Work can be partitioned horizontally relatively easily by linking a series of engines in sequence (a pipeline) and forwarding event streams through high-speed communication links. However, it can be much harder to partition the work vertically across multiple engines because some business problems require correlation across multiple streams and the logic is not data parallel (i.e., multiple processing elements may need to see the same incoming events). With any DSCP or EPP platform, developers have to know what they’re doing if they are designing an application that requires high throughput (10,000s, 100,000s, or more events per second) or low latency (less than a second, or even less than a millisecond).
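The “Event A and Event B but not Event C within a ten minute window” pattern mentioned earlier illustrates why such logic is not trivially data parallel: the detector must see every relevant A, B and C, so the work can only be split by a correlation key, not arbitrarily. A minimal stateful sketch in plain Python (invented function name, simplified window semantics):

```python
from datetime import datetime, timedelta

def detect_pattern(events, window=timedelta(minutes=10)):
    """Detect 'A occurs and B occurs but C does not occur' within `window`
    of each A. `events` is a time-ordered list of (timestamp, type) pairs.
    Because the absence of C matters, the detector must observe all events
    for a given key -- splitting them across nodes would break the logic."""
    matches = []
    for i, (ts_a, typ_a) in enumerate(events):
        if typ_a != "A":
            continue
        in_window = [(ts, t) for ts, t in events[i + 1:] if ts - ts_a <= window]
        types = {t for _, t in in_window}
        if "B" in types and "C" not in types:
            matches.append(ts_a)
    return matches
```

Note the negation (“C does not occur”) is what forces statefulness: no single event triggers the match; the engine must wait out the window before concluding C is absent.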
A product can be both an EPP and a DSCP because EPLs and scalability architectures are orthogonal concepts. Indeed, the first known commercial DSCP product, IBM’s InfoSphere Streams (announced in 2009), is also an EPP because of its high-level programming model and EPL (called Streams Processing Language (SPL)). A growing number of other established EPPs and some of the newer commercial ESP platforms, including DataTorrent Real-time Streaming (RTS) and SQLstream s-Server, may also be both EPPs and DSCPs.
There are definitional differences between EPPs and DSCPs, but the products belong in one, combined category (ESP platform) because they play the same fundamental role: they process event streams and enable CEP. However, there are huge differences in operating characteristics among the individual products. Gartner Inc. is currently tracking 32 products in this field. Scalability, latency, ease of application development, analytic richness, testing, debugging, management, monitoring, failover, and recovery capabilities range from robust to weak among the products in this space. Some products are open source, some are not. Vendor support, price and track records also vary significantly. As with any software, the best choice for any particular application depends on the nature of the local business requirements.