Realtime Data Processing at Facebook

by Guoqiang Jerry Chen et al in ACM SIGMOD

ABSTRACT
Realtime data processing powers many use cases at Face-book, including realtime reporting of the aggregated, anonymized voice of Facebook users, analytics for mobile applications, and insights for Facebook page administrators. Many companies have developed their own systems; we have a realtime data processing ecosystem at Facebook that handles hundreds of Gigabytes per second across hundreds of data pipelines.
Many decisions must be made while designing a realtime stream processing system. In this paper, we identify very important design decisions that affect their ease of use, performance, fault tolerance, scalability, and correctness. We compare the alternative choices for each decision and contrast what we built at Facebook to other published systems.
Our main decision was targeting seconds of latency, not milliseconds. Seconds is fast enough for all of the use cases we support and it allows us to use a persistent message bus for data transport. This data transport mechanism then paved the way for fault tolerance, scalability, and multiple options for correctness in our stream processing systems Puma, Swift, and Stylus. We then illustrate how our decisions and systems satisfy our requirements for multiple use cases at Facebook. Finally, we reflect on the lessons we learned as we built and operated
these systems.  Read the full paper.

DCL: This is a good overview of design decisions at Facebook in building a realtime CEP system to meet their requirements. Thanks to Roy Schulte of Gartner for bringing it to my attention.  It is worth a read. Too bad Chen et al (9 authors in all) don’t seem to be aware that they are building a specialized CEP system!

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.