SESSION + Live Q&A

Streaming SQL Foundations: Why I ❤ Streams+Tables

What does it mean to execute robust streaming queries in SQL? What is the relationship of streaming queries to classic relational queries? Are streams and tables the same thing conceptually, or different? And how does all of this relate to the programmatic frameworks we’re all familiar with? This talk will address all of those questions in two parts.

First, we’ll explore the relationship between the Beam Model (as described in The Dataflow Model paper and the Streaming 101 and Streaming 102 blog posts) and stream & table theory (as popularized by Martin Kleppmann and Jay Kreps, amongst others, but essentially originating out of the database world). It turns out that stream & table theory does an illuminating job of describing the low-level concepts that underlie the Beam Model.

Second, we’ll apply our clear understanding of that relationship towards explaining what is required to provide robust stream processing support in SQL. We’ll discuss concrete efforts that have been made in this area by the Apache Beam, Calcite, and Flink communities, compare them to other offerings such as Apache Kafka’s KSQL and Apache Spark’s Structured Streaming, and talk about new ideas yet to come. In the end, you can expect to have a much better understanding of the key concepts underpinning data processing, regardless of whether that data processing is batch or streaming, SQL or programmatic, as well as a concrete notion of what robust stream processing in SQL looks like.


Speaker

Tyler Akidau

Engineer @Google & Founder/Committer on Apache Beam

Tyler Akidau is a senior staff software engineer at Google, where he is the technical lead for the Data Processing Languages & Systems group, responsible for Google's Apache Beam efforts, Google Cloud Dataflow, and internal data processing tools like Google Flume, MapReduce, and MillWheel....

Location

Whittle, 3rd flr.

Track

Stream Processing in the Modern Age

Topics

Apache Kafka, Stream Processing, SQL, Apache Spark, Apache Beam, Google, Cloud Dataflow, DataEng

From the same track

SESSION + Live Q&A Stream Processing

Next Steps in Stateful Streaming with Apache Flink

Come learn how Apache Flink is making stateful stream processing even more expressive and flexible to support applications in streaming that were previously not considered streamable. Over the last few years, data stream processing has redefined how many of us build data pipelines. Apache Flink is...

Stephan Ewen

Committer @ApacheFlink, CTO @dataArtisans

SESSION + Live Q&A event sourcing

Drivetribe: A Social Network on Streams

Drivetribe is the world's biggest motoring destination, as envisioned by Jeremy Clarkson, Richard Hammond, and James May. Built on top of the Event Sourcing/CQRS pattern, the Drivetribe platform uses Apache Kafka as its source of truth and Apache Flink as its processing backbone. This talk aims...

Aris Koliopoulos

CTO @Drivetribe

Hamish Dickson

Backend engineer @Drivetribe

SESSION + Live Q&A Scale

Lessons From a ~Yearly Re-Write of a Data Pipeline

Every year, we’ve set ourselves a goal of dramatically improving the performance and efficiency of our core data pipelines. We’ve done this by re-writing, effectively from scratch, the streaming pipelines that are responsible for processing over 120,000 events per second to deliver realtime...

Jibran Saithi

Lead Architect @Qubit

SESSION + Live Q&A Stream Processing

Streaming Reactive Systems & Data Pipes w. squbs

Reactive libraries are nothing new to the JVM. Reactive Streams as an SPI has even made its way into Java 9. However, their uses within microservice components are still for relatively narrow purposes like service orchestration. But we think differently. Our whole presence and universe can be...

Akara Sucharitakul

Principal MTS, Architect @PayPal

Anil Gursel

Software Engineer @PayPal

UNCONFERENCE + Live Q&A Stream Processing

Stream Processing Open Space
