SESSION + Live Q&A
Cloud-Native and Scalable Kafka Architecture
Kafka, as a distributed stateful service, faces serious stability and scalability challenges in cloud environments, which favor stateless services. As a cluster grows with traffic, it runs into problems with data balancing, high consumer data fan-out, and time-consuming scale-up and update processes. Failover is necessary to deal with cluster disasters but is hard to get right.
At Netflix, we address these issues by running many smaller, mostly “immutable” Kafka clusters with limited state changes. We will prove the merit of this architecture in mathematical terms and illustrate how the architecture and additional tooling help us improve availability, scale, and failover. Our Kafka service, which is composed of over 3000 brokers globally, is capable of processing over one trillion messages and petabytes of data per day with over 99.99% availability.
To make this multi-cluster architecture feasible, we also developed smart clients that conform to the standard Kafka producer/consumer interface but can interact with multiple clusters at the same time. We also built additional services to orchestrate cluster and topic changes. The talk will go over the design principles of these clients and services.
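As a rough illustration of the smart-client idea, the sketch below shows a producer-like wrapper that exposes a single send() call while routing each topic to one of several clusters and falling back to the next cluster when one is unavailable. This is not Netflix's implementation: the class name MultiClusterProducer, the routes table, and the reroute() method are all hypothetical, and plain Python callables stand in for real per-cluster Kafka producers.

```python
from typing import Callable, Dict, List

# Stand-in for a real per-cluster Kafka producer (assumption for illustration):
# a callable that takes (topic, value) and raises ConnectionError on failure.
SendFn = Callable[[str, bytes], None]


class MultiClusterProducer:
    """Hypothetical wrapper: one send() call, several small clusters behind it."""

    def __init__(self, clusters: Dict[str, SendFn], routes: Dict[str, List[str]]):
        self.clusters = clusters  # cluster name -> send function
        self.routes = routes      # topic -> ordered cluster names, primary first

    def send(self, topic: str, value: bytes) -> str:
        """Try the topic's clusters in order; return the name of the one that accepted."""
        errors = []
        for name in self.routes[topic]:
            try:
                self.clusters[name](topic, value)
                return name
            except ConnectionError as exc:
                # Remember the failure and fall through to the next cluster.
                errors.append((name, str(exc)))
        raise RuntimeError(f"all clusters failed for topic {topic!r}: {errors}")

    def reroute(self, topic: str, cluster_names: List[str]) -> None:
        """Called by a (hypothetical) orchestration service when topics move."""
        self.routes[topic] = list(cluster_names)


if __name__ == "__main__":
    accepted = []

    def healthy(topic: str, value: bytes) -> None:
        accepted.append((topic, value))

    def unavailable(topic: str, value: bytes) -> None:
        raise ConnectionError("cluster unavailable")

    producer = MultiClusterProducer(
        clusters={"cluster-a": unavailable, "cluster-b": healthy},
        routes={"events": ["cluster-a", "cluster-b"]},
    )
    print(producer.send("events", b"payload"))
```

The caller only ever sees send(), so moving a topic between clusters becomes a routing-table change rather than an application change, which is the property the abstract attributes to the smart clients.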
Speaker
Allen Wang
Senior Software Engineer - Cloud Platform @Netflix
Allen Wang is currently with the Netflix Real Time Data Infrastructure team, where he has made significant contributions to Kafka and data infrastructure in AWS. He is a contributor to both Apache Kafka and NetflixOSS and the author of Kafka's rack-aware partition assignment. He spoke in 2016 and 2017...
From the same track
Taming Distributed Stateful Pets With Kubernetes
So you've mastered Kubernetes for scheduling and scaling your stateless applications. Your pager has been quieter, life is good. But what about the carefully configured database clusters running on expensive dedicated infrastructure? (And the expensive sysadmin you're paying to maintain it!) In...
Matthew Bates
Co-founder at UK Kubernetes Company Jetstack
James Munnelly
Solutions Engineer @Jetstack
Scaling Uber's Elasticsearch Clusters
Uber's Marketplace is the algorithmic brain behind Uber's ride-sharing services, and the brain needs an immense amount of real-time data to make timely and sound decisions. Uber's Marketplace Intelligence team has been using Elasticsearch as a real-time OLAP database to serve thousands of internal...
Danny Yuan
Real-time Streaming Lead @Uber
The Future of Distributed Databases Is Relational
Years ago when working at Amazon on shopping cart infrastructure and the precursor to DynamoDB, my co-founder and I realized that while distributed key value stores were useful for a few use-cases, we missed many of the benefits of relational databases: transactions, joins, and the power of the...
Sumedh Pathak
VP Engineering & Co-Founder @CitusData
Real-Time Decisions Using ML on the Google Cloud Platform
Ocado Technology is providing a full solution to put the world’s retailers online using the cloud, robotics, AI and IoT. Processing tens of thousands of orders every day, we generate millions of events every minute, leading to a huge amount of data to be managed. We will present how this Big Data...
Carlos Garcia
Ocado Smart Platform Fraud Team Lead
Przemyslaw Pastuszka
ML Engineer @Ocado