SESSION + Live Q&A

Growing Resilience: Serving Half a Billion Users Monthly at Condé Nast

Serving over half a billion monthly customers while keeping service availability high is a monumental task. Condé Nast operates in nearly 40 countries and is best known for its portfolio of household brands such as Vogue, Wired, Vanity Fair, and The New Yorker. Our globally distributed platforms run on more than 15 Kubernetes clusters in more than 5 geographic regions and rely on a multi-CDN / edge architecture, a microservices approach, multi-tenanted web applications, and high-throughput data streaming architectures. These are just some of the technical challenges that we take on and operate on a daily basis.
 
Many of us are facing these challenges - so how do we cultivate and grow our organisation's capacity to adapt to the unknown in ever-fluid, dynamic socio-technical systems? In this talk I will outline how Condé Nast practices chaos engineering, where it fits within the already established testing and verification ecosystem, and what emergent practices and tools are on the horizon. Observability is at the core of understanding and drawing inferences about our systems. We’ll dispel some myths about touted “Best Practices” for observability, looking beyond them to emergent trends in signals, metrics, and tracing. Last but not least, I’ll cover how to build up your organisation’s true superpower: Human Resilience.

Speaker

Crystal Hirschorn

VP Engineering, Global Strategy & Operations @CondeNast

Crystal Hirschorn is currently VP Engineering, Global Strategy & Operations at Condé Nast, which is best known for its portfolio of global brands, including Vogue, Wired, Vanity Fair, The New Yorker and many more. She oversees a globally distributed engineering organisation and leads the technical...


Location

Fleming, 3rd flr.

Track

Chaos and Resilience: Architecting for Success

Topics

Incident Management, Site Reliability Engineering, Resilient Systems, Interview Available

From the same track

SESSION + Live Q&A Interview Available

Better Resilience Adoption through UX

Too often, attempts to bring resilience engineering to an organization fall flat. Perhaps there’s some initial interest, but that wavers under the crushing weight of JIRA queues and sprint reviews. The tools are there but no one’s using them. This session will go over three case...

Randall Koutnik

UI Engineer

SESSION + Live Q&A Interview Available

Preparing for the Unexpected

Convincing engineers to be on-call isn’t always straightforward. In 2019 the Customer Products group at the Financial Times set out to make their out-of-hours support process more sustainable after losing a number of people from their on-call team. In this talk you’ll discover how to...

Samuel Parkinson

Principal Engineer @FinancialTimes

SESSION + Live Q&A Incident Management

How Many Is Too Much? Exploring Costs of Coordination During Outages

Service outages can attract a lot of attention from a wide range of participants - particularly when the service is for a business-critical function. These ‘stakeholders’ represent multiple roles with different experience, responsibilities, expertise and knowledge about how the system...

Laura Maguire

Cognitive Systems Engineer & Researcher

SESSION + Live Q&A Incident Management

Rethinking How the Industry Approaches Chaos Engineering

In order to determine and envision how to achieve reliability and resilience that drive our businesses forward, organizations must be able to look back at past blunders unobscured by hindsight bias. Resilient organizations don’t take past successes as a reason for confidence. Instead, they...

Nora Jones

Senior Developer/ Engineer
