SESSION + Live Q&A
An Engineer's Guide to a Good Night's Sleep
As organisations look to empower engineers more and embrace DevOps practices, the support role has changed considerably too. Developers are moving from being purely third-line support to working more collaboratively with engineers and operational staff. And as we move to cloud-native microservice solutions, the increased complexity and diversity of our production landscape means operational staff may well rely more heavily on the engineers, particularly out of hours.
I have spent the last 18 years working across a wide range of industries, using many different technologies and approaches. Having worked on everything from trading applications to content enrichment APIs, I have seen many approaches and processes that try to minimise the operational support burden on developers.
In this talk, I will explore some of my top approaches and techniques to help reduce the risk of that dreaded 3am call! You will gain practical insight into how to handle failure in today's more complex distributed microservice systems. This will include looking at approaches to resiliency, understanding your system, understanding the requirements for fault tolerance, and the developer mindset all of this requires. I will pepper the talk with real-world examples, and the occasional war story along the way too.
Speaker
Nicky Wrightson
Ventures CTO @blenheimchalcot
Nicky has worked as an engineer for over 20 years across many industries. She is currently working as Ventures CTO for Blenheim Chalcot, a venture builder which believes in investing not just funds but also knowledge and experience, ideas and infrastructure to build new sustainable...
From the same track
Building Resilient Serverless Systems
In this brave new world of serverless, we entrust our vendors with keeping the infrastructure up and running. However, when even cloud behemoths like Amazon Web Services and Google Cloud have outages and failures, how can we build resilient systems? John Chapin explains how to use...
Johnathan Chapin
Cloud Technology Consultant with an expertise in Serverless Computing
Learning From Chaos: Architecting for Resilience
In this talk Russ Miles, CEO of ChaosIQ, will share how leading organisations are successfully adopting chaos engineering to encourage a mindset of "architecting for resilience". Through chaos engineering, architects are able to establish a true "learning system" where everyone is involved in...
Russell Miles
CEO of @chaosiqio
How Condé Nast Succeeds by a Culture That Embraces Failure
Systems architectures are increasingly diverse to serve the growing demands for scalability, fault tolerance, isolation, and extensibility. But the trade-off is ever more complex software to operate and maintain, often with no single shared view of the entire design. This is especially true with the...
Crystal Hirschorn
VP Engineering, Global Strategy & Operations @CondeNast
Amplifying Sources of Resilience: What Research Says
Building robust software systems means anticipating how failures may occur with components and subsystems and developing answers to the question: “What is needed for the design of systems that prevents or limits catastrophic failure?” Investing in, developing, and...
John Allspaw
DevOps/Resilience Engineering Thought Leader, Previously CTO @Etsy & Co-founder of @AdaptiveCLabs