Chaos Engineering
Modern software-based services are implemented as distributed systems with complex behavior and failure modes. Chaos engineering uses experimentation to ensure system availability. Netflix engineers have developed principles of chaos engineering that describe how to design and run experiments.
Chaos Engineering, in InfoQ. Retrieved 2/24/2018. https://www.infoq.com/articles/chaos-engineering
Presentations
The Scientific Method for Testing System Resilience
Do you remember the Scientific Method from elementary school science class? It's time to dust off that knowledge and use it to your advantage to test your IT systems! In this session, you'll be re-introduced to the Scientific Method, and learn how Vanguard's software engineers and IT...
How to Test Your Fault Isolation Boundaries in the Cloud
Will my system keep working when a server fails? When a data center goes offline? When a service dependency is unavailable? Availability calculations for redundant components require that those components are independent and autonomous of each other. But modern-day systems are complex, exhibiting...
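The independence requirement mentioned in this abstract can be illustrated with a small calculation. The sketch below is not taken from the talk; the function names and the example figures are assumptions chosen for illustration. It uses the standard formula for redundant components in parallel, 1 - (1 - a)^n, which holds only if the replicas fail independently, and then shows how a single shared dependency caps the result.

```python
def parallel_availability(component_availability: float, replicas: int) -> float:
    """Availability of `replicas` redundant components in parallel.

    Assumes failures are fully independent: the group is down only
    when every replica is down at the same time.
    """
    if not 0.0 <= component_availability <= 1.0:
        raise ValueError("component_availability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return 1.0 - (1.0 - component_availability) ** replicas


def availability_with_shared_dependency(
    component_availability: float, replicas: int, shared_availability: float
) -> float:
    """Same redundant group, but every replica needs one shared dependency.

    The shared dependency (hypothetical, e.g. a common configuration
    service) is in series with the group, so it bounds the total.
    """
    if not 0.0 <= shared_availability <= 1.0:
        raise ValueError("shared_availability must be between 0 and 1")
    return shared_availability * parallel_availability(component_availability, replicas)


if __name__ == "__main__":
    # Illustrative figures only: two replicas at 99% each,
    # with and without a shared dependency at 99.9%.
    print(parallel_availability(0.99, 2))
    print(availability_with_shared_dependency(0.99, 2, 0.999))
```

Under these assumptions, the redundant pair alone looks far more available than either replica, but once both replicas rely on the same dependency the overall figure can never exceed that dependency's availability. Testing fault isolation boundaries is one way to find out whether such hidden coupling exists.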
Chaos Engineering Observability with Visual Metaphors
Observability is key in operating a system in production; it’s required during an incident, when an operator has to interrogate, inspect, and piece together what happened to avoid a similar event. In those scenarios, chaos engineering and observability are closely connected, providing...
Interviews
The Scientific Method for Testing System Resilience
Christina, what is the focus of your work these days?
Right now, my primary focus is the staffing, onboarding and subsequent education of site reliability engineers for Vanguard. So I handle everything from what it means to be a site reliability engineer in the day-to-day, what tools and technologies they'll need to be familiar with and how to best get them up to speed. But also on...
Chaos Engineering Observability with Visual Metaphors
What is the focus of your work these days?
I am a Cloud Infrastructure Engineer at Google. Although I interact with partners, clients and sales teams, my work is very technical; my daily activities include implementing Infrastructure and AppDev solutions in GCP. Every day I am practicing and getting experience with DevOps, SRE, Application Development, Developer Operations,...