SESSION + Live Q&A

Do You Really Know Your Response Times?

With the recent surge in highly available microservices with high incoming traffic, it is becoming more and more important to know how your service is performing right now and to be able to diagnose issues in production quickly. It took a while for us to understand how to produce meaningful graphs and alerts that help us truly understand our application performance.

We initially found that most developers did not understand what they were measuring and that many of the graphs caused confusion. In this talk I show how we collect application performance metrics at Sky.

I focus on the use of histogram metrics to monitor response times, explain how reservoir sampling can help, and show the trade-offs among reservoir types. Finally, I illustrate, with real-world examples, some good and bad practices when monitoring response times.
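
As a rough illustration of the reservoir sampling idea mentioned above, here is a minimal Python sketch of a uniform reservoir (the classic "Algorithm R") with a nearest-rank percentile lookup. It is not taken from the talk or from any particular metrics library; the class name `UniformReservoir` and its methods are hypothetical, chosen only for this example.

```python
import math
import random


class UniformReservoir:
    """Hypothetical fixed-size uniform sample of a stream (Algorithm R)."""

    def __init__(self, size, rng=None):
        self.size = size          # maximum number of samples kept
        self.count = 0            # total number of values seen so far
        self.values = []          # the current sample
        self.rng = rng or random.Random()

    def update(self, value):
        """Record one observation, e.g. a response time in milliseconds."""
        self.count += 1
        if len(self.values) < self.size:
            # Reservoir not yet full: keep every value.
            self.values.append(value)
        else:
            # Keep the new value with probability size / count by
            # overwriting a uniformly chosen slot.
            j = self.rng.randrange(self.count)
            if j < self.size:
                self.values[j] = value

    def percentile(self, q):
        """Nearest-rank percentile (0 < q <= 100) of the sampled values."""
        if not self.values:
            raise ValueError("no samples recorded")
        ordered = sorted(self.values)
        rank = max(1, math.ceil(q * len(ordered) / 100))
        return ordered[rank - 1]


if __name__ == "__main__":
    reservoir = UniformReservoir(size=1000, rng=random.Random(42))
    for _ in range(100_000):
        # Simulated response times; real data would come from the service.
        reservoir.update(random.expovariate(1 / 50))
    print("p50:", reservoir.percentile(50))
    print("p99:", reservoir.percentile(99))
```

A uniform reservoir like this one gives every value seen since start-up the same chance of being in the sample, so its percentiles react slowly to recent changes. Reservoirs that favour recent data or cover a fixed time window behave differently, which is one kind of trade-off between reservoir types.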



Speaker

Daniel Rolls

Collecting and Interpreting Large-Scale Data Collected @SkyUK

Daniel Rolls is a senior developer at Sky, where he is responsible for building web services for over-the-top delivery of video streams. Prior to joining Sky, Daniel did a PhD in Computer Science and worked for various organisations including Xerox and The University of Hertfordshire. Daniel...

