Cloud Native Computing Foundation / Observability Practitioners Summit 2019 (San Diego)

These are all the meetings we have in "Observability Practi…" (part of the organization "Cloud Native Computi…"). Click into individual meeting pages to watch the recording and search or read the transcript.

29 Nov 2019

Join us for Kubernetes Forums Bengaluru and Delhi - learn more at kubecon.io

Don't miss KubeCon + CloudNativeCon 2020 events in Amsterdam March 30 - April 2, Shanghai July 28-30 and Boston November 17-20! Learn more at kubecon.io. The conference features presentations from developers and end users of Kubernetes, Prometheus, Envoy, and all of the other CNCF-hosted projects

A Picture is Worth 1,000 Traces - Steve Flanders, Splunk & Yuri Shkuro, Uber Technologies

Distributed tracing has emerged as the go-to solution for understanding what’s going on in the ever-changing cloud native architectures. A single trace can reveal many things: network latencies, time spent in databases, a service spinning idly, etc., but finding the right trace among billions that demonstrates a problem in a large distributed application is very hard. By looking at traces in aggregate, we can eliminate the need to state and validate hypotheses, and instead answers start to emerge naturally, especially when we use creative visualizations that put our visual cortex to work without overloading it with useless information. This talk will present the power of aggregate analysis of distributed traces by highlighting its applications beyond performance troubleshooting.
  • 2 participants
  • 32 minutes
microservices
microservice
ubers
architectures
providers
backends
deployments
complexity
devops
bottleneck

29 Nov 2019

Wrap-up
  • 9 participants
  • 28 minutes
discussions
researchers
conferences
university
having
contribution
thanks
expectations
70
ted

29 Nov 2019

Dynatrace Sponsored Session - Observability Where are We Headed? - Alois Reitbauer, Dynatrace

Observability is helping us to move beyond the traditional paradigm of monitoring. Companies are looking for more answers and gathering more data than traditional alerting can provide. On our journey we learned that simply having more data has just as many challenges. Ultimately, what you do with those insights from the data provides the value. Let’s look at some of these challenges and how they can be addressed.
  • 1 participant
  • 13 minutes
monitoring
observe
observability
observer
monitors
important
aware
analyzing
automation
assistance

29 Nov 2019

LeitMotif: An Abstraction for Debugging Distributed Applications - Mania Abdi, Northeastern University

Abstractions, such as APIs, allow developers to build complex distributed applications out of smaller building blocks. In contrast, there are very few abstractions available to limit the amount of complexity engineers must deal with when diagnosing problems in production applications. This mismatch means that diagnosis will continue to become more challenging as systems continue to scale. We present the workflow motif abstraction, instantiations of which capture frequent or important processing patterns observed in the workflow of requests. We argue that use of motifs can make existing diagnosis techniques more powerful and enable new use cases. We discuss features needed from distributed tracing infrastructures to generate useful motifs, progress on modifying frequent-subgraph mining algorithms to identify motifs from traces, and initial experiences using motifs to debug problems.
  • 4 participants
  • 26 minutes
abstractions
abstraction
debugging
debug
implementation
workflow
distributed
complexity
systems
virtualization

29 Nov 2019

LightStep Sponsored Session: Observability for Deep Systems - Spoons (aka Daniel Spoonhower), LightStep

Software architectures have evolved: applications are not just getting bigger but scaling deeper. Observability tools must adapt to this new environment or leave developers with lots of responsibility but little control. I'll describe deep systems and where they came from as well as the opportunity that they have created for observability practitioners. All this in only 10 minutes!
  • 1 participant
  • 9 minutes
observability
observabillity
microservices
scale
understanding
important
huge
efficient
platforms
spoonhour

29 Nov 2019

New Relic Sponsored Session - Mike Panchenko, New Relic

At New Relic, we’re going all in with Kubernetes. That doesn’t just mean delivering features to customers that allow them to observe and monitor Kubernetes, but also embracing Kubernetes as the de facto standard for orchestrating workloads running on the entire New Relic data platform.
This lightning talk will cover the trials and tribulations of planning and migrating New Relic’s massively scaled distributed database (a database that processes up to 1.5 billion data points a minute) to Kubernetes. You’ll learn how monitoring and observability, as well as the tooling created for spreading our workloads out over many heterogeneous clusters, have been critical for the success of the migration thus far. We will also share our perspective about what to expect and be prepared for in the future of this fast-growing space.
  • 1 participant
  • 10 minutes
observability
functionality
infrastructure
platforms
technical
relic
ization
kubernetes
information
manager

29 Nov 2019

Pythia: An Automated, Cross-layer Instrumentation Framework for Diagnosing Performance Problems in Distributed Applications - Emre Ates, Boston University

It is extremely difficult to understand where to enable instrumentation a priori to help diagnose problems that may occur in the future. We present Pythia, an automated cross-layer instrumentation framework, which explores the space of possible instrumentation choices and enables instrumentation needed to diagnose a newly-observed problem in production systems. Pythia builds on distributed tracing and uses statistical techniques to identify where instrumentation is needed. This talk will discuss 1) the scalable design of Pythia; 2) our progress on identifying promising data structures to represent the instrumentation search space across multiple data center stack layers (e.g., application and kernel), structures that must trade off compactness, exhaustiveness, and accuracy; and 3) algorithms to search this space quickly while staying under a specific instrumentation budget.
  • 3 participants
  • 24 minutes
debugging
debug
workflow
implemented
automated
instrumentation
distributed
infrastructure
problematic
leveraged

29 Nov 2019

Real-Time Application Maps for Proactive and Actionable Visibility - Aloke Guha, OpsCruise

Today’s observability provides volumes of time-series data, statistical trends including anomaly detection and correlational analyses. We argue that operations teams need an integrated and cohesive understanding of the application that maps interdependencies across microservices and dependencies on the orchestration and infrastructure services. We show that beyond metrics, logs, and traces, capturing configuration information is necessary for creating complete application maps that give deeper insights into application behavior. In addition, establishing a standard approach to capture the attributes of the complete application environment will enable automated detection and causal analysis of application problems. We will present some early findings on building real-time actionable application maps for cloud applications.
  • 5 participants
  • 26 minutes
ops
enterprise
capabilities
observability
challenges
visibility
proactive
deploy
entities
important

29 Nov 2019

Reliable Observability at Scale: Error Budgets for 1,000+ - Fred Moyer, Zendesk

Observability and reliability engineering have been on a convergent course for several years. Error Budgets joined the reliability lexicon of engineering organizations in 2016 with the release of the SRE book. The intersection of observability and reliability has largely been the domain of specialists for practical implementation. How can one democratize these techniques to put them in the hands of a thousand engineers at once?

At Zendesk we developed simple algorithms and practical approaches for implementing SLIs, SLOs, and Error Budgets at scale using a number of observability tools. This talk will show the approaches developed and how we were able to manage observability instrumentation across dozens of teams quickly in a complex ecosystem (CDN, UI, middleware, backend, queues, dbs, etc.).
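
As a rough illustration of the error-budget arithmetic the abstract refers to, here is a minimal Python sketch. It is not Zendesk's algorithm, which the abstract does not describe; the function name `error_budget`, its parameters, and the sample numbers are assumptions made up for this example. It assumes an event-based SLI (good events divided by total events) measured over a single window.

```python
def error_budget(slo_target: float, total_events: int, bad_events: int) -> dict:
    """Compute SLI and error-budget figures for one measurement window.

    slo_target   -- hypothetical target fraction of good events, e.g. 0.999
    total_events -- all events (requests) observed in the window
    bad_events   -- events that violated the SLI (errors, too-slow responses)
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events < 0 or not 0 <= bad_events <= total_events:
        raise ValueError("need 0 <= bad_events <= total_events")

    # SLI: fraction of good events; an empty window counts as fully good.
    sli = 1.0 if total_events == 0 else 1.0 - bad_events / total_events

    # The budget is the number of bad events the SLO tolerates in this window.
    allowed_bad = (1.0 - slo_target) * total_events
    remaining = allowed_bad - bad_events

    return {
        "sli": sli,
        "allowed_bad": allowed_bad,
        "remaining": remaining,
        "exhausted": remaining < 0,
    }


if __name__ == "__main__":
    # Assumed sample numbers: 99.9% target, one million requests, 400 failures.
    print(error_budget(0.999, 1_000_000, 400))
```

With these assumed inputs the budget is roughly 1,000 bad events, of which about 600 remain. A time-based budget (minutes of allowed unavailability) would follow the same shape with durations in place of event counts.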
  • 3 participants
  • 30 minutes
reliability
reliable
zendesk
monitoring
initiative
experts
000
observability
sres
budget

29 Nov 2019

SlackTrace: A New Tracing Tool - Suman Karumuri, Slack

Trace data contains very rich information about a request execution. However, current tracing tools only expose that information as a trace view or a service graph, which severely limits the questions we can ask of trace data and diminishes the utility of tracing. From past experience, we found that these limitations arise because, unlike logs or metrics, we can’t query raw trace data.

To query raw trace data easily, we designed a new span format called SpanEvent and built our tracing infrastructure, called SlackTrace, around it. In addition to presenting the trace data as a trace view and a service graph, the SpanEvent format allows us to query raw span data using SQL queries, which lets us derive rich insights from trace data that are not possible with existing tracing systems. In this talk, I will present the SpanEvent format and an overview of our SlackTrace infrastructure.
  • 6 participants
  • 34 minutes
tracing
hosts
monitoring
twitter
pin
project
annotation
log
endpoint
intuitive

29 Nov 2019

Testing in A Distributed Systems World - Fernando Mayo, Undefined Labs

While microservices are becoming the norm due to advancements in development, deployment and monitoring techniques in the last few years, we are still using the same testing methodologies we used for monolithic apps. In this talk, we look at how distributed tracing can be applied to testing modern, distributed applications, from unit to end-to-end tests, to continuously give developers invaluable insight on how entire applications behave, and when and why they fail, before they are deployed to production. We'll also discuss the power of distributed context propagation and how it can be leveraged for testing purposes, from safely testing in production to failure injection.
  • 5 participants
  • 30 minutes
testing
tests
debugging
troubleshooting
monitoring
analyzer
assess
labs
process
risks

29 Nov 2019

Tracing is for Everyone, Not Just Backend Engineers. (How Tracing Could Help Front-end Engineers to Build a Better UX) - Nina Stawski, Omnition

There's been a lot of talk about the importance of observability and tracing for microservice-based applications. The use cases involved are usually focused on backend engineers and DevOps. But what about us front-end engineers? We also want to know how things work. More often than not, we get blamed first when something breaks, and it is important to understand the whole application, not just the front-end.

Currently, observability is not the top concern for front-end engineers, and I will show why it should be. In many cases, even if the application speed cannot be changed significantly, you can apply little tricks and add microinteractions to improve the UX. Besides, emerging tooling in OpenCensus and OpenTelemetry is easy to configure, enriches the existing data and helps developers to correlate traces between backend and UI.
  • 4 participants
  • 22 minutes
backends
services
insurance
splunk
performance
analytics
fronting
startup
conference
micro

29 Nov 2019

When Connections are Magic: Understanding Performance in Serverless - James Burns, LightStep

Observability! Cloud Functions! APIs! What could go wrong?! While researching the performance of object storage APIs, there appeared to be custom runtime magic happening, leading to significant performance differences. Further research showed that it was *not magic* but led to even more questions.

Working with modern systems means network connections, many of them. Understanding how those connections impact your customer's experience can be difficult. Distributed tracing helps isolate what parts of the system are failing, but when only implemented at the RPC level, the reasons for and scope of network-induced issues can be lost. See how network-level insights can be integrated into distributed traces and hear how to effectively practice iterative observability, from the specific case of this research to a general framework for investigation.
  • 5 participants
  • 24 minutes
cloud
throughput
api
performance
services
observability
aws
backend
public
data