Principal Technical Product Manager
In one of my previous blogs I explained how important it is for a modern observability platform to give “the observers” full, flexible access to all raw telemetry. Observability’s promise to find unknown unknowns relies directly on the ability to perform fast, powerful, multidimensional, high-cardinality analysis of raw data and uncover previously unknown patterns that have not yet been captured as a metric, dashboard panel, alert, or anomaly event.
Cloud-native and serverless go hand in hand. One of the initial motivations to move business workflows to the cloud was to cut the cost of provisioning infrastructure and to gain the elasticity that on-demand allocation of resources offers. The serverless approach takes this to the next level: infrastructure is provisioned only for the duration of code execution, and the whole stack below the executed code, including application components, the OS, and (of course) the hardware, is provided by the cloud vendor. It is no surprise that this approach is gaining more and more traction, although the idea itself is nothing new. A minimal sketch of the model follows below.
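To make the model concrete, here is a minimal sketch of a serverless function in the AWS Lambda style. The handler name, event shape, and greeting logic are illustrative assumptions, not a real workload; the point is that the function body is the only code you own, while everything beneath it exists only while the function runs.

```python
import json

# A minimal AWS Lambda-style handler. Only this function is ours; the
# runtime, OS, and hardware below it are provisioned by the cloud vendor
# per invocation and billed only for the time the code actually executes.
def handler(event, context):
    # 'event' carries the request payload; its shape here is an assumption.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```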
Over a year ago we decided to invest heavily in Application Observability, understanding that a modern observability platform must unite logs, metrics, and traces in one analytics layer to better serve reliability use cases. We have also advocated the growing trend of acquiring tracing data via open-source industry standards such as OpenTelemetry, without vendor lock-in. The sketch below illustrates what that vendor neutrality looks like in practice.
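As a quick illustration, here is a minimal OpenTelemetry SDK setup sketch in Python (it assumes the `opentelemetry-sdk` and `opentelemetry-exporter-otlp` packages are installed, and the collector endpoint URL is a placeholder). Because the instrumentation depends only on the open standard, switching analysis backends means repointing the exporter, not re-instrumenting the application.

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# Standard OpenTelemetry SDK wiring: the application depends only on the
# open-source API and SDK, not on any particular analytics backend.
provider = TracerProvider()

# The endpoint below is a placeholder assumption; pointing it at any other
# OTLP-compatible backend requires no change to the instrumentation code.
exporter = OTLPSpanExporter(endpoint="https://collector.example.com:4317")
provider.add_span_processor(BatchSpanProcessor(exporter))
trace.set_tracer_provider(provider)
```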
In almost every blog you read about monitoring, troubleshooting, or, more recently, the observability of modern application stacks, you have probably come across a statement that complexity grows as the demand for elasticity increases, which makes managing these applications increasingly difficult.
It’s been almost a year since I shared some thoughts about distributed tracing adoption strategies on this blog. We discussed how log vendors and application performance management (APM) vendors take different approaches in the market, and how important it is to let users analyze the data, including custom telemetry, the way they want.
We at Sumo Logic believe in an open, flexible, community-driven approach to collecting observability data. The reasons are outlined in one of my recent blogs, where I share the belief that application observability gains traction from the fact that telemetry signals are designed, composed, and produced by the application developer or vendor in compliance with industry standards, rather than being a proprietary, black-box component of the monitoring vendor. A short sketch of such developer-authored telemetry follows below.
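To show what developer-produced telemetry looks like under an open standard, here is a hedged sketch of manual span creation with the OpenTelemetry Python API. The service name, span name, function, and attributes are illustrative assumptions; what matters is that the developer, not the monitoring vendor, decides what gets recorded.

```python
from opentelemetry import trace

# The developer chooses the tracer name, span names, and attributes;
# the identifiers below are illustrative, not a prescribed schema.
tracer = trace.get_tracer("checkout-service")

def process_order(order_id: str, amount: float) -> None:
    with tracer.start_as_current_span("process-order") as span:
        # Custom business attributes travel with the span, and any backend
        # that speaks the standard can analyze them without proprietary agents.
        span.set_attribute("order.id", order_id)
        span.set_attribute("order.amount", amount)
```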
This week Sumo Logic announced our new Observability Suite, which includes the general availability of our distributed tracing capabilities as part of our Microservices Observability solution. The new solution provides end-to-end visibility into user transactions across services, as well as seamless integration with performance metrics and logs to accelerate issue resolution and root-cause analysis. In this blog, we’ll explore the new solution in detail.
I have been spending a considerable amount of time on distributed tracing topics recently. In my previous blog, I discussed the pros and cons of various approaches to collecting distributed tracing data. Now I would like to draw your attention to the analysis backend: what does it take to be good at analyzing transaction traces? As mentioned in the blog above, one of the most important outcomes of adopting open-source tracing standards is the freedom to choose the right analysis backend, as long as it supports those standards. So, what is the requirement list for a distributed tracing backend? What should it do, and what are the absolute must-haves? We have looked at many free, open-source, and commercial offerings on the market and found a few tools that are good here and there, but none that fully matched the complete list.