A few years ago, our UX team created personas for Sumo Logic. The intention with the personas was to capture the mindset of our different users and to create a common vocabulary throughout our organization. A salesperson could walk into a room with a marketing professional and a designer, and say that she’d just gotten off the phone with a Melinda, and everyone internally would know who Melinda is and how she feels when using Sumo Logic.
We began defining our personas with qualitative research, which involved meeting with many of our customers and combing through the reams of feedback that we receive. We asked our customers about their work lives: how they approached problem-solving, how they collaborated with their teams, and so on. We also synthesized NPS data and support tickets to see where customers were running into problems. From all of this qualitative data we derived three personas: Andre, Melinda, and Kathy. At the time, Sumo was primarily a tool for monitoring and troubleshooting infrastructure problems, and our personas reflected this.
Sumo also has a fair amount of data illustrating what our customers are doing in the product. During one of our biannual hackathons, a cross-functional team analyzed this data and ran a K-means clustering algorithm to see the most common use cases. These use cases dovetailed nicely with the personas that we had derived qualitatively.
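To make the clustering step concrete, here is a minimal sketch of K-means in plain Python. It is not the hackathon team's actual code: the two per-user features (searches run and dashboards viewed per week) and the sample numbers are hypothetical, and the farthest-point initialization is simply a choice that keeps the example deterministic.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def init_centroids(points, k):
    """Farthest-point initialization: start at the first point, then
    repeatedly add the point farthest from every centroid chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    return centroids

def kmeans(points, k, iterations=20):
    """Plain K-means; returns (centroids, labels)."""
    centroids = init_centroids(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

# Hypothetical features per user: [searches per week, dashboards viewed per week]
usage = [
    [40.0, 5.0], [38.0, 7.0], [45.0, 4.0],   # heavy query users
    [2.0, 1.0], [3.0, 2.0], [1.0, 1.0],      # occasional users
    [0.0, 12.0], [1.0, 15.0], [0.0, 10.0],   # dashboard viewers
]
centroids, labels = kmeans(usage, k=3)
print(labels)
```

In this toy data, the three groups loosely echo a frequent query user, an occasional user, and a dashboard viewer, which is the kind of correspondence with the personas that the clustering surfaced.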
In a nutshell, Melinda is the one who is responsible for the uptime of her organization’s app or site. She’s likely to have a DevOps or SRE title, and she also has a horizontal view of the app or site, knowing which components are connected to which. She’s typically calm in the case of an outage, and she looks at each outage as an interesting puzzle to solve. Melinda is also a frequent Sumo Logic user, and has high competency with the query language.
In contrast, Andre is likely to be focusing on one piece of his organization’s app or site. He’s more likely to have a developer or engineer title, and he has a vertical view of the app or site. He is aware of the overall architecture, but he is deeply familiar with only one or two areas. Andre is less likely to use Sumo frequently or to know the query language, and when he does need to use Sumo, he feels a bit more panic than Melinda does. His problems at that point are twofold: there’s an issue he’s troubleshooting, and he has to learn to use Sumo to properly troubleshoot it.
Kathy is our final persona. She represents our buyer, and she is likely a VP or in the C-suite. She is mostly concerned with the efficiency of Andre and Melinda, and with how they’re working together. Kathy may use Sumo occasionally, but she is more likely to examine dashboards for KPIs than to troubleshoot.
After socializing the personas, our next move in 2018 was to derive DevOps topologies. The theory was that different organizations would have different topologies, or ways of arranging their Andres and Melindas and Kathys. An enterprise with 20,000 people might arrange their teams of Andres and Melindas one way, whereas a start-up with five people might have one person representing both Melinda and Kathy. We were interested in determining these canonical topologies, and the goals of the exercise were:
After interviews with 25 customers, we’ve derived the following DevOps topologies.
In the Artisans & Soldiers model, we see the more traditional Dev and Ops silos. The Dev group is composed of Andres, whose job is to work as artisans and create features. The Ops group is all Melindas, who are tasked with ensuring reliability and troubleshooting when something breaks. They function as soldiers, protecting the castle of features that the artisans have created. There can be chafing between the artisans and the soldiers, as the artisans occasionally create features without reliability in mind, or the soldiers need more knowledge of the features to troubleshoot them. Melindas are heavy users of Sumo Logic, whereas the Andres use it infrequently and are more likely to use it for debugging. In this model, Melindas and Andres are relatively static: an Andre is unlikely to become a Melinda.
Artisans & Soldiers
The second topology is the Meteorologists & Farmers model. In this model, the Meteorologists are the first ones into Sumo Logic. These Melindas ingest data and create content, in the form of saved searches or dashboards. When creating this content, they are making a best guess at what their Andres will need down the line for monitoring and troubleshooting. As meteorologists, they may create dashboards that show temperature and precipitation. Then there’s a handoff to the Andres, who are the farmers. These Andres will need to grow crops, in the form of product features, and they’ll also need to use the content that the Melindas created in order to ensure the success of their crops. At times, Melinda’s guesses about content are wrong, and Andres need to augment the content to better suit their needs. This is the largest pain point within this topology: the handoff of content can be difficult, and Andres and Melindas both need to edit the content.
Meteorologists & Farmers
The final topology is the Librarian and the A-Team. The Librarian is a Melinda, who has a horizontal view of all the data within her organization. The Librarian is responsible for making sure all of this data is ingested into Sumo Logic, and then she organizes it for the A-Team. After that, the A-Team takes over. The A-Team is a mix of Andres and Melindas, and they are relatively confident and self-sufficient. They are responsible for both reliability and creating new features, and there’s fluidity between the personas, as Andres ramp up to become Melindas. The pain point within this topology is the data handoff. As this happens, the A-Team needs tools to quickly understand the data that’s been ingested. This allows them to begin monitoring and troubleshooting right away, as well as building content to support those activities.
Librarian & A-Team
After we determined these topologies and their associated pain points qualitatively, we began exploring ways to find an organization’s topology without additional data from them. If we could predict an organization’s topology, we could go into conversations with that customer with an idea of their pain points and a plan for ameliorating them. We could also structure our new product development around these topologies and pain points, especially if certain topologies were growing faster or were more valuable than others.
The gold standard for determining an organization’s structure is typically email. If you can see who is emailing whom, it’s possible to derive the organization’s hierarchy and get a sense for the teams within the organization. However, access to an organization’s email is not within our powers.
In lieu of email, we’re currently exploring two approaches. The first approach is determining organizational hierarchy through content shared in Sumo Logic, and the second is determining it through data about the application structure.
Sumo Logic recently released Content Sharing, a capability that allows Sumo users to share content objects with each other. They can grant view, edit, and manage permissions to other people within their organizations, and we have data on how customers were doing this. Tim Huang, an amazing intern, used this data to model the connectedness of many of our customers.
The graphs below illustrate two of our customers’ sharing habits. The Melindas are noted in green, and the Andres are in yellow. Each edge between nodes indicates that a dashboard was viewed more than twice in the past 30 days. Each organization also has a connectedness rating, defined as the number of users who have at least one edge (i.e., the number of connected nodes in the graph) divided by the total number of users in the organization.
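As a rough illustration of the connectedness rating, the sketch below builds edges from view counts and computes the ratio. The record layout `(sharer, viewer, views in the past 30 days)`, the user names, and the function names are all hypothetical; only the "more than twice" threshold and the ratio itself come from the description above.

```python
def build_edges(view_records, min_views=3):
    """Keep a (sharer, viewer) edge when the shared dashboard was viewed
    more than twice (i.e., at least `min_views` times) in the window."""
    return {
        (sharer, viewer)
        for sharer, viewer, views in view_records
        if views >= min_views and sharer != viewer
    }

def connectedness(users, edges):
    """Users with at least one edge divided by all users in the organization."""
    if not users:
        return 0.0
    connected = {user for edge in edges for user in edge}
    return len(connected & set(users)) / len(users)

# Hypothetical records: (sharer, viewer, views in the past 30 days)
view_records = [
    ("melinda_1", "melinda_2", 9),
    ("melinda_1", "andre_1", 4),
    ("melinda_2", "andre_2", 1),   # viewed too rarely to count as an edge
]
users = ["melinda_1", "melinda_2", "andre_1", "andre_2", "andre_3"]

edges = build_edges(view_records)
ratio = connectedness(users, edges)
print(f"connectedness = {ratio:.2f}")  # 3 of 5 users have an edge
```

Here three of the five users appear in an edge, so the organization's connectedness rating is 0.60.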
Based on Tim’s work, we’d expect to see certain patterns for our topologies. For the Artisans & Soldiers, we’d expect to see few edges between Andres and Melindas. Most edges should be between Melindas, and the connectedness ratio would be low.
For Meteorologists & Farmers, as well as for the Librarian & A-Team, we would expect to see much higher degrees of connectedness. In both of these cases, a lot of collaboration and dashboard sharing is happening between Andres and Melindas. The characteristics that distinguish these two topologies fall outside of the Content Sharing feature and depend on who’s ingesting data and who’s editing content.
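One way to turn these expected patterns into a first-pass check is sketched below. It counts edges by the personas they connect and flags an organization as looking like Artisans & Soldiers when both the share of Andre-to-Melinda edges and the overall connectedness are low. The persona labels per user, the function names, and both thresholds are assumptions made for illustration; they are not values we have validated.

```python
from collections import Counter

def edge_mix(personas, edges):
    """Count edges by the (alphabetically sorted) pair of personas they connect."""
    mix = Counter()
    for a, b in edges:
        mix[tuple(sorted((personas[a], personas[b])))] += 1
    return mix

def looks_siloed(personas, edges, max_cross_share=0.2, max_connectedness=0.5):
    """True when few edges cross between Andres and Melindas and few users
    are connected at all. Both thresholds are hypothetical."""
    if not edges:
        return True
    mix = edge_mix(personas, edges)
    cross_share = mix[("Andre", "Melinda")] / len(edges)
    connected = {user for edge in edges for user in edge}
    connectedness = len(connected) / len(personas)
    return cross_share <= max_cross_share and connectedness <= max_connectedness

# Hypothetical organization: three Melindas and four Andres.
personas = {
    "m1": "Melinda", "m2": "Melinda", "m3": "Melinda",
    "a1": "Andre", "a2": "Andre", "a3": "Andre", "a4": "Andre",
}
siloed_edges = [("m1", "m2"), ("m2", "m3"), ("m1", "m3")]
shared_edges = [("m1", "a1"), ("m1", "a2"), ("m2", "a3"), ("m3", "a4"), ("m1", "m2")]

print(looks_siloed(personas, siloed_edges))   # only Melinda-to-Melinda edges
print(looks_siloed(personas, shared_edges))   # mostly Andre-to-Melinda edges
```

A check like this can only separate Artisans & Soldiers from the other two topologies; telling Meteorologists & Farmers apart from the Librarian & A-Team would need signals about who ingests data and who edits content.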
Outside of examining how customers are sharing data, the other avenue that we’re exploring is whether application structure mirrors organizational structure. This hypothesis is based on Conway’s Law, which states that “organizations which design systems . . . are constrained to produce designs which are copies of the communication structures of these organizations.” In order for a software module to function, multiple authors must communicate frequently with each other. Therefore, the software interface structure of a system will reflect the social boundaries of the organization(s) that produced it, across which communication is more difficult.
For our purposes, this means doing qualitative research on whether Conway’s Law holds for our customers, and then examining whether we can collect enough information about our customers’ application structures to derive their organizational structure and DevOps topology.
We’ll post an update when we find the magic solution for determining an organization’s DevOps topology. In the meantime, please send along any thoughts you have.
Also, we’re always interested in speaking to our customers and we invite you to sign up for a UX research session.