Pricing Login
Back to blog results

January 23, 2013 By David Andrzejewski

Beyond LogReduce: Refinement and personalization

LogReduce is a powerful feature unique to the Sumo Logic offering. At the click of a single button, the user can apply the Summarize function to their previous search results, distilling hundreds of thousands of unstructured log messages into a discernible set of underlying patterns.

While this capability represents a significant advance in log analysis, we haven’t stopped there. One of the central principles of Sumo Logic is that, as a cloud-based log management service, we are uniquely positioned to deliver a superior service that learns and improves from user interactions with the system. In the case of LogReduce, we’ve added features that allow the system to learn better, more accurate patterns (refinement), and to learn which patterns a given user might find most relevant (personalization).


Users have the ability to refine the automatically extracted signatures by splitting overly generalized patterns into finer-grained signatures or editing overly specific signatures to mark fields as wild cards. These modifications will then be remembered by the Sumo Logic system. As a result, all future queries run by users within the organization will be improved by returning higher-quality signatures.


Personalized LogReduce helps users uncover the insights most important to them by capturing user feedback and using it to shape the ranking of the returned results. Users can promote or demote signatures to ensure that they do (or do not) appear at the top of Summarize results. Besides obeying this explicit feedback, Sumo Logic also uses this information to compute a relevance score which is used to rank signatures according to their content. These relevance profiles are individually tailored to each Sumo Logic user. For example, consider these Summarize query results:

Results before feedback

Since we haven’t given any feedback yet, their relevance scores are all equal to 5 (neutral) and they fall back to being ranked by count.


Now, let’s pretend that we are in charge of ensuring that our database systems are functioning properly, so we promote one of the database-related signatures:

Results after promote

We can see that the signature we have promoted has now been moved to the top of the results, with the maximum relevance score of 10. When we do future Summarize queries, that signature will continue to appear at the top of results (unless we later choose to undo its promotion by simply clicking the thumb again).

The scores of the other two database-related signatures have increased as well, improving their rankings. This is because the content of these signatures is similar to the promoted database signature. This boost also will persist to future searches.


This functionality works in the opposite direction as well. Continuing our running example, our intense focus on database management may mean that we find log messages about compute jobs to be distracting noise in our search results. We could try to “blacklist” these messages by putting Boolean negations in our original query string (e.g., “!comput*”), but this approach is not very practical or flexible. As we add more and more terms to our our search, it becomes increasingly likely that we will unintentionally filter out messages that are actually important to us. With Personalized LogReduce, we can simply demote one of the computation-related logs:

Results after demote

This signature then drops to the bottom of the results. As with promotion, the relevance and ranking of the other similar computation-related signature has also been lowered, and this behavior will be persisted across other Summarize queries for this user.

Implicit feedback

Besides taking into account explicit user feedback (promotion and demotion), Summarize can also track and leverage the implicit signals present in user behavior. Specifically, when a user does a “View Details” drill-down into a particular signature to view the raw logs, this is also taken to be a weaker form of evidence to increase the relevance scores of related signatures.


The signature refinement and personalized relevance extensions to LogReduce enable the Sumo Logic service to learn from experience as users explore their log data. This kind of virtuous cycle holds great promise for helping users get from raw logs to business-critical insights in the quickest and easiest way possible, and we’re only getting started. Try these features out on your own logs at no cost with Sumo Logic Free and let us know what you think!

Complete visibility for DevSecOps

Reduce downtime and move from reactive to proactive monitoring.


Sumo Logic Continuous Intelligence Platform™

Build, run, and secure modern applications and cloud infrastructures.

Start free trial

David Andrzejewski

David Andrzejewski is a senior engineering manager at Sumo Logic, where he works on applying statistical modeling and analysis techniques to machine data such as logs and metrics. He also co-organizes the SF Bay Area Machine Learning meetup group. David holds a PhD in Computer Sciences from the University of Wisconsin–Madison.

More posts by David Andrzejewski.