Consider the following scenario: you are asked by your leadership to find dedicated time for threat hunting
activities within your network.

After some time, access to the shiny new tool of choice is granted and you are super excited to get
started. You log into the tool and are greeted with a lovely search bar; how do you proceed from here?

The tool presenting the blank search bar is undoubtedly powerful and feature packed. However, it is often
very difficult to chart a path forward and to harness these powerful features.

Let us put ourselves in the shoes of a SOC analyst who has been placed in front of this proverbial blank
search bar and has been given the task of finding threats proactively in the environment. Where would to begin to
tackle such a task?

The above question provides the basis for this post, guiding how to address the “blank search bar” problem.

Before diving into this topic, let us step back for a moment and provide some base definitions for what threat hunting means in the context of this blog
post.

What is threat hunting

Like many terms in the cybersecurity industry, the definition of threat hunting may change depending on the
context and who is providing the definition.

The seminal “TTP-Based
Hunting” paper by MITRE provides several early definitions of threat hunting:

A focused and iterative approach to searching out, identifying and understanding adversaries that have entered the defender’s networks

… the process of proactively and iteratively searching through networks to detect and isolate advanced threats that
evade existing security solutions.

the proactive detection and investigation of malicious activity within a network

A few key themes shake out from these definitions:

Iteration
Proactive
Detect and investigate

In plain language, rather than waiting for your EDR or tool of choice to flag malicious
activity, threat hunting warrants that “hunt teams” proactively search for and detect such threats where they are
investigated. The process is also iterative, meaning one hunt engagement can feed into another.

Threat hunting engagements can be kicked off through many “inputs” – be it a threat report, a hypothesis of
some kind, a newly released technique or just simply a hunch.

In this blog post, we will be focusing on hypothesis-based threat hunting, where we articulate a hypothesis
and aim to prove or disprove it using the data that are available to us.

Why the command line

Now that we have defined what threat hunting is, let us dive deeper into the command line and articulate
why this particular data item is a fantastic starting point for threat hunting journeys.

To do so, let us take a hypothesis-based threat-hunting approach to MITRE ATT&CK data.

Our hypothesis here is that by examining the data source component within the MITRE ATT&CK data set, we can
gain an idea of which data source applies to the most MITRE ATT&CK techniques and procedures.

In other words, we want to spend our valuable time looking at a data source that would – ostensibly –
surface the most threats.

We can begin crunching the MITRE ATT&CK JSON data with the following query:

Looking at the returned results, an interesting dynamic begins to bubble up:

We see that large chunks of our pie chart are represented by the “Command Execution” and “Process Creation”
categories – both of which have a command line data element.

Together, these represent almost thirty percent of the total data source components within the MITRE ATT&CK
framework.

It stands to reason then, that investing time and effort into hunting through these data sources is worth
the effort, as according to the data, they present the largest opportunity for finding malicious or at least
suspicious activity.

Of course, this analysis paints in very broad brush strokes and does not consider environmental specifics
which may warrant starting your threat hunting adventures with a different and perhaps more applicable data source.

Now that we have articulated what threat hunting is and why we focus on the command line, let’s dive into
some use cases!

Hypothesis One: Long command lines are malicious

Let us start with a simple hypothesis, that all long command lines in your environment are malicious.

We can come to this hypothesis by checking out any threat intel report, such
as this awesome overview of an IcedID infection by the DFIR Report.

We can start by getting a handle on our data with the following query:

 _index=sec_record_endpoint 
| length(commandLine) as command_line_length
| values(commandLine) by command_line_length
| sort by command_line_length desc

This query will search the normalized event index in the Sumo Logic Cloud SIEM platform and will display the length
of the command line, along with the command line value in descending order.

When prototyping your hunts, it is always nice to have a set of “malicious” data to test your hypothesis
against. This can mean running unit tests such as Atomic Red Team or knowing
when a penetration test or red team engagement took place so that you can ensure this date range is included in your
query.

In our lab environment, a long command line was executed as a “control” and utilizing a sixty-minute time
window, our results look like this:

This is awesome, because we now know that a command line that is over eight-thousand characters in length
is at least a little bit suspicious in our particular environment.

As a next step, we can create a Cloud SIEM
Signal that looks something like this:

We can then go ahead and configure
a custom insight to include this signal:

Hypothesis One Long command lines are malicious

The reason that custom insights are so powerful for threat hunting hypothesis testing is that they include
a comments and tag section and can be assigned to different members of the team.

Going back to our original search, once we broaden out the time frame of the query, we discover that our
hypothesis could be more solid.

Navigating back to the custom insight we looked at earlier, we can go ahead and add a comment, letting our
hunt team members know that this particular hunt needs some tweaking prior to becoming operationalized.

We can also go ahead and add relevant MITRE tags and assign the work to a particular user.

Now we have a centralized place to document our hunting efforts and track work progress, very cool!

Hypothesis two: Special characters in the command line are suspicious

Using tools like Daniel Bohannon’s Invoke-DOSFuscation we can test threat hunting
hypotheses that look at obfuscated characters in the command line.

For example, let’s say we wanted to obfuscate the “whoami” command in order to evade basic detections based
on this command line value.

Via Invoke-DOSFuscation, the command now looks something like this:

There are probably several ways that we can look for this activity, including a count of obfuscated
characters or even command line length like we covered.

Looking at another approach, we can also use qualifiers within our query to look for specific characters
which are usually not found in our command lines and flag only when all these conditions are met.

In query form, this looks like this:

_index=sec_record_endpoint AND metadata_vendor = "Microsoft"
| if(commandLine matches /(^)/,1,0) as carrot_match // match on a ^ character
| if(commandLine matches /(&)/,1,0) as concat_match // match on & character 
| if(commandLine matches /(%)/,1,0) as percent_match // match on % character
| where carrot_match = "1" and concat_match = "1" and percent_match = "1" // only return results if all three match
| fields carrot_match,concat_match,percent_match,commandLine

And our results:

We can also broaden our search out a little by looking at additional obfuscation characters and setting a threshold
for our match, something similar to:

_index=sec_record_endpoint AND metadata_vendor = "Microsoft"
| if(commandLine matches /(^)/,1,0) as carrot_match // match on a ^ character
| if(commandLine matches /(&)/,1,0) as concat_match // match on & character 
| if(commandLine matches /(%)/,1,0) as percent_match // match on % character
| if(commandLine matches /("")/,1,0) as quote_match // match on “” character
| if(commandLine matches /(;)/,1,0) as semicolon_match // match on ; character
| (carrot_match + concat_match + percent_match + quote_match + semicolon_match) as total_obfuscation
| where total_obfuscation >= 3
| fields commandLine,total_obfuscation

Once we are happy with our search, we can
follow the instructions to turn this scheduled search into a CSE Signal, once our execution occurs, a CSE
signal will trigger:

From here, we can go ahead and add this signal to our existing
insight or create a new custom insight to track our hunting activities.

Hypothesis three: I can detect mark of the web bypasses via the command line

At the start of this post, we crunched some MITRE ATT&CK data and saw how the command execution and process
creation data components link to more MITRE tactics and techniques than other data source components.

However, when perusing the MITRE page for “Mark of the Web” (MOTW) bypasses – as one does – a lightbulb
goes off!

We see that threat actors are packing their payloads within an ISO file to avoid having their payloads
tagged with the MOTW.

Our hypothesis here is that when Windows mounts an ISO file, it usually assigns it a drive letter other
than “C” and that when a file is executed from this drive, it would show up in our command line logs, and this is
something we can hunt for.

It should be noted that for this particular technique (T1553.005) MITRE only has “File” as a data source:

This is why the threat hunting process is critical to your overall security program and why the process is
iterative and proactive, as sometimes you need to go beyond existing frameworks to squeeze every ounce of value from
the data that you are attempting to wrangle.

In order to test our hypothesis and generate the relevant data, we can either craft a custom payload, or
use the T1553.005
atomic red team test.

For demonstration purposes, we will scope this particular hunt to Rundll32 executing a payload from an ISO
container of some kind.

Here, our detection logic will look similar to:

toLowerCase(commandLine) matches 
/("c:\windows\system32\rundll32.exe".)([^c]:\)/

Hypothesis four: I can detect enumeration across multiple operating systems via the command
line

In today’s networks, it is common to find a mix of operating systems in use.

Some folks may be using Macbooks and others Windows laptops with some Linux workstations thrown in the mix.

Threat hunting efforts utilizing telemetry stemming from three different operating systems can be a great
challenge, as the telemetry may come in different
formats with the command line values found in fields that are named differently across operating systems and
telemetry sources.

Sumo Logic’s Cloud SIEM product comes with various log mappings and parsers
that give users threat-hunting superpowers!

What this means in practice is that you are able to search for a command line value across multiple
operating systems as the field name is normalized.

Knowing that we can use one field name to search across multiple operating systems, we can craft a
hypothesis which states that discovery activity across multiple operating systems in a short period of time can be –
at least potentially – indicative of suspicious behavior.

Let’s look at a practical example of this, using the chain rule functionality in Sumo Logic
Cloud SIEM.

Our rule logic will look:

In the rule, we are looking for three distinct matches and are keeping things fairly simple, looking for
whoami execution on three different operating systems, the Microsoft vendor is for Windows logs, Laurel for Linux, and Jamf for macOS.

As a quick test, we can go ahead and issue the whoami command to our three systems, and a signal should
trigger:

Here we can see which hosts were involved, what users were involved, where the “whoami” processes spawned
from, as well as what command lines were used.

Hypothesis five: I can detect APTs with “whoami”

Starting again from MITRE ATT&CK and looking at the System Owner/User Discovery technique we notice an
interesting command line value within the procedure examples section:

What if we can flag every occurrence of this command in our environment?

This seems like a great idea and we prototype a quick search to validate our assumptions:

Unfortunately, a lot of results are returned and it is not clear which whoami is suspicious and which are
legitimate.

A certain command line value may or may not be malicious or suspicious on its own, but what if we can flag
a “whoami” execution from a user or machine that has not executed this command for a period of time?

This temporal element may be the missing piece of information that we need to prove or disprove our
hypothesis.

This use case lends itself well to our Cloud SIEM First Seen rule feature, which
removes the need for complex queries that look over huge amounts of data to mark first and last seen events.

Below is an example of this kind of rule,

Now, instead of alerting on a “whoami” execution, which can occur frequently in the environment, we add a
temporal and baseline element in order to increase the efficacy of the rule logic and raise either an alert or a
Signal which can be part of a broader Insight when this activity occurs for the first time in our environment since
the baseline period.

Hypothesis six: I can enhance my command line hunts with additional data

Earlier, when looking at our very first hypothesis, we posited that a long command line in our environment
is worthy of some kind of attention. We prototyped our hunt and found that in our environment, there is a mixed bag
of malicious and benign activity that utilize long PowerShell command lines.

Returning to our MITRE ATT&CK data source analysis, we noted that the command line features heavily in this
and takes up the number one and two positions. However, the “network traffic content” is a close third.

What if we could combine our command line and network traffic data?

We know that our long command line hunting logic works to find malicious activity, but we need to add some
parameters to this event in order to set it apart from more benign activity, we can do this by adding the following
parameters to our hypothesis:

Look for a long PowerShell command line
Followed by PowerShell making a network connection that is:
- Initiated (Outgoing)
- To an external IP address

Once again we can turn to Sumo Logic CSE
Chain Rules to accomplish this for us:

Once we perform some validation testing, our signal should trigger and look something like this:

This scenario neatly outlines how combining MITRE ATT&CK data source components, in addition to iterating
through threat hunting hypotheses – which includes execution, detection, rule modifications and iterations – all
combine in order to enhance the threat detection value provided by the telemetry that is being ingested by your
security tooling.

Conclusion

We began this blog by outlining the challenges that cyber security analysts and engineers face when being
tasked with crafting threat hunting engagements and activities for complex environments.

It is often difficult to decide where to begin such threat hunting activities, as there are so many
different strands of telemetry stemming from various cloud-based and on-premises systems. It is often very difficult
to find the proverbial needle in the haystack. Learn how a modern SIEM can help in our ultimate guide.

By examining the MITRE ATT&CK data closely, particularly the data source components, we can begin to
prioritize and organize our threat hunting efforts, spending more time on data points that are tied to the most
MITRE ATT&CK tactics and techniques. From here, we can drill down into the procedural level of these techniques and
look for this activity in our own environments.

The powerful cloud-native log analytics
platform offered by Sumo Logic provides us with the tools and features we need in order to take full advantage of
this extremely rich and complex data source, offering us normalization, UBA features, case management as well as an
extremely powerful search language that can all be used in tandem in order to find suspicious and malicious activity
in environments by utilizing the command line data source component.

Learn more about improving your security posture and threat
hunting with Sumo Logic.

BY SECURITY USE CASE

BY OBSERVABILITY USE CASE

BY INDUSTRY

BY COMPETITION

LEARN

ENGAGE

TRAIN

COMMUNITY

Threat hunting with Sumo Logic: The Command Line

Table of contents

What is threat hunting

Why the command line

Hypothesis One: Long command lines are malicious

Hypothesis two: Special characters in the command line are suspicious

Hypothesis three: I can detect mark of the web bypasses via the command line

Hypothesis four: I can detect enumeration across multiple operating systems via the command
line

Hypothesis five: I can detect APTs with “whoami”

Hypothesis six: I can enhance my command line hunts with additional data

Conclusion

Ten new and updated apps for securing and monitoring your environments

Kubernetes vs Docker: How to choose the right container solution?

Lessons from the 2025 Security Operations Insights report

BY SECURITY USE CASE

BY OBSERVABILITY USE CASE

BY INDUSTRY

BY COMPETITION

LEARN

ENGAGE

TRAIN

COMMUNITY

Threat hunting with Sumo Logic: The Command Line

Table of contents

What is threat hunting

Why the command line

Hypothesis One: Long command lines are malicious

Hypothesis two: Special characters in the command line are suspicious

Hypothesis three: I can detect mark of the web bypasses via the command line

Hypothesis four: I can detect enumeration across multiple operating systems via the command line

Hypothesis five: I can detect APTs with “whoami”

Hypothesis six: I can enhance my command line hunts with additional data

Conclusion

People who read this also enjoyed

Ten new and updated apps for securing and monitoring your environments

Kubernetes vs Docker: How to choose the right container solution?

Lessons from the 2025 Security Operations Insights report

Hypothesis four: I can detect enumeration across multiple operating systems via the command
line