Machine data, sometimes called machine-generated data, is the digital information that is automatically created by the activities and operations of networked devices, including computers, mobile phones, embedded systems, and connected wearable products. In a wider context, machine data can also include information generated by websites, end-user applications, cloud-deployed programs, servers, etc.
Machine data is rarely modified by humans, but it can be collected and analyzed. Machine data is created automatically, usually either on a fixed schedule or in response to an event. Call logs from a telephony system, transaction records from an ATM, and network logs that record every IP address that pings a given server all qualify as machine data.
The quantity of data that is generated and stored by human beings has been increasing exponentially year over year for some time, spurred on by the proliferation of telematics technologies such as GPS, Wi-Fi and mobile data networks, along with Radio Frequency Identification (RFID) and the Internet of Things (IoT). As more enterprise organizations begin to leverage big data analytics and machine learning, there are growing opportunities to analyze machine data alongside other enterprise data types and identify new perspectives and insights that can drive business decisions.
Machine data covers data from a wide range of sources, with the basic criterion that the data was generated automatically by software with essentially no human involvement. As noted above, machine data can come from a variety of sources, including:
- Desktop computers, laptops, tablets, and mobile phones
- Servers and networks
- End-user applications
- Server- or cloud-deployed applications
- SIEM logs
- Financial transactions
We should also consider machine data sources that can be described as "data about data". If you programmed a software application to analyze some data and make a secondary calculation about it, the results of that calculation could be considered machine data. If you used a software tool to analyze a set of data and make a prediction about it, that prediction would be considered machine data. Finally, if you used a software tool to look at aggregated machine data and make a decision based on the results, that decision could be considered a piece of machine data.
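The three kinds of "data about data" described above (a secondary calculation, a prediction, and a decision) can be illustrated with a minimal Python sketch. The CPU samples, the function names, and the 75 percent threshold are all hypothetical, and the linear extrapolation stands in for whatever prediction method a real tool would use.

```python
from statistics import mean

# Hypothetical raw machine data: CPU utilization samples (percent) from one server.
cpu_samples = [40.0, 50.0, 60.0, 70.0]

def summarize(samples):
    """Secondary calculation: derive summary statistics from the raw samples."""
    return {"mean": mean(samples), "max": max(samples), "count": len(samples)}

def predict_next(samples):
    """Naive prediction: extend the average step between consecutive samples."""
    steps = [b - a for a, b in zip(samples, samples[1:])]
    return samples[-1] + mean(steps)

def decide(predicted, threshold=75.0):
    """Decision based on the prediction; the result is itself machine data."""
    return "scale_up" if predicted >= threshold else "no_action"

summary = summarize(cpu_samples)        # data about the raw data
prediction = predict_next(cpu_samples)  # prediction derived from the data
decision = decide(prediction)           # decision derived from the prediction
```

Each of `summary`, `prediction`, and `decision` was produced by software without human input, so each one counts as machine data in the sense used here.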
Automating tasks can result in the creation of machine data. A software tool that manages a manufacturing system might be able to issue commands to a machine on the manufacturing line before generating a status log that records whether the machine accepted and performed the command. The tool might also make decisions based on the result, like sending an automated alert to technicians when the status log indicates a malfunction.
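As a rough illustration of the manufacturing scenario, the sketch below simulates issuing a command, writing a status log entry, and raising an alert when the log shows a malfunction. The machine records, field names, and alert wording are invented for this example and do not describe any particular product.

```python
from datetime import datetime, timezone

def issue_command(machine, command):
    """Send a command to a simulated machine and return a status log entry."""
    accepted = command in machine["supported_commands"]
    performed = accepted and not machine["faulty"]
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "machine_id": machine["id"],
        "command": command,
        "accepted": accepted,
        "performed": performed,
    }

def alerts_for(status_log):
    """Build one alert message for each entry whose command was not performed."""
    return [
        f"ALERT: {entry['machine_id']} failed to perform '{entry['command']}'"
        for entry in status_log
        if not entry["performed"]
    ]

# Hypothetical machines on the line; the welder is marked faulty on purpose.
press = {"id": "press-01", "supported_commands": {"start", "stop"}, "faulty": False}
welder = {"id": "welder-02", "supported_commands": {"start", "stop"}, "faulty": True}

status_log = [issue_command(press, "start"), issue_command(welder, "start")]
alerts = alerts_for(status_log)
```

Both the status log entries and the alerts are generated without human involvement, so both are machine data.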
A final significant category of machine data is metadata. Metadata is data that is attached to an event to describe the conditions under which the event took place. For example, each time you take a picture with your phone camera, metadata about the photo is generated automatically, including the date when the picture was taken, the lens aperture, the exposure time, the GPS location and more.
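The photo example can be sketched in Python as follows. The `PhotoMetadata` class and `capture_photo` function are hypothetical names chosen for illustration; real cameras typically store this kind of information in a standard format such as EXIF, which this sketch does not attempt to reproduce.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass(frozen=True)
class PhotoMetadata:
    """Conditions under which a photo was taken (illustrative fields only)."""
    taken_at: str             # ISO 8601 timestamp
    aperture_f_number: float  # e.g. 1.8 for f/1.8
    exposure_time_s: float    # shutter time in seconds
    gps: tuple                # (latitude, longitude)

def capture_photo(pixels, aperture_f_number, exposure_time_s, gps, now=None):
    """Return the image payload with automatically generated metadata attached."""
    now = now or datetime.now(timezone.utc)
    meta = PhotoMetadata(now.isoformat(), aperture_f_number, exposure_time_s, gps)
    return {"pixels": pixels, "metadata": asdict(meta)}

# A fixed timestamp is passed in only to make the example reproducible.
photo = capture_photo(
    pixels=b"\x00\x01\x02",
    aperture_f_number=1.8,
    exposure_time_s=1 / 120,
    gps=(48.8584, 2.2945),
    now=datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
)
```

The user supplies only the act of taking the picture. Everything under `metadata` is filled in by the device.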
As machine-generated data became increasingly common, it was initially rare to see mid-market or enterprise organizations leveraging this large and valuable trove of data. Over time, third-party solution providers have created specialized software tools that allow businesses to process their machine data and put it to use. As a result, an increasing number of organizations are beginning to tap into their machine-generated data and use it to drive insights and action.
Machine data is processed according to a model called the DIKW hierarchy, which stands for Data, Information, Knowledge, Wisdom.
Machine-generated Data is raw and fact-based. It usually provides a simple record of an event or the value of a specific parameter at a given time. Machine data analytics tools are used to track the data over time and to correlate it with additional machine-generated data and data from other sources. The addition of context to the data answers questions like:
- "Whose activities are described by this data?"
- "Where did this data come from?"
- "What does this data represent?"
- "When was this data collected?"
Answering these questions contextualizes the data and turns it into Information.
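To make the Data-to-Information step concrete, the following Python sketch enriches one raw event with answers to the four questions above. The lookup tables, field names, and the `contextualize` function are assumptions made for this example, not part of any specific analytics tool.

```python
from datetime import datetime, timezone

# Hypothetical lookup tables that supply context for raw values.
IP_OWNERS = {"10.0.0.5": "checkout-service"}
STATUS_MEANINGS = {200: "OK", 503: "Service Unavailable"}

def contextualize(raw, source):
    """Turn a raw event into information by answering who, where, what, and when."""
    return {
        "who": IP_OWNERS.get(raw["src_ip"], "unknown"),
        "where": source,
        "what": STATUS_MEANINGS.get(raw["status"], "unrecognized status"),
        "when": datetime.fromtimestamp(raw["ts"], tz=timezone.utc).isoformat(),
    }

# Raw Data: a Unix timestamp, an IP address, and a status code with no context.
raw_event = {"ts": 1700000000, "src_ip": "10.0.0.5", "status": 503}

# Information: the same event, now described in terms a person can interpret.
info = contextualize(raw_event, source="edge-proxy access log")
```

The raw event on its own is just three values. After enrichment it says that a named service returned a particular error at a particular time, as recorded in a known log.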
The next step is to take the collected information and turn it into Knowledge. At the level of knowledge, we're starting to analyze, understand and develop insight into the relationships that exist within the data and what they tell us about the overall health of the system. Whether we're looking at the data from a service perspective (as with IT service intelligence software) or from a security perspective (as with IT security intelligence software), the goal is to use the data to make a concrete determination or prediction about something.
At the top of the pyramid, we've got Wisdom. To develop Wisdom, we have to take our developed knowledge and insights and apply them to the problem. Wisdom can be described as "knowing what to do and doing it".
Machine Data Analytics tools follow the basic DIKW framework to process machine data. First, the data is collected from a variety of sources on the network. Then, an AI application uses algorithms to sift through the data, identify trends and track changes. Next, the information is thoroughly analyzed and correlated across the system to generate new knowledge and insights. Finally, once the insights have been reported to the users, someone can take action on the insights to improve the status of the network.
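The four stages can be sketched end to end as a small Python pipeline. This is a simplified illustration under stated assumptions: the log sources, field names, and the error-count threshold are hypothetical, and a plain counting rule stands in for the AI-driven analysis a real platform would apply.

```python
from collections import Counter

def collect(sources):
    """Data: gather raw events from every source and tag each with its origin."""
    return [
        dict(event, source=name)
        for name, events in sources.items()
        for event in events
    ]

def to_information(events):
    """Information: count ERROR events per host."""
    return Counter(e["host"] for e in events if e["level"] == "ERROR")

def to_knowledge(error_counts, threshold=2):
    """Knowledge: hosts at or above the error threshold are judged unhealthy."""
    return sorted(host for host, n in error_counts.items() if n >= threshold)

def to_wisdom(unhealthy_hosts):
    """Wisdom: recommend an action for each unhealthy host; a person acts on it."""
    return [f"restart and investigate {host}" for host in unhealthy_hosts]

# Hypothetical events from two log sources on the network.
sources = {
    "app_log": [
        {"host": "web-1", "level": "ERROR"},
        {"host": "web-2", "level": "INFO"},
    ],
    "proxy_log": [
        {"host": "web-1", "level": "ERROR"},
        {"host": "web-2", "level": "ERROR"},
    ],
}

events = collect(sources)
actions = to_wisdom(to_knowledge(to_information(events)))
```

Correlating across both sources is what exposes the problem here: `web-1` shows only one error in each log, but two across the system, which reaches the threshold.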
Machine data is a hidden and underutilized resource for many organizations. The investigation and processing of machine data can drive a range of valuable capabilities, including:
- Operations Analytics: Ensure that key services are running at their expected capacity so you can provide the service levels your customers expect.
- Security Analytics: Capture machine data to proactively monitor your security posture and rapidly detect network intrusions and suspicious activity.
- Business Analytics: Use machine data to understand how users are interacting with software applications, generate new business intelligence and make data-driven decisions about which new features and bug fixes to prioritize.
Sumo Logic's industry-leading, cloud-native machine data analytics platform delivers cutting-edge capabilities in operations, security and business analytics. With Sumo Logic, your organization can capture and aggregate data from more than 150 applications, monitor and visualize network event logs, metrics and performance data, and rapidly identify and resolve potential security incidents. Sumo Logic is the best way to start squeezing the maximum value out of your organization's automatically generated machine data.