The AWS Cloud is no longer the future of information technology infrastructure, but rather a present-day reality. As data volumes continue to grow, organizations around the world are avoiding building, and in some cases actively closing down, on-premises datacenters, because paying the total cost of ownership for such environments is becoming an unwieldy, or at the very least inefficient, use of capital. This trend can be observed in the increasingly rapid adoption of cloud services over recent years. According to the new Worldwide Semiannual Public Cloud Services Spending Guide from International Data Corporation (IDC), worldwide spending on public cloud services will grow at a 19.4% compound annual growth rate (CAGR), almost six times the rate of overall IT spending growth, from nearly $70 billion in 2015 to more than $141 billion in 2019.
Some organizations worry about losing visibility into their workloads when moving to the cloud. The reality is that when companies migrate to the AWS Cloud, they have the opportunity to leverage cloud-native services and tools that were designed specifically for the agility and scalability of the cloud, avoiding excessive cost, lengthy implementations, and the need for additional internal IT resources to manage the platforms and the hardware. An example of this would be logging and monitoring services that were frequently considered too expensive or time consuming to utilize in an on-premises environment. Because the scalability of the AWS Cloud allows you to spin up new instances on demand and leverage pay-as-you-go pricing, logging and monitoring has become not only more affordable, but more foundational than ever. Since logging and monitoring on AWS is less expensive and simpler to implement than on-premises, it is easier than ever to have complete coverage of your environment, meaning you don’t need to miss out on any data.
Related to logging and monitoring, one area of opportunity is machine data analytics. Consider the following AWS services:
- Amazon Simple Storage Service (Amazon S3) – A secure, durable, and highly scalable cloud storage service
- Elastic Load Balancing (ELB) – An AWS service that automatically distributes incoming application traffic across multiple Amazon Elastic Compute Cloud (Amazon EC2) instances
- Amazon CloudFront – A global content delivery network (CDN) service that accelerates delivery of your websites, APIs, video content, and other web assets
- AWS CloudTrail – A web service that records AWS API calls for your account and delivers log files to you
- Amazon Virtual Private Cloud (Amazon VPC) Flow Logs – An AWS feature that enables you to capture information about the IP traffic to and from network interfaces in your VPC
These, and other AWS services, generate machine data in the form of log files and time-series metrics that can be analyzed in real time to improve visibility and mitigate security risk. Amazon CloudWatch (a monitoring service for AWS Cloud resources and the applications on them) aggregates these logs for high-level monitoring and alerting in AWS workloads. AWS Partner Network (APN) Advanced Technology Partner and AWS Security Competency Partner Sumo Logic applies advanced analytics and machine learning to logs and time-series metrics, allowing organizations to gain real-time, full-stack visibility into cloud and hybrid environments.
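To make the idea of log-based machine data concrete, here is a minimal Python sketch that splits Amazon VPC Flow Logs records into named fields and picks out rejected traffic. It assumes the default space-delimited record layout (version 2); custom flow log formats would need a different field list. The sample records, account ID, and interface ID are invented for illustration, and the `parse_flow_log` helper is hypothetical. It is not how CloudWatch or Sumo Logic process these logs.

```python
# Field order assumed to match the default (version 2) VPC Flow Logs format.
FIELDS = (
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
)


def parse_flow_log(line: str) -> dict:
    """Split one space-delimited flow log record into a field-name -> value dict."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


# Made-up sample records for illustration only.
sample = [
    "2 123456789010 eni-0a1b2c3d 172.31.16.139 172.31.16.21 20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK",
    "2 123456789010 eni-0a1b2c3d 172.31.9.69 172.31.9.12 49761 3389 6 20 4249 1418530010 1418530070 REJECT OK",
]

# Keep only the records where the traffic was rejected.
rejected = [r for r in map(parse_flow_log, sample) if r["action"] == "REJECT"]
for record in rejected:
    print(record["srcaddr"], "->", record["dstaddr"], "port", record["dstport"])
```

All values are left as strings here; a real pipeline would likely convert ports, byte counts, and timestamps to numeric types before analysis.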
Sumo Logic does not require instrumentation and easily captures machine data from AWS. It pulls log files from a variety of AWS services, including AWS CloudTrail and Amazon VPC Flow Logs, and centralized metrics from Amazon CloudWatch to provide continuous intelligence. This continuous intelligence can help companies accelerate the building, running, and securing of modern applications and enables them to achieve greater visibility into their workloads compared to an on-premises environment. Sumo Logic also supports cross-functional collaboration by correlating data from multiple data sources, showing data in the context of time-series metrics, thereby providing a common source of truth for monitoring and troubleshooting.
The Importance of Machine Data Analytics
Machine data is data generated automatically by the activity of a computer, application, or device. This machine-generated data often comes in the form of logs and can contain immensely valuable insights about the application or infrastructure and its health. The biggest problem with harnessing machine data is the sheer volume of data being generated. Raw machine data contains billions, if not trillions, of log and metric data points and is increasing in quantity at an exponential rate. The volume and velocity of this data growth can be difficult for single-tenant analytics solutions to handle. Additionally, machine data can come in a variety of formats and can be structured, unstructured, or semi-structured:
- Structured data refers to data that resides in a fixed field within a file, such as a field in a relational database or a time-series metric such as CPU utilization. Structured data can be easily stored, retrieved and analyzed.
- Unstructured data refers to all those things that cannot be easily classified such as streaming data, videos, images, blogs, and wikis.
- Semi-structured data is a cross between the two. It lacks the strict data model of structured data but has tags or other markers that help you identify certain elements. Log files are a good example of semi-structured data.
With this in mind, it is important to use a data analytics platform optimized to handle all types of machine-generated data, including custom metrics.
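
The difference between semi-structured and structured data can be sketched in a few lines of Python. The log line below is invented, and its layout (a leading timestamp followed by `key=value` markers) is only one of many conventions found in real logs. The `parse_line` helper is a hypothetical illustration of how such markers let you lift a log line into a structured record, not a description of any particular product's parser.

```python
import re

# Matches key=value pairs; a value is either a double-quoted string or a bare token.
PAIR = re.compile(r'(\w+)=("([^"]*)"|\S+)')


def parse_line(line: str) -> dict:
    """Turn one semi-structured log line into a flat dict of named fields."""
    # Assumption: the first space-delimited token is the timestamp.
    timestamp, _, rest = line.partition(" ")
    record = {"timestamp": timestamp}
    for key, raw, quoted in PAIR.findall(rest):
        # Strip the surrounding quotes when the value was quoted.
        record[key] = quoted if raw.startswith('"') else raw
    return record


# Made-up log line for illustration only.
line = '2016-08-01T12:00:00Z level=ERROR service=checkout msg="payment declined" latency_ms=231'
record = parse_line(line)
print(record["level"], record["service"], record["msg"])
```

Once the fields are named, a value such as `latency_ms` can be treated like any other structured metric, for example by converting it to an integer and charting it over time.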