    What is AIOps (artificial intelligence operations)?

    Digital transformation has driven a large-scale expansion of web-based services across hybrid cloud environments. This creates significant observability challenges for the IT operators and analysts responsible for application performance monitoring, security, operational efficiency and user experience.

    An effective AIOps platform helps facilitate IT infrastructure monitoring by collecting and aggregating data from the network without human intervention. Data sources include event log files from servers, applications and other network endpoints. Capturing data from multiple previously siloed sources and integrating them into a single database makes it easier for machine learning algorithms to assess network characteristics and performance in real time.
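    As a rough illustration of the aggregation step described above, the sketch below merges events from two previously siloed sources into one time-ordered timeline. The `Event` type, the `merge_sources` helper and the sample log entries are hypothetical and are not part of any particular AIOps product.

```python
import heapq
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    """Hypothetical normalized event record shared by all sources."""
    timestamp: datetime
    source: str
    message: str


def merge_sources(*streams):
    """Merge per-source event lists (each already sorted by time) into one timeline."""
    return list(heapq.merge(*streams, key=lambda event: event.timestamp))


# Sample data, invented for illustration only.
server_log = [
    Event(datetime(2024, 1, 1, 12, 0, 5), "server", "disk usage 91%"),
    Event(datetime(2024, 1, 1, 12, 0, 9), "server", "disk usage 95%"),
]
app_log = [
    Event(datetime(2024, 1, 1, 12, 0, 7), "app", "write latency 900 ms"),
]

timeline = merge_sources(server_log, app_log)
for event in timeline:
    print(event.timestamp.isoformat(), event.source, event.message)
```

    Once events share one schema and one timeline, an application-level symptom can be read next to the infrastructure events that surround it.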

    AIOps software can be configured to track specific service-level indicators (SLIs) for a given server or application. IT operators may conduct performance tests to establish a baseline for service-level objectives (SLOs) and define acceptable thresholds for the ones they intend to prioritize. When an SLO breach is detected, AIOps software can perform an automated root cause analysis to determine why the problem occurred and, if a solution is available, implement it to reduce the mean time to resolution (MTTR).
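    The relationship between SLIs, SLOs and breach detection can be sketched as follows. This is a minimal example under stated assumptions: the `Slo` class, the metric names and the threshold values are illustrative, not recommended targets or a vendor API.

```python
from dataclasses import dataclass


@dataclass
class Slo:
    """Hypothetical SLO: a named SLI plus the threshold it must satisfy."""
    name: str
    threshold: float
    higher_is_better: bool

    def is_breached(self, sli_value: float) -> bool:
        # Availability-style SLIs breach when they fall below the threshold;
        # latency-style SLIs breach when they rise above it.
        if self.higher_is_better:
            return sli_value < self.threshold
        return sli_value > self.threshold


def availability(successes: int, total: int) -> float:
    """SLI: fraction of successful requests (1.0 when there was no traffic)."""
    return successes / total if total else 1.0


# Assumed objectives and measurements, for illustration only.
slos = [
    Slo("availability", 0.999, higher_is_better=True),
    Slo("p95_latency_ms", 300.0, higher_is_better=False),
]
measured = {
    "availability": availability(9985, 10000),
    "p95_latency_ms": 240.0,
}

breached = [slo.name for slo in slos if slo.is_breached(measured[slo.name])]
print(breached)
```

    In this sample the availability SLI (99.85%) misses its 99.9% objective while latency stays within bounds, so only the availability SLO would trigger a root cause analysis.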

    AIOps software tools support the incident management process by automating the response to routine alerts, significantly reducing the time IT operators spend on mundane, low-value tasks. AIOps tools can also feed machine-enriched data and analyses directly into incident management processes, driving IT improvements for end users. More recently, generative AI tools promise to significantly increase the value and effectiveness of an AIOps platform by summarizing actionable insights from predictive analytics, anomaly detection, root cause analysis and automated remediation.

    Real-time processing
    Real-time data processing helps balance the performance optimization requirements of ITOps against the countermeasures managed by security analysts. With artificial intelligence, enterprise IT organizations can ingest and analyze large volumes of data at scale and in real time. As a result, these organizations can identify anomalies and respond more quickly to security events picked up by their AIOps tool.

    Rules and patterns
    Artificial intelligence tools use rule application and pattern recognition algorithms to detect network events that warrant a response. They may even use machine learning algorithms that allow them to develop their own rules for detecting network anomalies based on training data sets. Rules and patterns are used to distinguish between network activity that is considered “normal” and that which is deemed “anomalous” to accelerate decision-making.
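    To make the distinction between a fixed rule and a learned rule concrete, here is a minimal sketch. It assumes a simple statistical baseline (mean plus three sample standard deviations of the training data) as a stand-in for a machine learning model; the function names, the sample values and the static limit of 500 are hypothetical.

```python
from statistics import mean, stdev


def learn_threshold(training_values, k=3.0):
    """Derive an upper bound from 'normal' training data: mean + k sample std devs."""
    return mean(training_values) + k * stdev(training_values)


def classify(value, static_limit, learned_limit):
    """Apply the hand-written rule first, then the rule learned from data."""
    if value > static_limit:
        return "anomalous (static rule)"
    if value > learned_limit:
        return "anomalous (learned rule)"
    return "normal"


# Requests per second observed during normal operation (invented sample).
training = [100, 102, 98, 101, 99, 100, 103, 97]
learned = learn_threshold(training)

for observed in (101, 120, 600):
    print(observed, classify(observed, static_limit=500, learned_limit=learned))
```

    The learned bound sits much closer to typical traffic than the static limit, so it flags the moderate spike that the hand-written rule alone would let through.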

    Domain algorithms
    Domain algorithms are specific to an industry or IT environment, and their contents and structure are dictated by an IT organization’s unique goals and data. These algorithms define the specific operational goals that will be prioritized by artificial intelligence.

    Artificial intelligence and machine learning
    Artificial intelligence and machine learning are the defining feature of AIOps. In AIOps technology, artificial intelligence implementations are geared towards the intelligent analysis of large volumes of data, using mathematical models that correlate and parse machine data to produce histograms, charts and other visualizations.
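    As a small illustration of turning raw machine data into a histogram, the sketch below parses latency values out of log lines and buckets them. The log format, the `latency=<n>ms` field and the 100 ms bucket width are assumptions made for this example only.

```python
import re
from collections import Counter

BUCKET_MS = 100  # assumed bucket width
LATENCY = re.compile(r"latency=(\d+)ms")  # assumed log field format


def latency_histogram(lines, bucket_ms=BUCKET_MS):
    """Count log lines per latency bucket, keyed by the bucket's lower bound."""
    counts = Counter()
    for line in lines:
        match = LATENCY.search(line)
        if match:  # lines without a latency field are skipped
            bucket = int(match.group(1)) // bucket_ms * bucket_ms
            counts[bucket] += 1
    return dict(sorted(counts.items()))


# Invented sample log lines.
logs = [
    "GET /cart latency=42ms status=200",
    "GET /cart latency=87ms status=200",
    "POST /pay latency=130ms status=200",
    "POST /pay latency=610ms status=500",
    "health check ok",
]

hist = latency_histogram(logs)
for start, count in hist.items():
    print(f"{start:>4}-{start + BUCKET_MS - 1:<4} ms | {'#' * count}")
```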

    Sumo Logic is a cloud-native, multi-tenant platform that helps IT teams quickly arrive at data-driven decisions that reduce the time to investigate and remediate security and operational issues. Sumo Logic’s Observability platform is built from the ground up as an integrated portfolio of capabilities for monitoring (what happened), diagnosis (where it happened) and troubleshooting (why it happened) across disparate telemetry and powered by our entity backend. Use Sumo Logic to:

    Collect and centralize – more than 175 integrations make aggregating data across the tech stack and down the telemetry pipeline easy. Sumo Logic is working toward a unified collection model that fits the OpenTelemetry standard.

    Monitor and visualize – customizable dashboards align teams by visualizing logs, metrics and performance data for full-stack visibility and reliable delivery.

    Search and investigate – real-time analytics to rapidly identify and resolve potential issues, detect and prevent breaches, and reduce compliance costs.

    Alert and notify – machine-learning algorithms work 24/7 to send alerts when an important event occurs or a problem needs fixing.

    With Sumo Logic’s patented artificial intelligence technologies, LogReduce and LogCompare, IT organizations can aggregate large volumes of logs, events and time-series metrics, identify and predict anomalies in real time, and deliver crucial security and operational data to where it can be used to guard against data breaches and optimize the customer experience.

    Learn more about artificial intelligence for log analytics in our guide.