Complete visibility for DevSecOps
Reduce downtime and move from reactive to proactive monitoring.
How have monitoring tools evolved over the years? That’s a big question, and one that few people are capable of answering based on personal experience. Monitoring software has been around in one form or another since the early years of computing, and few people who are active in the profession today were working then. (If you are one of those people, more power to you.)
In this post, we survey the history of monitoring tools. To be sure, there is no way to even start to do justice to this topic in a single blog post. That would take an entire book. However, in the paragraphs below, I’ll cover the basics of how monitoring software has evolved along with computer systems over the decades. If I forget to mention your favorite monitoring tool, mea culpa. (I’m sure that some last-one-standing die-hard out there still swears by the CP/M DDT X command…)
More than anything else, the real importance of the history of monitoring tools lies not in the story of any specific tool (as interesting as that may be in some cases), but in the overall course of that history—where it has taken the software development/deployment community, and where it is likely to lead. So in this post, we’ll take a big-picture look at monitoring tool history, and in the process, touch on some of the key points and highlights.
It wouldn’t be inaccurate to characterize the mainframe (and early minicomputer) era as The Age Of (Almost) No Monitoring. Operating systems typically had some internal monitoring services (for managing such things as multiple users and virtual memory). Software-based monitoring tools in the contemporary sense of the term were primitive, producing output that consisted of little more than core dumps (in the event of a crash) and logs (if you were lucky).
This is hardly surprising, since most systems were batch-oriented, with very little real-time input or output. They were also typically under the physical and operational control of a small group of trained technicians who knew how to interpret (and respond to) the output lights on the control panel, which served as a hardware-based system for monitoring basic functionality and health.
Unix was, needless to say, pivotal when it came to moving operating systems away from batch processing and into the interactive/real-time world. And not surprisingly, it is with Unix that many of the first basic monitoring commands and tools (such as top, vmstat, fuser, and syslog) became available. Since at least the early 1990s, such fundamental monitoring components have become a standard part of both Linux and Unix.
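The counters behind utilities like top and vmstat are exposed directly by the kernel, and even a script can read them. As a minimal sketch of that idea, the standard library's os.getloadavg() (available on Unix-like systems) returns the same 1-, 5-, and 15-minute load averages that top and uptime report:

```python
import os

def load_summary():
    """Return the 1-, 5-, and 15-minute load averages --
    the same kernel counters that top and uptime display."""
    one, five, fifteen = os.getloadavg()  # Unix-only call
    return {"1min": one, "5min": five, "15min": fifteen}

if __name__ == "__main__":
    print(load_summary())
```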
It was also during the ‘90s that interactive, real-time monitoring tools became a standard part of most desktop operating systems. Performance Monitor/System Monitor became a standard part of 32-bit (and later, 64-bit) Windows starting with NT 3.1. By the late ‘90s, graphical monitoring tools were also included in most Linux/Unix desktop environments.
The ‘80s and particularly the ‘90s also saw the development of network monitoring tools such as nmon, MRTG, and Big Brother. While desktop monitoring tools could generally afford to focus on a single system and a single user, network monitoring tools faced a broader challenge. They had to keep track not only of the performance of the network hardware and network management software, but also the activities of multiple users.
This meant monitoring the health and performance of multiple physical communication interfaces, as well as the server hardware and system resources, while at the same time providing the kind of traffic data that would allow the system to adequately manage the user load.
In the late ‘90s, most monitoring tools in use had been developed on the assumption that they were going to be used to monitor a local area network or the equivalent, with a relatively limited number of users in a closely managed environment.
By the beginning of the 21st century, however, it was becoming apparent that the monitoring needs of websites and Internet-based services were not the same as those of a typical office LAN. This led initially to the development of a generation of monitoring tools (such as Cacti, Nagios, and Zabbix) that supported standard Internet protocols, could be used on multiple platforms, were often quite scalable, and typically had Web-based interfaces.
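Tools of this generation popularized the check-plugin convention that Nagios made standard: a small program samples a metric and reports an exit code of 0 (OK), 1 (WARNING), or 2 (CRITICAL). Here is a minimal sketch of that convention; the metric and thresholds are illustrative, not from any particular plugin:

```python
def check_response_time(ms, warn=500.0, crit=2000.0):
    """Classify a response-time sample using Nagios-style
    status codes: 0 = OK, 1 = WARNING, 2 = CRITICAL."""
    if ms >= crit:
        return 2, f"CRITICAL - response time {ms:.0f}ms"
    if ms >= warn:
        return 1, f"WARNING - response time {ms:.0f}ms"
    return 0, f"OK - response time {ms:.0f}ms"
```

A real plugin would call sys.exit() with the status code so the scheduler could act on it; returning the code keeps the sketch testable.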
These tools, however, still generally focused on functional and performance metrics, with a strong emphasis on server and communication hardware and related issues. They extended the reach of older network monitoring tools, but they retained much of those tools’ basic nature. The first decade of the 21st century would see the growing need for a new kind of monitoring tool.
The basic challenge of the early 21st century was this: For more and more organizations, the Internet was no longer an alternate or optional outlet for doing business—it was now their main (and sometimes only) platform.
Along with all of the standard functional/performance issues (many of which could be and often were better handled by hosting services), the need arose to monitor a growing list of what were essentially business-related metrics. It was as important to know the sequence of traffic from one page (or an element within a page) to the next, the pattern of traffic over time, and the geographic source of that traffic as it was to know whether the server was handling the traffic adequately.
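Those page-to-page traffic sequences are easy to derive once requests are recorded per user. A toy sketch, assuming a simplified access log of (user, page) records in arrival order (the field names and pages are invented for illustration):

```python
from collections import Counter

# Invented, simplified access-log records: (user, page) in arrival order.
events = [
    ("u1", "/home"), ("u1", "/product"), ("u1", "/cart"),
    ("u2", "/home"), ("u2", "/product"),
]

def page_transitions(events):
    """Count page-to-page transitions per user -- the kind of
    business-oriented metric that server-centric monitors ignored."""
    last_page = {}
    transitions = Counter()
    for user, page in events:
        if user in last_page:
            transitions[(last_page[user], page)] += 1
        last_page[user] = page
    return transitions
```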
Even with functional monitoring, priorities were shifting. The failure or success of specific links could be crucial. Anomalous traffic on shopping cart or authorization pages could be the sign of a potentially catastrophic error—or a break-in. As business websites became stores, they had to be watched in the same way that you would watch a physical store, due to many of the same potential problems.
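One common way to flag anomalous traffic of this kind is to compare the latest sample against a recent baseline. A minimal sketch using a z-score test (the threshold of 3 standard deviations is a conventional choice, not a universal rule):

```python
from statistics import mean, stdev

def is_anomalous(samples, latest, z_threshold=3.0):
    """Flag a traffic sample that falls more than z_threshold
    standard deviations from the recent baseline."""
    mu, sigma = mean(samples), stdev(samples)
    if sigma == 0:
        return latest != mu  # flat baseline: any change is anomalous
    return abs(latest - mu) / sigma > z_threshold
```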
Both the nature and the volume of monitoring data changed as more business shifted to the Internet. More online customers and clients meant more customer data, and that growing quantity of data had to be analyzed, if it was to be of any use at all. Monitoring was becoming not just monitoring, but monitoring plus market-oriented analytics.
The next step, of course, was for online commerce to move to the cloud, and that is where we are today. The wholesale move into the cloud has radically transformed the nature of monitoring tools. In a cloud-based deployment, for example, there’s no need (and often no practical way) to monitor hardware-related issues (unless, of course, you’re the cloud service provider).
Performance is still important, but when you monitor performance in the cloud, you necessarily have to do so in the context of software and virtualized infrastructure. What you’re monitoring is strictly code performance, even at the infrastructure level.
Cloud platforms often provide a variety of monitoring tools and APIs for both functional/performance and market-related monitoring. For a typical high-traffic container-based website or application, this can result in a flood of monitoring data, and this flood is only going to grow larger.
In many ways, the real challenge of cloud-based monitoring has not been monitoring itself, but what to do with the data—how to sort it, organize it, analyze it, and present it in an easy-to-grasp manner to the people who have to make on-the-spot decisions based on that data.
Monitoring, in other words, is no longer just gathering and recording data. Monitoring is data aggregation, monitoring is filtering, monitoring is analytics, monitoring is decision-making, and monitoring is action. This is what services such as Sumo Logic provide—a clear pathway from raw data to rapid understanding and effective action. As the volume of monitoring data increases, aggregation, analytics, and dashboard tools will become not just a necessity, but a fundamental part of the software management toolkit.
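That aggregate → filter → analyze → act chain can be sketched in a few lines. The log format, service names, and threshold below are invented for illustration:

```python
from collections import Counter

# Invented log records: (service, level) pairs collected from many sources.
logs = [
    ("checkout", "ERROR"), ("checkout", "INFO"), ("checkout", "ERROR"),
    ("search", "INFO"), ("search", "ERROR"),
]

def error_hotspots(logs, threshold=2):
    """Aggregate error counts per service and surface the ones that
    cross a threshold -- raw data in, actionable shortlist out."""
    errors = Counter(svc for svc, level in logs if level == "ERROR")  # filter + aggregate
    return [svc for svc, n in errors.items() if n >= threshold]      # analyze -> act
```

A production pipeline adds time windows, dashboards, and alert routing on top, but the shape of the problem is the same.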
We’ve come a long, long way from core dumps, and the journey may be only just beginning.