Two Categories of Big Data
Today, organizations generate more data in ten minutes than they did during the entire year of 2003. Because a year contains roughly 52,560 ten-minute intervals, 2013 stands to generate on the order of 50,000x the volume of data generated just ten years prior.
This rapid acceleration in the pace at which data is generated (Velocity) has created an explosion in the sheer quantity of data (Volume) that exists in the world today. Data is being generated and captured everywhere: in Internet activity, social media, IT infrastructure, sensors, energy meters, the human body, automobiles, and virtually everywhere else we look. We have reached a point where data growth has outpaced the ability of the International Bureau of Weights and Measures to define an appropriate metric prefix; the conversation has expanded beyond yottabytes.
Complicating this even further is the multitude of formats (Variety) in which this data is generated. Structured, unstructured, and semi-structured data are just the starting categories for describing the innumerable types of data that exist today. Data generated by custom applications, newly emerging technologies, sensors, and meters comes in all types, and frequently in a type never seen before.
Today’s Volume, Velocity, and Variety of data have given birth to the term “Big Data” and have spawned an entire industry revolving around how to create value from that data. This value falls broadly into two categories: analysis of past (stored) events and analysis of currently occurring (real-time) events. Below are some use cases of each:
Analysis of stored Big Data
- Clothing retailers looking through a customer database to determine behavior patterns and optimize store layout
- Grocers looking through loyalty program databases to analyze customer preferences and deliver personalized offers
- Biotech firms accelerating the pace of scientific research by reducing the time to analyze DNA sequence data
- Energy providers analyzing historical usage and patterns to properly allocate energy-generating resources
- Online retailers looking at past customer behavior to determine drop-off rates and analyze shopping cart abandonment
Analysis of real-time Big Data
- Online services organizations rapidly identifying and resolving application errors before any negative customer impact
- Fortune 500 brands visually monitoring their business and IT health through up-to-the-minute dashboards displaying infrastructure status, revenue, and customer activity
- Consumer brands receiving alerts based on a current spike of negative sentiment in social media channels
- Insurance companies capturing and analyzing real-time driver habits to set individualized policy premiums
- Financial services companies maintaining a strong security posture by continuously looking for anomalies in currently occurring network activity
From these use cases, it is clear that the two categories, analyzing stored data and analyzing real-time data, share the goal of extracting value from data. However, digging deeper, there are several significant reasons to treat the solutions required for each category as distinct.