Archive for May, 2010

The Data Singularity, Part II: Human-Sizing Big Data

Thursday, May 27th, 2010

“There are no more promising or important targets for basic scientific research than understanding how human minds… solve problems and make decisions effectively.” – Herbert Simon

In my previous post, I discussed the forces behind what I’m calling The Data Singularity. My basic thesis is that as information-generating processes become more frictionless, with humans excised from information read-write loops, the velocity and volume of data in the world are increasing at an exponential rate.

But where do we go from here? What are the consequences of living in an age where every datum is stored? Where are the bottlenecks, pain points, and opportunities? Which technologies are addressing these?

The upshot is this: a new class of tools is evolving for Big Data because traditional approaches can’t scale up. But these tools share a common goal: scaling down data, and making it human-sized. That’s the “reduce” part of MapReduce, the single statistic from an analysis, or the hundred-pixel line drawn from one hundred million events.
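To make the “hundred-pixel line” concrete, here is a minimal Python sketch of that kind of scaling down. It assumes the events are plain numeric timestamps; the function name downsample_counts and its parameters are hypothetical, chosen for illustration, and do not come from any particular tool.

```python
def downsample_counts(timestamps, start, end, buckets=100):
    """Reduce raw event timestamps to `buckets` counts, one per pixel column.

    Hypothetical helper for illustration: events outside [start, end)
    are ignored, and each remaining event increments exactly one bucket.
    """
    if end <= start or buckets <= 0:
        raise ValueError("need end > start and buckets > 0")
    width = (end - start) / buckets
    counts = [0] * buckets
    for t in timestamps:
        if start <= t < end:
            # min() guards against floating-point rounding at the right edge.
            index = min(int((t - start) / width), buckets - 1)
            counts[index] += 1
    return counts


def mean(values):
    """The 'single statistic' flavor of reduction: many numbers in, one out."""
    values = list(values)
    return sum(values) / len(values)
```

However many events go in, what comes out is a fixed number of values, small enough to draw as a sparkline or read at a glance. The same loop works over a generator, so the raw events never need to fit in memory at once.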

What’s happening today isn’t entirely new, though. There were echoes of it decades ago, when surveillance satellites first began scanning the globe.

VI. How Satellite Data Paralyzed the CIA

In the early 1970s, the CIA began relying more heavily on global satellite reconnaissance imagery for its intelligence operations. But according to one history, this massive, rich data didn’t accelerate the pace of US intelligence: it slowed it down.

Why? Because, confronted with this firehose, CIA leaders attempted to analyze every image and chase every half-formed hypothesis, simply because it was possible. The few good leads were washed out by the many mediocre ones. The CIA didn’t adjust its decision-making to this new scale, and it was drowned by it.

Many organizations are at a similar inflection point now, with access to massive, rich data about their customers or products. And, like the CIA in the 1970s, they find themselves paralyzed by the possibilities.

VII. People Still Pull the Big Levers

That Big Data paralyzes human decision-makers matters, because humans still make the big decisions. When someone praises a company as being “data-driven”, I’d like to imagine that this is literally true: that the company is nothing more than a few server racks blinking & humming away, slinging bits and earning money.

But no such company exists. What “data-driven” really means is that the executives & employees use data as inputs for making decisions. Companies may be data-fueled, but they’re people-driven.

VIII. Human-sizing Big Data: Filter & Crunch