Content outruns storage because humans store things capriciously and illogically
TECHSPLOITATION
I'd like to propose a version of Moore's law, only related to the expansion of information instead of the speed of processors. My new law goes like this: the amount of information in the world is always expanding faster than the data storage systems available to capture it.
You see what I'm getting at? Every time you invent a new, groovier kind of storage (the book, the reel of film, the cassette tape, the petabyte hard drive, the nanoflop crystal), the information that people are creating out in the world expands to elude it.
I don't mean to say that information, or what Web nerds call content, is literally too big to fit on all of our nifty storage devices. In fact, it's likely that all of the content in the world, including every phone call and movie, could fit onto existing computers. About 10 years ago an information studies professor named Michael Lesk wrote a terrific essay about the expansion of electronic storage in which he estimated that "soon after the year 2000 the production of disks and tapes will outrun human production of information to put on them. Most computer storage units will have to contain information generated by computer; there won't be enough of anything else."
He's probably right about that. But note that he also predicts that all of that extra storage space will be devoted to computer-generated information. Now it has come to pass that one of the greatest storage systems of our time, the Google server empire, does contain mostly computer-generated information in the form of indexes and queries and tons of other crap that isn't really human readable. Of course, it also contains the World Wide Web, which is mostly human readable and human created. And yet I'll predict that the machine information on Google's servers will be stored much longer than the human stuff.
What my law gets at (and let's call it Lesk's law, since he made me think of it) is the peculiar way the totality of human content expands, which makes it not technically impossible to store but practically impossible.
Consider that content expands in the way that humans do in geographical space. Look at all of the storage space available in a city and you'll see immediately that there is technically enough room for everyone in that city to have a room to sleep in. Yet some people have 50 rooms to sleep in, and some sleep 50 to a room. Or they don't get stored in rooms at all because they live in the park. Again, this isn't because there are literally not enough houses for them. It's a human peculiarity that we give lots of space to some people and very little to others.
So how does this apply to content storage? Well, just as Lesk predicted a decade ago, a lot of our content is now computer generated. And because that stuff tends to be the data that helps large companies make tons of money, it gets primo storage space. In addition, because that computer content is valuable, it gets backed up onto more storage, and that makes it easy to save over time. As for content created by your human friends on Facebook: well, if Facebook happens to lose that due to a server outage, there may not be a backup. That information is now lost, not because we couldn't store it but because we didn't maintain it in a stored state.
I'm not saying computer-generated content is prized over human content, though that does seem to be true in many cases. I could have made the same argument about content that is created by famous or influential people, which gets excellent storage in a variety of formats with plenty of backups.
What Lesk's law says is that content outruns our ability to store it. And now we see that's not because content expands exponentially over limited time but instead because content must be maintained in order to be considered truly stored. Content that exists only on one hard drive is stored temporarily, but nowhere near forever. So while it may be true that we have the technical capability to hold all of human knowledge in our future nanoflop crystals, we never will.
Content outruns storage because humans store things capriciously and illogically. Plus, we never make backups until it's too late.
Annalee Newitz is a surly media nerd who almost lost this column because she forgot to save it until she was nearly done.