Limitless data storage and the future of history

Charlie Stross posits that we’re rapidly approaching a future where data storage is so cheap that everything — everything — will be recorded for posterity: “The storage requirement for a video stream and two audio streams, plus GPS location, is only about 10,000 Gb per year — which will cost about £10 by 2017.” Such recordings, he argues, will be “a gold mine for historians” who will “be able to see the ephemera of public life and understand the minutiae of domestic life; information that is usually omitted from the historical record because the recorders at the time deemed it insignificant, but which may be of vital interest in centuries to come.”

In response, Cory Doctorow asks: “Once everyone and everything is recorded forever, what will historians do for a living?”

My answer is: the same thing they’ve always done.

What Charlie envisions is only quantitatively different from what historians face now. The only change would be the volume of historical data available for the historian to interpret.

Classical and medieval historians must necessarily deal with a limited textual record. With known and finite source material, they’ve had to hone their interpretative skills, drawing new conclusions from old texts. It’s probably not an accident that the leading lights of the Annales school were historians of the medieval and early modern (1450-1750) periods.

Modern historians, on the other hand, generally face more source material than they can deal with. Social and economic historians take a sample of old census records; cultural historians limit their focus to certain source materials; diplomatic historians work with a particular fonds. One of the critiques of professional history as it has latterly been practiced is that it has lost sight of the forest for the sake of the trees: all monographs and no synthesis.

What Charlie envisions is simply the apotheosis of something already under way: electronic historical records. Not only electronic, but machine readable. The social historian of the post-UNIX era won’t have to pay some grad student to enter a sample of handwritten records into a computer; he or she will be able to crunch the entire dataset. By the time that everything everyone sees and hears is recorded to crystal at the atomic level, as Stross suggests, I expect that visual and audio data will be machine readable as well.

You’d think that this would make the situation faced by modern historians even worse: even more data. But, ironically, the machine-readable nature of future historians’ data will make the sheer volume easier to process. If getting at the data is easier, it’ll also be easier to see the big picture. A huge amount of an historian’s time is simply spent finding and gathering data: eliminate this scut work and, well, you’ll put graduate students out of a job, but you might open up more time for interpretation and synthesis.

Which is what historians ought to be doing in the first place.