One of the perils of the migration to digital format for books, magazines and newspapers is the threat it poses to future generations of researchers. In fact, Jerome P. McDonough, assistant professor in the Graduate School of Library and Information Science at the University of Illinois at Urbana-Champaign, warns that if the current trend continues we could be headed for what he calls a “digital dark age.” The problem is accessibility. Think about trying to play a VHS tape when there are only DVD players around. Several generations from now, much of the data we produce could be lost to inaccessibility. And there’s a lot of digital content, some 369 exabytes at last count.
This has very real implications for sociologists, and not just those of us interested in digital culture. A wide range of cultural products (think music or film) and large data sets (think GSS or the census data) are vulnerable to being lost in the “black hole” of inaccessibility. Part of the problem is proprietary software. Remember WordPerfect? Perhaps you don’t, but it was a word-processing software product popular about ten years ago. Today, no one’s using it and few people have heard of it. If you get a file that’s saved as a WordPerfect document, chances are you won’t be able to open the file, and whatever content is in there is effectively lost. McDonough argues that part of the solution to the threat of a “digital dark age” is open source software. So, for example, if more people used OpenOffice (an open source word processing program) instead of Microsoft Office’s proprietary “Word” program, digital content would be less vulnerable to unintentional loss.
That’s only part of the solution, however, as digital content is also vulnerable to deliberate erasure:
“E-mail is a classic example of that,” he said. “It runs both the modern business world and government. If that information is lost, you’ve lost the archive of what has actually happened in the modern world. We’ve seen a couple of examples of this so far.” McDonough cited the missing White House e-mail archive from the run-up to the Iraq War, a violation of the Presidential Records Act.
The power to erase content, and along with it important parts of the historical record, is not new. This is something that sociologist Poulantzas warned about thirty years ago in his Political Power & Social Classes. The difference with digital content is that this sort of information-power-move is much easier to accomplish. Of course, some are already taking note of this threat and working on preservation through a variety of digital collections, and sociologists would be wise to pay attention to this trend as well.
Comments 4
Jon Smajda — October 29, 2008
The format issue is a big pet peeve of mine, especially within sociology, where journals want submissions in Microsoft's proprietary Word format despite free, open alternatives like OpenOffice and, even better, plain text formats like LaTeX or, for that matter, HTML. I've not quite resorted to Richard Stallman-esque measures such as refusing to open Word attachments, but it is an important issue that's going to bite a lot of us eventually. I was cleaning out my old files a while back and found a bunch of my undergrad papers written in Claris Works (an old Mac word processor) that are basically unrecoverable now. (This is no big deal in this case, but imagine if I could shift my age & career path 10 years forward: I'd have all my graduate work, which I do want to keep, locked into a proprietary format.)
This "ease of data destruction" argument is a bit overblown though. I keep everything I can on my computer instead of in paper files, and I keep both local and off-site backups. My house could flood or burn down and I wouldn't lose any important data (including things like family photos, etc.). I'm not saying most people take these steps, but...it's a bit silly to talk about deliberate erasure of files (which can be replicated easily & infinitely) when the good old paper shredder or fire could take care of almost any data not so long ago.
Jessie — October 30, 2008
Claris! I lost some files to that format as well. ;-) There are some real archiving issues with digital data that we just don't have good solutions for yet. Open source is a good first step, but the house-fire scenario that you raise actually brings up another issue I didn't address in this post. Most of us (99.9% as a rough estimate) are vulnerable to losing all of our digital data if there were a catastrophic event like a fire. Take for example all those iTunes songs and collections. If you had all those on an old format, such as CD, and you lost them in a fire, you'd be able to collect the insurance money to cover the loss and get them replaced. However, insurance companies don't cover digital data - including music. I've talked to several friends who work in the insurance industry and they've described this as a "black hole" of coverage. So, unless you have all your digital data saved to a remote/offsite location, you're up for a big nasty surprise...you know, on top of your house burning down (god forbid and all that).
patchinprov — November 30, 2008
You can still open WP documents in Word. There's no guarantee that 'open source' software is going to be readable on 'closed' hardware in the future. Look at all the open source music formats. The proprietary format mp3 is the one supported by most current devices. The same issue came up during the Millennium: how can a time capsule survive? The NYTimes had a good set of options: a single revered object preserved by a museum or a cheap universally discarded item (such as a Coke can). My guess is that digital information from the 1960s to late 1980s is vulnerable, but with consolidation/monopolization of software and hardware since then by large corporations (Microsoft, Apple, Sony, etc.) there are enough junk copies to survive for the next century.
It would be nice if the ASA made Contexts affordable to the public, that way more hardcopies would be in circulation.
As for the Bush administration...well some things are better lost to the ages.
Jon Smajda — November 30, 2008
@patchinprov: The key point is that if files are in open source formats, assuming the documentation for those formats survives, then the barriers to them being read at some point in the distant future are technical, not legal. My point about plain text as the best format for preserving your work stands apart from even this though: no matter what happens with computing in the next few decades, computers will be able to read ASCII text. That's as safe a bet as there is with the future of computers.
"It would be nice if the ASA made Contexts affordable to the public, that way more hardcopies would be in circulation."
Believe me when I say we agree & we're working hard on this problem.