Thursday, September 25, 2008

Digital Document Drama

There is a danger in our reliance on digital documents. Storing and retrieving information from documents and media that are created entirely by computer, in digital format, is a problem that library scientists and application developers are constantly working to solve.

Think about this. If you create a document that embeds resources and information from other locations and sources, what happens to that document if those resources and locations are unavailable?

A very simple example would be a blog post that uses an image from another web site. If that other web site is busy, down or has moved, your blog post can no longer display that image. A more complex example would be an internal company document that draws data and figures from other servers on the company network in order to provide formula-driven information in real time.

How do you archive a document like that? Save it as static and version control it? And how much documentation really needs to be saved or archived?
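Before you can archive a document like that, it helps to know what it depends on. Here is a minimal sketch in Python that lists the embedded resources in an HTML page that live on some other web site. The class and function names, the sample page and the example.com addresses are all made up for illustration, and it only looks at a handful of common tags, so treat it as a starting point rather than a complete inventory.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ExternalResourceFinder(HTMLParser):
    """Collect URLs of embedded resources hosted on a different site than the page."""

    # Hypothetical, deliberately short list: tag name -> attribute that embeds a resource.
    EMBED_ATTRS = {"img": "src", "script": "src", "iframe": "src", "link": "href"}

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.page_host = urlparse(page_url).netloc
        self.external = []

    def handle_starttag(self, tag, attrs):
        wanted = self.EMBED_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative addresses against the page's own address.
                absolute = urljoin(self.page_url, value)
                if urlparse(absolute).netloc != self.page_host:
                    self.external.append(absolute)


def find_external_resources(html_text, page_url):
    """Return the embedded resource URLs that point at another host."""
    finder = ExternalResourceFinder(page_url)
    finder.feed(html_text)
    return finder.external


if __name__ == "__main__":
    # Made-up blog post with one local image and one borrowed from another site.
    sample = """
    <p>My post</p>
    <img src="/images/local.png">
    <img src="http://other.example.org/photos/disk.jpg">
    """
    page = "http://myblog.example.com/2008/09/post.html"
    for url in find_external_resources(sample, page):
        print(url)
```

Anything this turns up is a piece of the document that will vanish if the other site goes away, so those are the files to copy and save alongside the static version.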

Of course, you rarely think about these things until you encounter a problem. My problem was that I assigned my class an overview paper to read, one I delivered as part of a panel at the 2nd Annual World Wide Web Conference in Boston in October of 1994. All they would need to do to access it would be to search for it on Google or find it in the archival servers of the web site.

The problem came up when the only things Google returned for a search on it were citations from other papers. One web page listed all the presentations from the conference with links to the papers, but the links returned the deadly “404 file not found” error when clicked. The NCSA web site no longer had the information from that conference, which, quite frankly, shocked me.

Not only had I assigned a reading that was not accessible, but I was upset that it was no longer publicly available. No problem, right? I would obviously have an electronic version myself from when I created it in the first place. Sure, 14 years ago. My current computer is not that old, thank you. So I went home and looked first to see if I had diligently copied it from one computer to the next as they replaced each other on the home office front. It was not in my documents folder. It was not in my old documents folder, and it was not in my old documents from old computers folder.

So I went out to the garage and opened up the storage box of keepsakes from the days of Free Range Media. In a plastic disk holder box I found a 3.5 inch disk that had conference documents listed on the paper label.

I crossed my fingers that it was a disk format that would be easily read, and I held my breath that I had written it in a word processor format that could be opened by my current office application.

Thank goodness I had sent it to the conference organizers in HTML format. Primitive, clunky HTML, mind you, with tags that were probably typed in by hand before the file was saved as text.

It opened.

The bio is from the first year of Free Range, before I took over as President, but I am not going to update anything. I did check spelling and grammar and made an edit or two. I have now republished "Publishing in the New Mass Medium: Creating Content on the Internet" to the web, where the students can access it through the generous online document storage and access capabilities provided by Google.

For more information on protecting and storing documents for the long haul, here is a nice article from Storage Magazine.

image from Offsite Data Depot


WritersHairClip said...

I tend to save things on my igoogle documents. My online storage of photos from myspace albums endured three computer changeovers so hooray for that. My mom recently bought another gigantic external hard drive that can store her business on the computer just in case. Maybe look into that!

scott1223 said...

"However, my favorite figure, favorite because I love big numbers, comes from testimony before the US House of Representatives, Committee on Science, Space and Technology on March 23, 1993 by Vinton Cerf. Cerf estimates 100 million users in the foreseeable future."

My, how the stats have changed.

1,463,632,361 users in the world as of today.
