Eternal Bits - How Can We Preserve Digital Files and Save Our Collective Memory

ID: Smith (2005) PDF: (afstuderen:Smith (2005) Eternal Bits - how Can We Preserve Digital Files and Save Our Collective Memory.pdf|PDF)

===== Summary =====

While far from perfect, emulation as an approach to preservation has generated some creative ideas. One offshoot is the Universal Virtual Computer. In 2000, Raymond Lorie and his research team at IBM’s Almaden Research Center, in San Jose, Calif., proposed writing a program that would act like a computer (complete with instruction set, architecture, operating system, and related utilities) to render any particular file format. So instead of copying files from an old format to a new one or rewriting emulations to match newer computer systems, programmers would just rewrite the Universal Virtual Computer as computer platforms evolve, and everything else would just work.
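To make the layering concrete, here is a toy sketch. It is not Lorie's actual design: the instruction names (`READ`, `DUP`, `PUSH`, `ADD`, `EMIT`, `JNEG`, `JMP`, `HALT`), the `run` function, and the made-up "old format" (every character code shifted up by one) are all assumptions for illustration. The point is only that the decoder is stored as a program for the virtual machine, so `run` is the one piece that would need rewriting on a new platform.

```python
# Toy illustration of the Universal Virtual Computer idea (hypothetical design).
# The format decoder is data: a program for a tiny stack machine.

def run(program, data: bytes) -> str:
    """Interpret `program` over the input bytes and return the emitted text."""
    stack, out, pc, pos = [], [], 0, 0
    while True:
        op, *args = program[pc]
        pc += 1
        if op == "READ":      # push next input byte, or -1 at end of input
            stack.append(data[pos] if pos < len(data) else -1)
            pos += 1
        elif op == "PUSH":    # push a constant
            stack.append(args[0])
        elif op == "DUP":     # duplicate the top of the stack
            stack.append(stack[-1])
        elif op == "ADD":     # pop two values, push their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "EMIT":    # pop a character code and output it
            out.append(chr(stack.pop()))
        elif op == "JNEG":    # pop a value; jump if it is negative
            if stack.pop() < 0:
                pc = args[0]
        elif op == "JMP":     # unconditional jump
            pc = args[0]
        elif op == "HALT":
            return "".join(out)
        else:
            raise ValueError(f"unknown instruction: {op}")

# Decoder for the made-up format: subtract 1 from every byte.
DECODER = [
    ("READ",),      # 0
    ("DUP",),       # 1
    ("JNEG", 7),    # 2: end of input -> halt
    ("PUSH", -1),   # 3
    ("ADD",),       # 4
    ("EMIT",),      # 5
    ("JMP", 0),     # 6
    ("HALT",),      # 7
]

stored = bytes(c + 1 for c in b"bits")   # a file in the "old format"
print(run(DECODER, stored))
```

In this sketch the archive would keep `stored` and `DECODER` together; neither changes when the host platform does.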

  * It took two centuries to fill the U.S. Library of Congress in Washington, D.C., with more than 29 million books and periodicals, 2.7 million recordings, 12 million photographs, 4.8 million maps, and 57 million manuscripts. Today it takes about 15 minutes for the world to churn out an equivalent amount of new digital information.
  * It does so about 100 times every day, for a grand total of five exabytes annually.
  * Despite the ease with which we can create digital data, our capacity to make all these bits accessible in 200 or even 20 years remains a work in progress. (Problem: the fourth dimension)
  * [[DSpace]] … open-source software application that not only accepts digital materials and makes them available on the Web but also puts them into a data-management regime that helps to preserve them for generations to come.
  * books, journals, maps, music, movies, e-mail, corporate records, government documents, course materials, data sets, and databases. It also covers scientific models, lab notebooks, parish records, family histories, and global weather data.
  * DSpace is storing and preserving materials just like these at MIT and 100 other organizations worldwide.
  * DSpace has a growing group of committed programmers distributed across the globe who continually maintain and improve it.
  * Consider the ultimate goal. Is our objective to be able to read, hear, or play something a hundred years from now? Or is it to prove provenance in a court of law (for instance, that certain clinical test data really are the original ones we based a particular drug on)? The former is hard, but the latter is really hard.
  * authentic
  * ensure that nothing’s been changed since the item was stored.
  * Data preservation is really a mix of the simple and the complex challenges.
  * “save the bits” approach.
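The "ensure that nothing’s been changed since the item was stored" note is commonly handled with a checksum (fixity) check: record a digest at deposit time, recompute it later, and compare. A minimal sketch, assuming SHA-256; the helper names `sha256_of` and `is_unchanged` are hypothetical, and these notes do not say which mechanism DSpace itself uses.

```python
import hashlib

def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()

def is_unchanged(data: bytes, stored_digest: str) -> bool:
    """True if the bytes still hash to the digest recorded at deposit time."""
    return sha256_of(data) == stored_digest

# At deposit time: record the digest alongside the item (made-up sample data).
original = b"clinical test data, run 1"
stored = sha256_of(original)

# Years later: re-read the bits and compare against the recorded digest.
print(is_unchanged(original, stored))                      # untouched copy
print(is_unchanged(b"clinical test data, run 2", stored))  # altered copy
```

A matching digest only shows the bits are the same as at deposit time. Proving provenance in court also needs evidence that the recorded digest itself was not replaced, which is part of why the note calls that case "really hard".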
  * Usually, saving the bits using standard, well-documented data, video, and image formats, such as XML, MPEG, and TIFF, gets you halfway.
  * Avoid formats that require proprietary software, such as AutoCAD or QuarkXPress, to play or render the data.
  * [[Microsoft Word]] is so popular that we can expect such migration programs to emerge from third parties.
  * The DSpace input process is simple, to ensure that people actively contribute to the archive.
  * whether she wants to assign a Creative Commons License to it. This license gives other researchers the right, among other things, to include the article as part of their course readings or quote it in their own scholarly writings, without asking for her explicit permission.
  * Among other things, it assigns a “date.available” value to the metadata record of the item, storing and indexing the metadata in a database and making the article available on the DSpace Web site.
  * A few days later, the article will start to appear in scholarly indexes and Web search engines like Google.
  * make a second copy of the article in Adobe PostScript and a third in plain ASCII text, using currently available software tools to do the format conversions.
  * Now imagine that a few years have passed, and Adobe announces that it has developed a new format, “PDG,” and will no longer sell tools to process PDF documents.
  * … The curator then runs a query in DSpace to find all the PDF files in the archive, acquires or creates a program to automatically convert PDFs into PDGs, and runs the conversion.
  * Just as biodiversity is good for the natural environment, different digital preservation policies and strategies are good for the preservation environment.
  * But to ensure that we don’t wind up with a digital Tower of Babel, we need to agree to use open, published standards, such as XML, TIFF, PDF, and MPEG.
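The PDF-to-“PDG” scenario boils down to three steps: query the archive by format, convert each hit, and keep the original bits next to the new copy. A rough sketch with an in-memory stand-in for the archive; `Item`, `find_by_format`, `migrate`, and `fake_pdf_to_pdg` are hypothetical names, not DSpace API, and “PDG” is the article's imaginary format.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Item:
    """Hypothetical archive item: an id plus one byte string per format."""
    item_id: str
    copies: Dict[str, bytes] = field(default_factory=dict)

def find_by_format(archive: List[Item], fmt: str) -> List[Item]:
    """Return every item that holds a copy in the given format."""
    return [item for item in archive if fmt in item.copies]

def migrate(archive: List[Item], old_fmt: str, new_fmt: str,
            convert: Callable[[bytes], bytes]) -> int:
    """Add a new_fmt copy to each item that has old_fmt; keep the original."""
    migrated = 0
    for item in find_by_format(archive, old_fmt):
        if new_fmt not in item.copies:        # skip items already migrated
            item.copies[new_fmt] = convert(item.copies[old_fmt])
            migrated += 1
    return migrated

def fake_pdf_to_pdg(data: bytes) -> bytes:
    """Stand-in converter; a real one would be acquired or written."""
    return b"PDG:" + data

archive = [
    Item("article-1", {"pdf": b"pdf bytes", "txt": b"plain text"}),
    Item("dataset-7", {"xml": b"<data/>"}),
]
count = migrate(archive, "pdf", "pdg", fake_pdf_to_pdg)
print(count, sorted(archive[0].copies))
```

Keeping the old copy is a design choice in this sketch: if the converter turns out to be lossy, the original bits are still there to migrate again.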

=== Links ===

  * Internet Archive’s Wayback Machine
  * LOCKSS program to preserve digital content at