Is there anything to be learnt from Harvard's attempt to preserve its digital assets?
08 June 2015
Last week I read with interest on e-Science News that Harvard University has begun the extensive task of preserving its rapidly ageing digital materials. Higher education is one of many areas in which this issue is gaining traction, as banks of research data continue to grow while the digital formats on which the material is stored begin to age.
In 2014 I was delighted to hear the Vatican discuss its intentions to digitally preserve the contents of its library, and so it is encouraging to also hear Harvard, another world-renowned institution, describing the task of ensuring that its assets “live on” as “one of the most pressing issues in preservation science”. After all, if any university is going to take the task of preserving its digital assets seriously, it’s going to be Harvard.
The Harvard libraries and archives contain an immense volume of digital information that has been gathered over several decades, and is therefore currently stored in hundreds of different formats that are quickly becoming outdated. When this digital material first began to enter libraries in the 1980s on floppy disks and tapes, it was largely logged and tucked away as simply a growing collection of artefacts, and so a substantial amount of data may not have been accessed for 30 years, let alone archived or converted to a sustainable format.
As a result, the Harvard librarians are now scrambling to move this data from the quickly ageing formats on which it is currently held to a modern medium that we can be confident will still be accessible in the near future, and understandably so. A recent study by Timothy Vines found that, with every passing year, the odds of a data set published in the last 22 years being retrievable fall by 17 per cent. The degradation of a dated piece of digital data is therefore a process that may not occur for several years, but can suddenly and rapidly take irreversible effect; a looming threat that is driving the librarians’ urgent work.
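To give a rough sense of how a 17 per cent yearly fall in odds compounds, here is a minimal Python sketch. It assumes the fall applies to the odds (not the probability) and is constant from year to year. The starting odds of 9 to 1 and the function names are purely illustrative assumptions of mine, not figures or methods taken from the study.

```python
def odds_after(years: int, initial_odds: float, yearly_drop: float = 0.17) -> float:
    """Odds of a data set being retrievable after `years`,
    assuming the odds shrink by `yearly_drop` every year."""
    return initial_odds * (1.0 - yearly_drop) ** years


def odds_to_probability(odds: float) -> float:
    """Convert odds (for example 9.0 for '9 to 1') into a probability."""
    return odds / (1.0 + odds)


if __name__ == "__main__":
    # Hypothetical starting point: odds of 9 to 1 (a 90% chance) at publication.
    for years in (0, 5, 10, 20):
        chance = odds_to_probability(odds_after(years, initial_odds=9.0))
        print(f"{years:>2} years after publication: {chance:.0%} chance of retrieval")
```

Whatever starting odds are chosen, the compounding means the chance of retrieval shrinks every year the data is left untouched, which is why the age of the material matters so much.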
However, I am concerned that Harvard is so intent on ensuring that this data is retrievable today that it may be failing to fully explore the need to archive and preserve it in a way that will be secure for generations to come. The Vatican is reportedly preserving its digital assets using open-source, non-proprietary software with the specific aim of ensuring that the data is still accessible in 50 years. I feel that a similar solution should also be the next stage in Harvard’s plan.
It is important to remember, though, that Harvard is not the only university facing the task of rescuing digital material from dated formats, and indeed that universities are not the only institutions tackling this issue. For example, only 50 per cent of American films shot before 1950 were expected to survive past the year 2000, and it is believed that around 80 per cent of silent movies made in the 1910s and 1920s have now been lost, largely through neglect. Vint Cerf, a vice-president of Google, is concerned that without a rise in awareness of the importance of correctly preserving digital materials, future generations will have little record of the 20th Century and will enter “a digital dark age.”
Harvard needs to investigate the use of storage facilities that will ensure all of its data is accessible in 10, 25 or even 50 years’ time, so that this wealth of knowledge can be professionally preserved to the highest possible standard, and stored in such a way that it can all be returned quickly, easily and in exactly the same condition in which it was left. The efforts currently being made by Harvard’s librarians to stop the decay of this unique digital data are an excellent first step, but effectively planning a long-term archiving strategy today, similar to that being devised by the Vatican, is the only way to be certain that digital material will be safe for the use of future generations.
Nik Stanbridge is director of marketing at Arkivum.