Back to the future - National Archives and Microsoft announcement
Remember Windows 3.11? Office 3.0? Still got any floppies or DAT tapes gathering dust on your bookshelf?
Over the last few months I've been in a bit of a time warp... one minute Silverlight... then the next in DOS 6 and considering the implications of getting access to digitally born documents and applications from 10+ years ago. How do documents get stored and migrated to ensure they can be read in the future? How do you make sure those documents can be viewed exactly as they were intended/created? What about the lifecycle of that document: the change history, annotations, embedded fonts, ...? Quite a minefield!
Thankfully the guys I've been working with at The National Archives live and breathe this stuff! They have the responsibility to conserve the nation's paper-based and digital heritage and to make it accessible to those who want to view it. Phew!
Anyone remember the Domesday Project 1986 laser discs that Blue Peter buried in their garden? All was well... they used the latest and greatest technology, presuming it had a strong future... and then came the nightmare of trying to find a laserdisc reader for a BBC Micro just a few years later. That sums up the problem for me!
The announcement we just made with the National Archives is trying to address the issue of digital conservation head-on. With billions of documents in the world wrapped up in proprietary document formats (from Microsoft and many, many other vendors), we felt it was important to focus on how we can help the body in the UK which has the biggest headache, and do what we can to assist them in:
- Migrating documents to the latest Office format (Open XML) via our document conversion tools to ensure they can be accessed by the public in the future
- Ensuring legacy documents can be viewed as accurately as possible when compared to their original
- Determining the best way to migrate certain documents
- Understanding what version of tools a document was created in so a conversion process can be automated
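To give a flavour of that last point, here is a minimal sketch of how a first-pass format check might look. It is not the tooling used at the Archives, just an illustration: it only looks at the first few bytes of a file to tell the OLE2 compound file container (used by the binary Office formats from the late 1990s onwards) from a ZIP package (the container Open XML uses). The function names `sniff_container` and `sniff_file` are made up for this example, and a real identification process would need to look much deeper to pin down the exact application version.

```python
# Hypothetical helper names; only the two byte signatures are real.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE2 compound file header
ZIP_MAGIC = b"PK\x03\x04"                       # ZIP local file header


def sniff_container(header: bytes) -> str:
    """Classify a file by the container signature in its leading bytes."""
    if header.startswith(OLE2_MAGIC):
        return "ole2-compound-file"
    if header.startswith(ZIP_MAGIC):
        return "zip-package"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read just enough of the file on disk to check its signature."""
    with open(path, "rb") as handle:
        return sniff_container(handle.read(8))
```

Anything that comes back as "unknown" (which would include many of the really old formats) would still need a person, or a much richer signature list, to decide which conversion route to take.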
To support these aims we evaluated the key Office and Windows combinations that have shipped and looked at some of the typical types of documents in the archive. We built a set of Virtual PC 2007 virtual hard drives containing those O/S and Office versions and made them available to the folks at the Archives to use in their on-going document conversion process.
[Gordon Frazer demonstrating the VPC library]
At the press launch, I demonstrated the new National Archives Virtual PC 2007 library of previous Microsoft operating systems and Office suites going back to Office 3.0 on Windows 3.11. Remember that beautiful white background? The chunky icons? The "easter egg" with the credits roll of the developers (by clicking the yellow flag in Help > About with Ctrl + Shift a few times)? It all came flooding back and really made everyone realise how far things have moved in less than two decades!
The Virtual PC 2007 environment is going to provide an effective way for documents to be viewed in the original context in full fidelity and to enable step-by-step version upgrades to be performed if some document fidelity is lost in other conversion approaches.
Open XML is an Ecma International standard and, once documents, spreadsheets, presentations, etc. are converted to the various Office XML formats, we should be in a better position to keep migrating documents forward. With XML being based on text we stand a good chance!
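To show why a text-based format helps, here is a small sketch using nothing but the Python standard library. An Open XML file is a ZIP package of XML parts, so the text can be pulled out without Office installed at all. The helper names `build_toy_package` and `extract_text` are invented for this example, and the toy package is deliberately stripped down (it has a `word/document.xml` part but none of the other parts a real document needs), so treat it as an illustration rather than a valid file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# WordprocessingML main namespace used by word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def build_toy_package(text: str) -> bytes:
    """Build a stripped-down stand-in for a word-processing package."""
    body = (
        f'<w:document xmlns:w="{W_NS}"><w:body><w:p><w:r>'
        f"<w:t>{escape(text)}</w:t>"
        "</w:r></w:p></w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", body)
    return buffer.getvalue()


def extract_text(package_bytes: bytes) -> str:
    """Unzip the package and join the text runs found in the main part."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    return "".join(node.text or "" for node in root.iter(f"{{{W_NS}}}t"))
```

The point is that a ZIP reader and an XML parser are all it takes to get at the content, and both are likely to be around for a long time.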
So why are we doing this now? Well, we've actually been working with The British Library and The National Archives for about 18 months now on digital preservation, alongside some other European organisations, as members of an EU project called Planets.
I think it's fair to say that we are still near the start of getting the digital preservation problems sorted, but I'm pleased to say we're actively engaged in listening to the issues and taking some real action to make the digitally-born legacy of documents readable for our kids, our kids' kids, ...
Take a look at the following articles for more background on the announcement:
http://news.bbc.co.uk/1/hi/technology/6265976.stm which also has a video interview with the Microsoft UK MD Gordon Frazer - my laptop's 15 minutes of fame! :-)