The problem MIXED sets out to solve is how to store databases and spreadsheets in a digitally durable manner. MIXED will not cover all aspects of durable storage. Its focus is on converting the data into a form that guarantees long-lasting usability of the data.
MIXED will define such a format, using existing de jure and de facto standards as much as possible.
In the process, MIXED should draw the attention of potential users to its method.
MIXED will define and implement a generic format to store data in. Implementation means: developing software modules that convert data in current databases and spreadsheets into that generic format, and, conversely, convert data in generic format to database formats and spreadsheet formats that are currently in use. During the definition stage of the project we will specify which current formats MIXED will support.
Once data has been converted into a generic format, it is freed of the ties that bind it to the application with which it was created. It is hard to ensure that any application we see at work now will still be operable in the distant future. It is much easier to guarantee that static data with a clear and explicit structure remains intelligible to humans and computers in the distant future. So the problem of making digital data last will be tackled by a method that reduces the effort required in the future.
Moreover, it is unlikely that anybody in the distant future will want to run, manage and maintain the applications of the past, with all their different versions, kept alive in a kind of vegetative state, just to view archived documents.
The approach that MIXED adopts, i.e. archiving the semantic core of digital data, is expected to lead to more desirable results in the future, especially where scientific data is concerned.
For the method to work, two kinds of effort are needed:
- automatic conversion of data
  - at ingest: a one-time (automatic) conversion of the data into the generic format that represents the semantic core
  - at dissemination: an (automatic) conversion of the data from the generic format into the desired application format
- future manual programming of new versions of the convertors
  - new convertors for new application formats
  - updated convertors for new versions of applications
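The two automatic conversions can be illustrated with a minimal sketch. It is not the MIXED format or software: the application format is assumed to be CSV, the generic format is assumed to be a small JSON document, and the function names `ingest` and `disseminate` are hypothetical, chosen only to mirror the terms above.

```python
import csv
import io
import json


def ingest(csv_text: str) -> str:
    """One-time conversion at ingest: application format -> generic format.

    Assumption: the first CSV row holds the column names.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("no data to ingest")
    # The generic form makes the structure explicit: named columns plus rows.
    generic = {"columns": rows[0], "rows": rows[1:]}
    return json.dumps(generic, indent=2)


def disseminate(generic_text: str) -> str:
    """Conversion at dissemination: generic format -> application format."""
    generic = json.loads(generic_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(generic["columns"])
    writer.writerows(generic["rows"])
    return out.getvalue()


source = "name,year\nAda,1843\nGrace,1952\n"
archived = ingest(source)         # what the archive would keep
restored = disseminate(archived)  # what a future user would receive
```

Only the `disseminate` side would need new code when a future application format appears; the archived generic form stays untouched.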
The intended software products of MIXED are:
- an initial set of data conversion modules that convert data between the supported application formats and the generic format
- a master application with which users can control the modules in various scenarios, e.g. as a plugin for the EASY storage system, or as a data extractor for databases under the control of data producers
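How a master application might drive a set of conversion modules can be sketched as follows. This is an illustration under assumptions, not the MIXED design: the `Converter` type, the `REGISTRY`, and the toy "csv" and "tsv" modules are all hypothetical, and the naive delimiter splitting ignores quoting.

```python
from typing import Callable, Dict, List, NamedTuple


class Converter(NamedTuple):
    """A hypothetical conversion module: one pair of functions per format."""
    to_generic: Callable[[str], dict]
    from_generic: Callable[[dict], str]


# The master application only knows format names, not format details.
REGISTRY: Dict[str, Converter] = {}


def register(fmt: str, converter: Converter) -> None:
    REGISTRY[fmt] = converter


def convert(data: str, source_fmt: str, target_fmt: str) -> str:
    """Convert via the generic form: source -> generic -> target."""
    for fmt in (source_fmt, target_fmt):
        if fmt not in REGISTRY:
            raise ValueError(f"unsupported format: {fmt}")
    generic = REGISTRY[source_fmt].to_generic(data)
    return REGISTRY[target_fmt].from_generic(generic)


def make_delimited(delim: str) -> Converter:
    """Toy module for delimiter-separated text (no quoting support)."""
    def to_generic(text: str) -> dict:
        lines: List[str] = text.strip("\n").split("\n")
        return {"rows": [line.split(delim) for line in lines]}

    def from_generic(generic: dict) -> str:
        return "\n".join(delim.join(row) for row in generic["rows"]) + "\n"

    return Converter(to_generic, from_generic)


register("csv", make_delimited(","))
register("tsv", make_delimited("\t"))

result = convert("a,b\n1,2\n", "csv", "tsv")
```

Because every module converts to and from the same generic form, supporting a new application format means registering one new module rather than writing a convertor for every pair of formats.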
DANS's mission is to be the repository of choice for research data. An important first step was the realization of EASY, an OAIS-compliant archive. Next on the wish list are measures for durability. Once MIXED has delivered its results as an add-on to EASY, DANS will be the provider of a strong archiving method and service.
Clearly, MIXED has to restrict its ambitions in order to take the first steps from idea to reality. This is what we do and what we leave out:
- we treat databases and spreadsheets only, no documents in general, not even when they occur inside database fields
- we do not treat all aspects of spreadsheets and databases. In particular, all logic for mutating data will be stripped from the archival representation, and we will probably treat only a small subset of the possible formulas in spreadsheets and of the text formatting we might encounter in them
- the software we intend to produce will have the status of production software as a module on top of EASY. Deploying the MIXED software to many other archive environments is not part of the task
- post-project assistance will consist of no more than:
  - packaging, advertising and distributing the software
  - starting a new project for further development
  - interesting third parties as 'resellers' and deployers of MIXED software
Submitted by dirk on Fri, 2007-01-26 16:22.