General

Problem description

The problem MIXED sets out to solve is storing databases and spreadsheets in a manner that is digitally durable. MIXED will not cover all aspects of durable storage. Its focus is on converting the data into a form that guarantees long-lasting usability of the data.
MIXED will define such a format, relying as much as possible on existing de jure and de facto standards.
In the process, MIXED should draw the attention of potential users to its method.

Method

MIXED will define and implement a generic format to store data in. Implementation means: developing software modules that convert data in current databases and spreadsheets into that generic format, and, conversely, convert data in generic format to database formats and spreadsheet formats that are currently in use. During the definition stage of the project we will specify which current formats MIXED will support.
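The two directions of conversion can be illustrated with a minimal sketch. It is not the MIXED format: the XML element names (`table`, `columns`, `column`, `row`, `cell`) and the function names are hypothetical, and CSV merely stands in for a real spreadsheet format. The sketch only shows the round trip from an application format to a generic form and back.

```python
import csv
import io
import xml.etree.ElementTree as ET


def table_to_generic(csv_text: str, table_name: str) -> str:
    """Convert CSV text (header row first) into a hypothetical generic XML form."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, records = rows[0], rows[1:]
    root = ET.Element("table", name=table_name)
    # Record the structure explicitly: one <column> per header field.
    columns = ET.SubElement(root, "columns")
    for name in header:
        ET.SubElement(columns, "column", name=name)
    # Record the data: one <row> per record, one <cell> per value.
    for record in records:
        row = ET.SubElement(root, "row")
        for value in record:
            ET.SubElement(row, "cell").text = value
    return ET.tostring(root, encoding="unicode")


def generic_to_table(xml_text: str) -> str:
    """Convert the hypothetical generic XML form back into CSV text."""
    root = ET.fromstring(xml_text)
    header = [column.get("name") for column in root.find("columns")]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    for row in root.findall("row"):
        # An empty cell is parsed with text None, so map it back to "".
        writer.writerow([cell.text or "" for cell in row.findall("cell")])
    return out.getvalue()


original = "id,name\n1,Ada\n2,Grace\n"
generic = table_to_generic(original, "people")
restored = generic_to_table(generic)
```

In this toy case `restored` equals `original`. A real converter would also have to carry data types, keys and relations between tables, which the sketch leaves out.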

Gain

Once data has been converted into a generic format, it is freed of the ties that bind it to the application with which it was created. It is hard to ensure that any application we see at work now will still be operable in the distant future. It is much easier to guarantee that static data, with a clear and explicit structure, remains intelligible to humans and computers in the distant future. So the problem of making digital data last will be tackled by a method that reduces the effort required in the future.

Moreover, probably nobody in the distant future will want to run, manage and maintain the applications of the past, with all their different versions, kept alive in a kind of vegetative state, just to view archived documents.
The approach that MIXED adopts, i.e. archiving the semantic core of digital data, is expected to lead to more desirable results in the future, especially where scientific data is concerned.

Pain

For the method to work, we must make two kinds of effort:

  • automatic conversion of data
    • at ingest: a one-time (automatic) conversion of the data into the generic format that represents the semantic core
    • at dissemination: an (automatic) conversion of the data from the generic format into the desired application format
  • future manual programming of new versions of the convertors
    • new convertors for new application formats
    • updated convertors for new versions of applications
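One way to picture how these efforts fit together is a registry of convertors keyed by application format and version, so that supporting a new format or a new version means registering one more function. The sketch below is an assumption for illustration only: the registry names, the `register` decorator, and the `toy` format with its placeholder conversions are all hypothetical and are not part of MIXED.

```python
from typing import Callable, Dict, Tuple

FormatKey = Tuple[str, str]          # (application format, version)
Converter = Callable[[bytes], bytes]

# Hypothetical registries: one for ingest, one for dissemination.
INGEST: Dict[FormatKey, Converter] = {}
DISSEMINATE: Dict[FormatKey, Converter] = {}


def register(registry: Dict[FormatKey, Converter], fmt: str, version: str):
    """Decorator that adds a convertor to a registry under (fmt, version)."""
    def decorator(func: Converter) -> Converter:
        registry[(fmt, version)] = func
        return func
    return decorator


def convert(registry: Dict[FormatKey, Converter],
            data: bytes, fmt: str, version: str) -> bytes:
    """Look up the convertor for (fmt, version) and apply it."""
    try:
        converter = registry[(fmt, version)]
    except KeyError:
        raise LookupError(f"no convertor for {fmt} version {version}") from None
    return converter(data)


# Placeholder convertors for an invented 'toy' format, version 1.
@register(INGEST, "toy", "1")
def toy_v1_to_generic(data: bytes) -> bytes:
    return b"<generic>" + data + b"</generic>"


@register(DISSEMINATE, "toy", "1")
def generic_to_toy_v1(data: bytes) -> bytes:
    return data[len(b"<generic>"):-len(b"</generic>")]


archived = convert(INGEST, b"payload", "toy", "1")
delivered = convert(DISSEMINATE, archived, "toy", "1")
```

Here `delivered` is again `b"payload"`. Asking for an unregistered version, such as `("toy", "2")`, raises `LookupError`, which is the point at which the future manual programming listed above would be needed.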

Products

The intended software products of MIXED are:

  • an initial set of data conversion modules that convert data between the supported application formats and the generic format
  • a master application by which users can control the modules in various scenarios, e.g. as a plugin for the EASY storage system, or as a data extractor for databases under the control of data producers

Values

DANS's mission is to be the repository of choice for research data. An important first step was the realization of EASY, an OAIS-compliant archive. Next on the wish list are measures for durability. As soon as MIXED has delivered its results as an add-on to EASY, DANS will be the provider of a strong archiving method and service.

Limitations

Clearly, MIXED has to restrict its ambitions in order to complete the first steps from idea to reality. This is what we do and what we leave out:

  • we treat databases and spreadsheets only, not documents in general, not even when they occur inside database fields
  • we do not treat all aspects of spreadsheets and databases. In particular, all logic concerning mutating data will be stripped from the archival representation, and we will probably treat only a small subset of the possible formulas and of the text formatting we might encounter in spreadsheets
  • the software we intend to produce will have the status of production software as a module on top of EASY. Deploying the MIXED software to many other archive environments is not part of the task
  • post-project assistance will consist of no more than
    • packaging the software, advertising it, and distributing it
    • starting a new project for further development
    • interesting third parties in acting as 'resellers' and deployers of MIXED software

prototype

Do we deliver a prototype according to the project proposal? Is this mentioned in the proposal? I think it is good to make this clear because it influences the results of the project. But it must be checked whether this agrees with the results proposed in the proposal.

production

I have removed the assertion on this page that MIXED should deliver just a working prototype.

Dirk

prototype or product

The initiation document specifies that MIXED must deliver a production system (2.4). It is the focus of the first stage. The first stage also defines M-XML and the overall application that calls the individual convertors. The second stage builds the convertors, but only if the first stage was successful.

This definitely points to a production system. WorkPackage 9 also lays the foundation for productive use of the system. So, indeed, I have to withdraw the statement that we aim at a working prototype. We aim at a working product.

Question: should we tie MIXED to EASY, or should we develop something that acts as a working prototype when used stand-alone, and as a product when used on top of EASY? I mean: if you take very seriously that MIXED should be a product, you should package it with an installer and set up a support organization, a helpdesk and whatnot, which is probably too much.