0015 - sc Research: Investigate usability of the ORE MODS container

Can the ORE container be used to store the converted object(s) plus metadata, including provenance? Can it be exchanged via the harvester protocol (OAI-PMH)? Both questions are asked in relation to the MIXED framework.



Using OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting) in the MIXED framework:

OAI-PMH is probably not suited to the actual conversion request, but the result of a conversion request could be a list of identifiers (e.g. job numbers), which the client then uses to retrieve file metadata and provenance metadata about the conversion. The client then uses the URL found in that metadata (for instance wrapped in a dc:identifier element in Dublin Core) to retrieve the converted file itself.
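The client side of this exchange can be sketched as follows. This is a minimal illustration, not a MIXED interface: the base URL `http://mixed.example.org/oai`, the job identifier `job-42` and the sample response are hypothetical. Only the `GetRecord` verb, its `identifier` and `metadataPrefix` arguments, and the XML namespaces come from the OAI-PMH and Dublin Core specifications.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespace for Dublin Core elements, as used in oai_dc records.
NS = {"dc": "http://purl.org/dc/elements/1.1/"}


def get_record_url(base_url, job_id):
    """Build a GetRecord request URL for one conversion job.

    Using the job number as the OAI identifier is an assumption.
    """
    query = urlencode({
        "verb": "GetRecord",
        "identifier": job_id,
        "metadataPrefix": "oai_dc",
    })
    return f"{base_url}?{query}"


def converted_file_urls(response_xml):
    """Return the text of every dc:identifier element in a response."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.iterfind(".//dc:identifier", NS)]


# Hypothetical response, trimmed to the parts the client needs.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <GetRecord>
    <record>
      <header><identifier>job-42</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:identifier>http://mixed.example.org/results/job-42.xml</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>"""

url = get_record_url("http://mixed.example.org/oai", "job-42")
files = converted_file_urls(SAMPLE_RESPONSE)
```

The header identifier lives in the OAI-PMH namespace, so only the Dublin Core element is picked up as the download location.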

A web application would serve the OAI-PMH requests and call into the framework to retrieve the provenance data from the reporter bean. The mapping to requests is straightforward:

  • GetRecord retrieves information about a single job
  • ListRecords retrieves all jobs in a single batch (where Set is the batch number)
  • ListIdentifiers returns all job numbers in a batch
  • etc.
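A rough sketch of how such a mapping could look on the server side is given below. Everything except the verb names and the `badVerb` and `idDoesNotExist` error codes (which come from the OAI-PMH specification) is an assumption: the `Job` record, the in-memory `JOBS` list standing in for the reporter database, and the `handle` function are hypothetical, and a real server would render XML responses rather than return Python objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    job_id: str      # hypothetical job number
    batch: str       # batch number, exposed as the OAI-PMH set
    result_url: str  # where the converted file can be fetched


# Stand-in for the provenance data held by the reporter.
JOBS = [
    Job("job-1", "batch-7", "http://mixed.example.org/results/job-1.xml"),
    Job("job-2", "batch-7", "http://mixed.example.org/results/job-2.xml"),
    Job("job-3", "batch-8", "http://mixed.example.org/results/job-3.xml"),
]


def get_record(params):
    """GetRecord: information about a single job."""
    for job in JOBS:
        if job.job_id == params.get("identifier"):
            return job
    return {"error": "idDoesNotExist"}


def list_records(params):
    """ListRecords: all jobs in the batch named by the set argument."""
    return [job for job in JOBS if job.batch == params.get("set")]


def list_identifiers(params):
    """ListIdentifiers: only the job numbers of a batch."""
    return [job.job_id for job in list_records(params)]


HANDLERS = {
    "GetRecord": get_record,
    "ListRecords": list_records,
    "ListIdentifiers": list_identifiers,
}


def handle(params):
    """Dispatch a parsed query string to the handler for its verb."""
    handler = HANDLERS.get(params.get("verb"))
    if handler is None:
        return {"error": "badVerb"}
    return handler(params)
```

For example, `handle({"verb": "ListIdentifiers", "set": "batch-7"})` yields the job numbers of that batch, and an unknown verb yields a `badVerb` error.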

Some sort of file expiry mechanism would take care of deleting old conversion results.
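One possible shape for such an expiry mechanism is sketched here, assuming conversion results are plain files in a single directory and that age is judged by modification time. The function name `expire_old_results`, the directory layout and the one-day limit are all hypothetical.

```python
import os
import tempfile
import time
from pathlib import Path


def expire_old_results(results_dir, max_age_seconds, now=None):
    """Delete files older than max_age_seconds; return the removed names."""
    now = time.time() if now is None else now
    removed = []
    for path in Path(results_dir).iterdir():
        # Age is measured from the last modification time of the file.
        if path.is_file() and now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)


# Demonstration in a throwaway directory with one stale and one fresh file.
with tempfile.TemporaryDirectory() as tmp:
    old_file = Path(tmp, "job-1.xml")
    new_file = Path(tmp, "job-2.xml")
    old_file.write_text("<old/>")
    new_file.write_text("<new/>")

    # Backdate the first file so that it looks two days old.
    two_days_ago = time.time() - 2 * 24 * 3600
    os.utime(old_file, (two_days_ago, two_days_ago))

    removed = expire_old_results(tmp, max_age_seconds=24 * 3600)
    remaining = sorted(p.name for p in Path(tmp).iterdir())
```

In the framework this would presumably run on a schedule; whether the matching job records are removed at the same time is left open here.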

Existing libraries in various languages would help with implementing the client and server sides of this protocol.

Implementation tasks (assuming provenance metadata is available in the reporter database):

  • File expiry manager for the framework: 2d
  • Design OAI-PMH protocol mapping to MIXED requests: 1d
  • OAI-PMH web application: 5d
  • HTTP POST interface for file conversions (to allow for easy integration with the harvester client below): 3d
  • OAI-PMH harvester command-line client for MIXED (could be prototyped in Perl): 3d



Concerning ORE:
ORE is described as: "Open Archives Initiative Object Reuse and Exchange (OAI-ORE) defines standards for the description and exchange of aggregations of Web resources" (source).

At first glance this appears to be very heavyweight for use within MIXED. For the moment we are only interested in registering relations between the several components that may result from a conversion. These components will be straightforward, and little information is needed to describe the relations between them.
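To give an impression of how little information is involved, a lightweight record of such relations might look like the sketch below. The class `ConversionResult`, the relation names `convertedTo` and `hasProvenance`, and the URLs are illustrative assumptions; they are not taken from MIXED or from the OAI-ORE vocabulary.

```python
from dataclasses import dataclass, field


@dataclass
class ConversionResult:
    job_id: str
    source: str  # location of the file that was submitted
    components: list = field(default_factory=list)  # (relation, url) pairs

    def add(self, relation, url):
        """Register one component and its relation to the source file."""
        self.components.append((relation, url))

    def related(self, relation):
        """Return the URLs of all components with the given relation."""
        return [url for rel, url in self.components if rel == relation]


result = ConversionResult("job-42", "http://mixed.example.org/uploads/table.dbf")
result.add("convertedTo", "http://mixed.example.org/results/job-42.xml")
result.add("hasProvenance", "http://mixed.example.org/results/job-42.prov.xml")
```

A flat list of pairs like this would map onto an ORE aggregation later, should that prove worthwhile.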

It remains to be determined whether MIXED will offer more functionality in the future that would justify the use of OAI-ORE.


Nice piece of research, Pieter. This makes me wonder about possible future expansion of the business case for MIXED. With techniques like ORE and PMH it would be easy to create a front-end for archivers. All we would then need to add is a repository to offer a full archival solution. Even though this kind of functionality may not be desired or achievable within the current timeframe of MIXED, it may be worth preparing interfaces for it. When interfaces are declared for a usable front-end and a repository back-end, I think MIXED will appear more attractive to archiving institutes.