Thursday: 15:30 – 17:00 (90 minutes)
To Alison Heatherington:
The MIXED framework aims to implement the “smart migration” strategy. Because there are many file formats, priorities have to be set when selecting the formats for which plug-ins will be created. Do you see a role for a format registry (such as PRONOM or UDFR) in this file format assessment process?
- Short answer: yes. PRONOM is a technical registry with information about file formats, and also about software, hardware and compression algorithms. It holds details of each format that can be used to compute a risk score. Different institutions might have different requirements. DROID is an identification tool developed in line with PRONOM. A consultation round on DROID is currently under way. Identification is based on an internal code. Anyone with specific questions can go to the site and leave remarks. UDFR is at a very early stage; its requirements are not fixed yet.
- Dirk: can you elaborate on pathways?
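As an illustration of identification by an internal code, the sketch below matches the first bytes of a file against a small table of signatures. It assumes that "internal code" refers to byte signatures inside the file. The table and the function name `identify` are hypothetical; this is not PRONOM data or DROID's actual implementation.

```python
# Hypothetical signature table with a few well-known magic numbers.
# A real registry would hold far more entries and richer matching rules.
SIGNATURES = [
    ("PDF", b"%PDF-"),
    ("ZIP container", b"PK\x03\x04"),
    ("PNG image", b"\x89PNG\r\n\x1a\n"),
]


def identify(data: bytes) -> str:
    """Return the first format whose signature matches the start of the data."""
    for name, magic in SIGNATURES:
        if data.startswith(magic):
            return name
    return "unknown"
```

A file that matches no signature is reported as "unknown", which is the case a registry-driven risk assessment would need to flag.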
To Amir Bernstein:
A simple and short question: To what extent can MIXED and SIARD benefit from each other?
SIARD is XML based. We use a single file to preserve the database: an uncompressed ZIP64 container. Each table has its own folder. BLOBs go in a different container. MIXED seems to go in the same direction, as do the eDavid project and a project in Portugal. The essence: concentrate on the data and less on the presentation. SIARD is related to the mandate of the national archive: archiving official, sensitive government databases. Can we see the schema? The SIARD schema can be downloaded from the website. Synergies: look at the schemas and find a way to improve them; look at the metadata. Usability: look at the interfaces.
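The container layout described above can be sketched roughly as follows: one archive, stored without compression, with a folder per table. The folder names, the function `write_container` and the placeholder XML are assumptions for illustration, not the layout defined by the SIARD specification. Note that `allowZip64=True` only permits the ZIP64 extensions, which Python applies when an archive grows large enough to need them.

```python
import io
import zipfile


def write_container(target, tables):
    """Write one uncompressed archive with one folder per table.

    `tables` maps a table name to its XML content as bytes.
    The `content/<table>/<table>.xml` layout is hypothetical.
    """
    with zipfile.ZipFile(
        target, "w", compression=zipfile.ZIP_STORED, allowZip64=True
    ) as archive:
        for name, xml_bytes in tables.items():
            archive.writestr(f"content/{name}/{name}.xml", xml_bytes)


# Usage: write two placeholder tables into an in-memory archive.
buffer = io.BytesIO()
write_container(buffer, {"customers": b"<table/>", "orders": b"<table/>"})
```

Storing the entries uncompressed keeps the XML readable with very simple tooling, which fits the emphasis on the data rather than the presentation.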
To Barbara Sierman:
What is your opinion on the “smart migration strategy”, especially in relation to the “common” emulation and migration strategies for digital preservation?
- Which type of migration to use depends on the type of material. At the moment most of the material is text based (PDF), with few databases. Changing the strategy requires thorough preparation. It is more an organisational than a technical issue. The Planets project contains information on tools for preservation. It is important that MIXED can be part of the Plato tool and the Planets testbed.
To Ellen Kraffmiller:
The Dataverse network project is an open-source software development community. Do you have any recommendations concerning the formation of this community as well as ways to keep this community active? How does the open-source philosophy fit in the business model of the DVN?
To Jeroen Rombouts:
What is your experience regarding the durability of file formats used by researchers in technical disciplines?
The durability of the file formats is not so problematic. The metadata are more problematic. What is the value of the data? How many SDFPs would we need? The database type alone already raises many issues: time, geography, etc. can be essential. Currently there are many questions about the codes and models that relate to the datasets.
To John Doove:
Can you give some recommendations concerning the way the user community (both users and developers) of the MIXED system can be extended?
1. Connection with the dataforum initiative of SURFfoundation
2. Suggestion: use the RDF format in the MIXED project
Barbara Sierman: the repository concept has a broad horizon; it is not only for universities. Some action concerning digital preservation should be included.
To Marc Kemps-Snijders:
Building on your experience as a developer of software used in scientific settings: can you provide some suggestions concerning the way the software development process can be optimised and continued?
To Nathan Adams:
To what extent can it be expected that data centres actually use and contribute to the MIXED software? What requirements must be met?
To Rainer Schmidt:
Can you elaborate on the way the MIXED software and outcomes of the Planets project might benefit from each other?
In general, the Planets project tries to provide an environment with as many tools as possible in order to find the strategy that best fits the situation. It is possible to put the MIXED plug-ins behind the Planets framework. There are now about 50 tools in the Planets environment. A nice experiment: a round-trip experiment. This would also bring public visibility to the tool. Rainer Schmidt works in the interoperability workgroup. On the aspect of provenance, Rainer has suggestions for cooperation.
To Rob Grim:
What can be done to improve the durability of research data in the social sciences?
I like the strategic and practical approach. There is a huge interest in making historical data available, also from the pre-digital era (available only on paper). What to preserve? A difficult question, still under discussion. The context is important. The infrastructure for data (compared to that for publications) is underdeveloped. How does MIXED relate to other digital preservation issues? How does this align with ORE (resource maps)? “May all your problems be technical”?
To Sebastian Rahtz:
Addressed as a representative of the DARIAH initiative: What is the value of the MIXED framework for the DARIAH initiative? How do you look upon the SDFP as a vehicle to …
plus: Not reinventing the wheel. Building on existing knowledge.
minus: Worry about the subject area. Databases and spreadsheets are very short on semantics. Risk of losing semantics by simplifying the content model.
minus: Standards do not last for a long period (say about 10 years).
minus: The LaTeX format contains semantics in the document.
To Steven Krauwer:
What are the most important mechanisms to improve the durability of digital data in the scientific speech and language community?
If CLARIN wants to provide access to tools, preservation is relevant. Adhering to standards is crucial, but there are too many to manage. What is the intended scope of MIXED? Local? International? There is a need for preservation of enhanced data, which are more complicated than simple formats. Audio and other new formats also need attention.
To Vladislav Makarenko:
Is digital durability an issue in the eSciDoc project? If so, how is it implemented?
- Using standards is important. DC, ontologies. For interoperability a number of standards are used. Open Source is important. Software can be used by others.
Submitted by dirk on Tue, 2009-09-22 16:15.