NISO/CLIR/RLG – Technical Metadata Elements for Images Workshop (18-19th April 1999)

Maybe this is a record for delayed blogging. Nine and a half years late, here’s a writeup I found (on a corpsed hard-drive) of an image metadata workshop held by NISO, in Washington. I wrote it up for the JISC JIDI project at ILRT, who funded my trip. I’m sure they won’t mind it being shared here. Eric Miller and Paul Miller were at the workshop too; I remember working on the old Dublin Core RDF spec with them. NISO’s Workshop Report is also available from their site.

In April 1999, NISO organised a two-day invitational meeting whose aim was to gather requirements for technical metadata for digital images. The meeting addressed architectural issues and began the work of defining and categorising actual data elements for representing technical information about images. A report has recently been published on the NISO web service providing preliminary overviews of the agreed elements, including a comprehensive overview [report] of the two-day meeting. This report for the JIDI group does not attempt to reproduce this work, but instead reports on a number of issues relating to the aims of JISC-funded services such as JIDI and TASI.

The workshop wrestled with a number of problems familiar to many from the non-imaging metadata community. In particular, it reaffirmed the conclusions drawn at the end of the earlier Dublin Core and Images workshop, namely that the requirements (and challenges) of image-oriented metadata applications were in many respects close to those of the bibliographic and text-oriented areas. Where digital imaging applications had specific requirements, these were in many cases related to the high mobility of digital image objects, and the frequency with which these objects were subjected to transformations (eg. format conversion, editing).

A major concern which arose many times during the meeting was that of metadata loss through transformation. With currently available software, conversion and editing of digital image objects tends to damage embedded metadata; many major tools (eg. Photoshop) destroy or scramble it. Given the high mobility of digital images, it is nevertheless appealing to explore models whereby metadata can travel with the objects from application to application. The meeting explored a number of frameworks whereby this could be achieved, including the use of a ‘container format’ which might encapsulate both an image and accompanying metadata. The Java ‘JAR’ archive approach, which combines multiple files plus a metadata ‘manifest’ into a single portable object, was raised as a possible approach. A variant of this model was also discussed which used XML/RDF manifest files within a .ZIP or .JAR container to bundle both image data and metadata into a single transportable object. Although there was some interest in these approaches, no consensus was reached on the appropriate way forward, nor on whether this was a specifically image-related challenge or a general problem for the industry which might benefit from a generalised solution.
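
To make the container idea concrete, here is a minimal sketch in Python of bundling an image and a metadata manifest into one ZIP file. It is only an illustration, not anything the workshop specified: the manifest path `META-INF/manifest.json`, the function names and the metadata fields are all my own assumptions, and I have used JSON for brevity where the workshop discussion was about XML/RDF manifests.

```python
import io
import json
import zipfile

# Hypothetical manifest location, loosely modelled on the JAR convention.
MANIFEST = "META-INF/manifest.json"

def bundle(image_name, image_bytes, metadata):
    """Pack one image plus a metadata manifest into an in-memory ZIP container."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(image_name, image_bytes)
        # JSON stands in for the XML/RDF manifest discussed at the workshop.
        zf.writestr(MANIFEST, json.dumps(metadata, indent=2))
    return buf.getvalue()

def unbundle(container_bytes):
    """Return ({image name: bytes}, metadata dict) from a container made by bundle()."""
    with zipfile.ZipFile(io.BytesIO(container_bytes)) as zf:
        metadata = json.loads(zf.read(MANIFEST))
        images = {n: zf.read(n) for n in zf.namelist() if n != MANIFEST}
    return images, metadata

# Usage: the image bytes and field names here are placeholders.
container = bundle("scan.tif", b"\x00\x01fake", {"format": "image/tiff", "width": 10})
images, meta = unbundle(container)
```

The metadata only survives as long as every tool in the chain passes the whole container along, which is exactly the part nobody could agree on.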

There was some consensus that digital signatures would need to be deployed over both image metadata, and the image data itself, for applications (eg. JIDI collections) which require some degree of quality assurance regarding the transformational processes that image collections have undergone. For digital signature technology to be applied to such content a canonicalisation algorithm is necessary. RDF and XML were raised as possible solutions in this area, although it was noted that the Signed XML initiative within W3C was not yet underway, and that the RDF working groups had deferred work on canonicalisation of RDF metadata until the issues had been more fully explored by the Signed XML group. Although digital signature technology can be applied to any content for which a text-based representation can be derived, the issue of canonicalisation remains. Broadly, this means that applications which use digital signatures to make trust decisions, and which transform images, will ultimately benefit from a higher level of abstraction concerning ‘what it is that has been signed’. Current technology allows applications to think of simple textual files (eg. containing metadata) as being verifiable using digital signatures over that content. Future work will allow applications to instead make trust decisions relating to the logical ‘assertions’ represented in the files. The implication here is that trust-based metadata applications are at an early stage, but that we can anticipate within perhaps 2 years there being greater infrastructural support for applications which reason usefully about embedded metadata, eg. concerning the provenance, intellectual property rights and transformational history of some image object.
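
The difference between signing a file and signing the assertions in it can be sketched in a few lines of Python. This is a toy under stated assumptions: the one-statement-per-line 'subject predicate object' format and the function names are made up for the example, and a plain SHA-256 digest stands in for whatever a real signature would be computed over. Real RDF canonicalisation is harder than sorting lines, not least because of blank nodes.

```python
import hashlib

def raw_digest(text):
    """Digest of the exact bytes: changes with any whitespace or ordering change."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def parse_assertions(text):
    """Parse a toy format: one 'subject predicate object' statement per line."""
    triples = set()
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) == 3:
            triples.add(tuple(p.strip() for p in parts))
    return triples

def canonical_digest(text):
    """Digest of a canonical form: statements normalised and sorted before hashing."""
    canonical = "\n".join(" ".join(t) for t in sorted(parse_assertions(text)))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Two serialisations of the same two assertions (hypothetical identifiers).
doc_a = "img1 format image/tiff\nimg1 derivedFrom scan42\n"
doc_b = "img1   derivedFrom   scan42\n\nimg1 format image/tiff\n"

print(raw_digest(doc_a) == raw_digest(doc_b))              # False: the bytes differ
print(canonical_digest(doc_a) == canonical_digest(doc_b))  # True: same assertions
```

A signature over the raw digest breaks as soon as a tool reorders or reformats the metadata; one over the canonical digest only breaks when the assertions themselves change.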

Why dig this up? Partly because I think a lot of practical work fed into the RDF design and its adoption eg. in Dublin Core, and this history is poorly documented. Also to kick myself for making predictions (“anticipate within perhaps 2 years”, phooey :) and because I’m looking again at signed RDF, largely in a FOAF/SPARQL context. Also I’ve just added a ‘foaf4lib’ category to the blog, as an offering to the code4lib aggregator and to accompany the new foaf4lib mailing list we have in the FOAF project.
