Frances Madden (orcid.org/0000-0002-5432-6116)
Posted 13 December 2021
On 26 October 2021, a panel of experts came together to discuss the role persistent identifiers, or PIDs, can play in IIIF resources. The International Image Interoperability Framework, or IIIF, is an increasingly popular standard for displaying image collections online. It is very flexible and uses manifests: in essence, digital packages that contain the information related to a particular digital object, including the images, audio-visual materials, and metadata, all of which are accessible via a URL. End users can create custom manifests, annotate items, and create bespoke collections for themselves.
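To make the idea of a manifest more concrete, the following is a minimal sketch of a manifest shaped along the lines of the IIIF Presentation API 3.0, together with a helper that lists every URL the manifest depends on. The base URL, image URL, label, and dimensions are hypothetical placeholders, and the function names `build_manifest` and `collect_ids` are invented for this illustration; a real manifest would normally carry far more metadata.

```python
def build_manifest(base: str, image_url: str) -> dict:
    """Build a minimal single-image manifest (illustrative values only)."""
    canvas_id = f"{base}/canvas/1"
    return {
        "@context": "http://iiif.io/api/presentation/3/context.json",
        "id": f"{base}/manifest.json",
        "type": "Manifest",
        "label": {"en": ["Example object"]},  # hypothetical label
        "items": [{
            "id": canvas_id,
            "type": "Canvas",
            "height": 1000,  # hypothetical dimensions
            "width": 800,
            "items": [{
                "id": f"{base}/page/1",
                "type": "AnnotationPage",
                "items": [{
                    "id": f"{base}/annotation/1",
                    "type": "Annotation",
                    "motivation": "painting",
                    "body": {
                        "id": image_url,
                        "type": "Image",
                        "format": "image/jpeg",
                    },
                    "target": canvas_id,
                }],
            }],
        }],
    }


def collect_ids(node) -> list:
    """Recursively gather every "id" value found in the manifest."""
    found = []
    if isinstance(node, dict):
        if "id" in node:
            found.append(node["id"])
        for value in node.values():
            found.extend(collect_ids(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_ids(item))
    return found


manifest = build_manifest(
    "https://example.org/iiif/object1",          # hypothetical base URL
    "https://example.org/images/object1.jpg",    # hypothetical image URL
)
ids = collect_ids(manifest)
```

Every entry in `ids` is a link that annotations, custom manifests, or bespoke collections built by end users may point to, so each one is a candidate for the kind of persistence discussed in this post.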
However, all of these user-created activities depend on the links to the content remaining active and not changing over time. If the links in the manifest of the items were to change, then the original content would become inaccessible. Additionally, as infrastructure is upgraded or new imaging is created, how are we going to manage identification of resources which have changed but could be used for the same purpose?
Our panel and attendees got deep into the nitty-gritty of how best to use PIDs within IIIF’s infrastructure. One of the conclusions was that end users need to be able to trust that the content provided will remain accessible for the long term. To an extent, it is known and understood that IIIF requires persistent links as part of its implementation; however, this is not always mirrored in institutional practice. Additionally, questions arise when content is created on top of those resources, which may be done by individuals with no formal affiliation to the institution. While it is theoretically possible to create a PID for every type of annotation - such as a transcription, translation, link or note - several of the panellists and attendees questioned whether this was the best approach. For example, if every aspect of each IIIF resource is assigned an identifier, then the number required would be huge and there could be significant challenges in managing identifiers at such a scale. In addition, not all annotations are equal. Some are created by students as class assignments whereas others may be high quality, publishable research outputs. Is it the role of institutions to manage these resources? And, if so, how can institutions differentiate between varying classes of content? If they were to assign an identifier to everything created in relation to a digital object, it would need to be made clear that the identifier did not imply any standard of quality.
The consensus emerged that it was vital for GLAM institutions to maintain the persistence of the resource on which annotations sit. Those annotations and transcriptions may be created in other services and those may make use of PIDs, but these identifiers would need to be maintained by the service in which those annotations are created. Alternatively, text resources could be added as metadata on top of the original manifest by the original publishing institution or added by individual users who could then serve a custom manifest from an independent repository.
The panel also recognised that annotations and transcriptions could be enhanced as resources through the use of PIDs, by allowing contributions to be linked to ORCID IDs to increase their visibility and prominence as scholarly outputs.
When it came to solutions for managing identification of versions and related digital objects, there was a lively exchange of ideas. The use of suffix passthrough in the ARK standard was particularly prominent, as it allows institutions to assign an opaque ARK to an object but then link to different versions or objects related to it, whether that be descriptive metadata, a thumbnail version, or a IIIF manifest.
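As a rough illustration of how suffix passthrough can work, the sketch below resolves an ARK by finding the longest registered ARK that prefixes the request and appending the remaining suffix to that ARK's target URL. The registry contents, the example ARK and URLs, and the function name `resolve` are hypothetical, and the rule that a suffix must begin with "/" or "." is a simplifying assumption made here; real resolvers may apply different matching rules.

```python
from typing import Dict, Optional

# Hypothetical registry: one opaque ARK registered per object.
REGISTRY: Dict[str, str] = {
    "ark:/12345/x7abc": "https://collections.example.org/objects/x7abc",
}


def resolve(ark: str, registry: Dict[str, str] = REGISTRY) -> Optional[str]:
    """Resolve an ARK, passing any unregistered suffix through to the target."""
    if ark in registry:
        return registry[ark]

    # Look for the longest registered ARK that is a prefix of the request.
    best = None
    for registered in registry:
        if not ark.startswith(registered):
            continue
        suffix = ark[len(registered):]
        # Assumption: only treat the remainder as a suffix if it starts
        # with "/" or ".", so that a different, longer ARK is not matched.
        if suffix[0] not in "/.":
            continue
        if best is None or len(registered) > len(best):
            best = registered

    if best is None:
        return None
    return registry[best] + ark[len(best):]


manifest_url = resolve("ark:/12345/x7abc/manifest.json")
thumbnail_url = resolve("ark:/12345/x7abc/thumbnail.jpg")
```

In this sketch the institution registers only `ark:/12345/x7abc`, yet requests for the manifest or the thumbnail still resolve, because the part after the registered ARK is carried over to the target URL.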
Another issue was the level at which PIDs should be assigned. Several use cases were mentioned, including assigning a PID to the intellectual entity, in contrast with assignment at the level of a discrete digital object, which hashes or other models require. The panel concluded that a pragmatic approach should be recommended, based on an institution's capability and the expected use cases for its collections.
For smaller organisations, there are many existing tools which support IIIF and which many use for hosting their collections, for example the Internet Archive and OCLC’s ContentDM service. If these systems were to implement PIDs explicitly, this would easily address the difficulties smaller institutions may have in maintaining persistence with regard to their digital resources.
At the end of the discussion, there was a brief mention of whether there would be any issue in applying PIDs to 3D images or time-based media displayed using IIIF. The conclusion was that several implementations already do this and allow for the citation of excerpts of videos, but there was no mention of any specific issues pertaining to those links being persistent.
When asked to sum up, several of the panellists concluded that the infrastructure to use PIDs in IIIF is already there and that it just needs to be consistently implemented and used by the community. It is clear that PIDs can be used in IIIF, and there are several models of how this can be done; however, the use of PIDs could be improved if the IIIF community could establish guidelines to facilitate the adoption of pragmatic approaches, enabling as many organisations as possible to implement PIDs for their IIIF resources.
The session was recorded and can be accessed here. All related materials from the session are available via Zenodo, a repository system which assigns digital object identifiers (a type of PID) to resources held within it: https://doi.org/10.5281/zenodo.5721406. The event was co-organised by the PIDs as IRO Infrastructure and Practical Applications of IIIF projects, both foundation projects in the Towards a National Collection programme.