2023/1 Les archives du futur

AI for Appraisal and Selection: A personal reflection

Looking back on archival experimentation with, and experience of, using technology to assist in the task of appraisal and selection: what have archivists learnt, and will AI be the game-changer it is hyped to be? Some thoughts are offered here as a prompt to reflection.

Artificial Intelligence (AI) seen as a tool

Artificial Intelligence (AI) is a hot topic at the moment, but my instinct is to see it less as something radically new and different and more as just another episode in a long-running saga. Archivists have always had to grapple with the implications for the record of new advances in information technology and, as part of this, they have always sought to experiment (both in thought and in practice) with how best to tailor these advances to support their recordkeeping purpose: how best to use the available technology to assist them in their task. When it comes to technology-assisted appraisal and selection, then, where has our experimentation led?

AI for archival appraisal

One of the earliest experiments in this area appears to have been doctoral research carried out by Anne Gilliland in 1995 to develop ‘an expert assistant for archival appraisal of electronic communications’ (Gilliland-Swetland, 1995). This phrasing evokes earlier AI work on expert systems, reminding us that the field of AI research itself has a long and rich history going beyond the latest headline-grabbing developments in machine learning. The expert systems approach tended towards encoding expert knowledge into rules which the system would then follow. Such rules-based approaches to AI are still in use, but have perhaps been overshadowed by the more data-driven approaches which have led to the current step-change in what is becoming possible.

Figure: Artificial intelligence (www.pixabay.com), Creative Commons Zero (CC0)

Jumping forward a decade, further experimentation was carried out as part of the Paradigm project from 2005 to 2007. Paradigm involved archivists at the Universities of Oxford and Manchester envisioning in practice how to work with – that is, to accession, process and preserve – born-digital material, specifically the private papers of a range of politicians. The main output of the project was a workbook in which they reported everything they had learnt from the experience. In terms of appraisal, they highlighted a range of tools that they had found useful, including Karen’s Directory Printer, Unison and Windows Explorer. Arguably, though, what made these tools useful was the born-digital and structured nature of the (meta)data they allowed the archivists to extract and use. There was no artificial intelligence here; rather, technology helped to get information (now available in digital form as data) in front of a human intelligence faster.

The tools suggested by the Paradigm project could also manipulate and compare that information faster, and this led to further practical experimentation, such as that reported by Victoria Sloyan (2016) in respect of the appraisal and sensitivity review of two hard drives at the Wellcome Library. In this case, the superior information extraction and processing capabilities of tools such as DROID and Microsoft Excel enabled, for example, the quick identification (and elimination) of duplicate material and hence a reduction in the amount of material that needed to be appraised in more detail. Such an approach was also suggested by a contemporary experiment, this time carried out at The National Archives (TNA) using eDiscovery software (The National Archives, 2016). It concluded that “software tools are able to facilitate a ‘funnel’ approach to analysing born-digital records collections”; they offer “ways to prioritise and reduce the volume of digital records that will have to be manually reviewed”.
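The duplicate-identification step described above can be illustrated with a minimal sketch. It is not the Wellcome Library workflow, which relied on DROID and Microsoft Excel; it simply assumes that two files with the same SHA-256 checksum are duplicates. The function names `sha256_of` and `find_duplicates` are hypothetical and chosen only for this illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> dict[str, list[Path]]:
    """Group files under `root` by checksum (assumption: same checksum = duplicate).

    Only groups containing more than one file are returned.
    """
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[sha256_of(path)].append(path)
    return {checksum: paths for checksum, paths in groups.items() if len(paths) > 1}
```

For each group returned, one copy could be kept for detailed appraisal and the others set aside, which is one way of narrowing the ‘funnel’ before any human review takes place.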

The importance of scale

TNA’s experimentation with eDiscovery software points to another important factor in this equation, that of scale. One reason – perhaps the main reason – why archivists are so urgently seeking assistance from technology is that, as the volume of material requiring ‘review’ increases, so does the possibility that no amount of ‘funnelling’ will ever reduce it enough to match the manual resource available to carry out that review. The software TNA was experimenting with in the mid-2010s arose from recognition of this fact in another context: the legal field’s process of discovery, in which an often vast amount of material is reviewed to find the part of it that has relevance and value to the case at hand. And so, if scale does eventually make manual/human review or appraisal (as currently resourced) all but impossible, are archivists prepared to give it up? Are they prepared to move to a position where such appraisal becomes a fully automated process, one which they are happy to confine to a black box beyond their (or anyone else’s) direct observation and control?

This scenario has been the subject of more recent experimentation by The National Archives as part of the Using AI for Digital Selection in Government project. This project sought to assemble a range of ‘machines’ and to teach them, using examples, to appraise TNA’s own corporate records as being either of long-term value or not of long-term value. One learning machine was assembled in-house and five others were procured from a range of external suppliers and consultants. Once taught, these machines used what they had learnt to appraise previously unseen records. The appraisals they made were then compared with those made by TNA’s human records managers, and in many cases the same appraisal was reached.
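The comparison between machine and human appraisals can be pictured with a small sketch. This is an illustration only: how the project actually measured agreement is not described here, the labels `keep` and `discard` are invented stand-ins for ‘of long-term value’ and ‘not of long-term value’, and the function names are hypothetical.

```python
from collections import Counter


def agreement_rate(human: list[str], machine: list[str]) -> float:
    """Share of records on which the machine reached the same appraisal as the human."""
    if not human or len(human) != len(machine):
        raise ValueError("both lists must be non-empty and of equal length")
    matches = sum(h == m for h, m in zip(human, machine))
    return matches / len(human)


def appraisal_pairs(human: list[str], machine: list[str]) -> Counter:
    """Count each (human decision, machine decision) pair to show where they differ."""
    if len(human) != len(machine):
        raise ValueError("both lists must be of equal length")
    return Counter(zip(human, machine))


# Invented example data: eight records appraised by a human and by a machine.
human = ["keep", "keep", "discard", "discard", "keep", "discard", "discard", "discard"]
machine = ["keep", "discard", "discard", "discard", "keep", "discard", "keep", "discard"]

print(agreement_rate(human, machine))  # 0.75
print(appraisal_pairs(human, machine)[("keep", "discard")])  # 1
```

The pair counts matter as much as the headline rate: a record the human would keep but the machine would discard is arguably a more serious disagreement than the reverse, and a single percentage hides that difference.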

The future of archival appraisal?

What does this mean, though, and where do we go from here? Does it mean that the task of appraisal can be undertaken by classifiers built using machine learning techniques? To some extent, yes. Does it mean archivists will now delegate the task of appraisal to automated systems? Almost certainly, no. Indeed, is the real question here whether archivists will ever be able to consider appraisal as just a task to be automated, rather than a responsibility to be borne? What will it take for them to be willing to relinquish that responsibility to others, however humanly or artificially intelligent they may be?

© Crown copyright, 2023. Re-used under the terms of the Open Government Licence v3.0

Jenny Bunn

Jenny Bunn has over 25 years’ experience as an archival practitioner, educator and researcher. Her interests have always lain at the intersection of archives and technology. She is Head of Archives Research at The National Archives, UK.

Bibliographical references

  • Gilliland-Swetland, Anne, Development of an Expert Assistant for Archival Appraisal of Electronic Communications: An Exploratory Study. PhD dissertation, University of Michigan, 1995.
  • Sloyan, Victoria, “Born-digital archives at the Wellcome Library: appraisal and sensitivity review of two hard drives”, Archives and Records, 37(1), 2016, pp. 20-36.
  • The National Archives, “The application of technology-assisted review to born-digital records transfer, Inquiries and beyond: Research report” [online], 2016, https://cdn.nationalarchives.gov.uk (accessed 31.03.2023).

Abstract

Artificial Intelligence (AI) is a hot topic at the moment, but my instinct is to see it less as something radically new and different and more as just another episode in a long-running saga. Archivists have always had to grapple with the implications for the record of new advances in information technology and, as part of this, they have always sought to experiment (both in thought and in practice) with how best to tailor these advances to support their recordkeeping purpose: how best to use the available technology to assist them in their task. When it comes to technology-assisted appraisal and selection, then, where has our experimentation led?