The Concept of Digital Record according to InterPARES
InterPARES (1998-2026) is an international collaborative research project funded by the Social Sciences and Humanities Research Council of Canada whose goal is to ensure the long-term preservation of authentic digital records across technologies.
When the InterPARES project began, it was clear that the researchers, coming from different disciplines and cultural contexts, needed to agree on basic concepts. The key concept for the project was that of digital record. We decided to maintain the classic archival definition: a record is a document made or received by a physical or juridical person in the course of activity as an instrument and byproduct of it, and kept for action or reference by such person or its legitimate successor. As records form the infrastructure through which beliefs and values are upheld and understood and societal institutions supported, we had to identify their characteristics in the digital environment.
Born-digital records are vulnerable (easily destroyed, lost, corrupted, or tampered with, and prone to becoming inaccessible) and persistent (forever there, if not purposefully destroyed). Their content, structure, and form are no longer inextricably linked. They are made of entities stored in the system in which they reside, and most of them also have a documentary manifestation that allows them to be understandable by humans. The “stored record” is constituted of the digital component(s) used in re-producing it, which comprise the data to be processed in order to manifest the record (content data and form data) and the rules for processing the data, including those enabling variations (composition data). The “manifested record” is the visualization or instantiation of the record in a form suitable for presentation to a person or a system. Sometimes, the manifested record does not have a corresponding stored record, but is re-created from fixed content data when a user’s action associates them with specific form data and composition data (e.g. a record produced from a relational database). Every time we close the manifestation of a digital record we destroy it, and when we open it again, we are generating a copy. It is not possible to preserve a digital record. We can only preserve our ability to reproduce or recreate it.
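The distinction between stored and manifested record can be illustrated with a minimal sketch. It is not an InterPARES implementation: the class `StoredRecord`, the function `manifest`, the rule name `"one-line-per-item"`, and the sample content are all hypothetical, chosen only to show content data, form data, and composition data being combined each time the record is opened.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredRecord:
    """Digital components kept in the system (field names are illustrative)."""
    content_data: tuple   # the data that carry the message
    form_data: str        # how each piece of content is to be presented
    composition_data: str # rule for processing content and form together


def manifest(stored: StoredRecord) -> list:
    """Re-produce a human-readable manifestation from the stored components."""
    # Assumption: a single, made-up composition rule is supported.
    if stored.composition_data != "one-line-per-item":
        raise ValueError("unknown composition rule")
    return [stored.form_data.format(item=item) for item in stored.content_data]


stored = StoredRecord(
    content_data=("Minutes approved", "Budget tabled"),
    form_data="* {item}",
    composition_data="one-line-per-item",
)

first_view = manifest(stored)    # "opening" the record
second_view = manifest(stored)   # "opening" it again after closing it

# The two manifestations present the same content in the same form...
same_presentation = first_view == second_view
# ...but the second one is a newly generated copy, not the first object.
same_object = first_view is second_view
```

In this sketch what persists is `stored` together with the ability to run `manifest`; each manifestation is discarded and regenerated, which mirrors the point that only the ability to reproduce the record can be preserved.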
We know that all records must have fixed form and stable content. What does this mean in the digital environment? InterPARES has established that an entity has fixed form if its binary content is stored so that the message it conveys can be rendered with the same documentary presentation it had on the screen when first saved (even if its digital presentation changes, say, from Word to PDF). An entity also has fixed form if the same content can be presented on the screen in several different ways within a limited series of possibilities: we would have different documentary presentations of the same stored record having stable content and fixed form (e.g. statistical data viewed as a pie chart, a bar chart, or a table).
An entity has stable content if the data are unchanged and unchangeable, meaning that they cannot be overwritten, altered, deleted or added to. InterPARES also introduced the concept of “bounded variability”, which is present when changes to the documentary presentation of a given stable content are limited and controlled by fixed rules, so that the same query, algorithm, or interaction always generates the same result. Bounded variability also exists when we have different views of different subsets of content, due to the intention of the author or to different operating systems or applications.
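As a rough illustration of bounded variability, the sketch below keeps one read-only set of data and allows only a fixed, closed set of presentations of it. The names `CONTENT`, `ALLOWED_VIEWS`, `as_table`, `as_bar_chart`, and `render`, as well as the sample figures, are assumptions made for this example and do not come from InterPARES.

```python
from types import MappingProxyType

# Stable content: a read-only view of the data (illustrative figures).
CONTENT = MappingProxyType({"approved": 3, "rejected": 1})


def as_table(data):
    """Present the content as 'label: value' rows."""
    return [f"{key}: {value}" for key, value in data.items()]


def as_bar_chart(data):
    """Present the same content as a text bar chart."""
    return [f"{key} {'#' * value}" for key, value in data.items()]


# Fixed rules: the only documentary presentations that are permitted.
ALLOWED_VIEWS = MappingProxyType({"table": as_table, "bar": as_bar_chart})


def render(view_name):
    """Return the requested presentation, refusing anything outside the bounds."""
    if view_name not in ALLOWED_VIEWS:
        raise ValueError(f"presentation not permitted: {view_name}")
    return ALLOWED_VIEWS[view_name](CONTENT)


table_view = render("table")
bar_view = render("bar")
repeatable = render("table") == table_view  # same request, same result
```

The presentation varies between `table_view` and `bar_view`, but the variation is limited to the listed views, and an attempt to write to `CONTENT` raises a `TypeError` because the mapping is read-only.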
There are two types of digital records: static and interactive. Static records do not provide possibilities for changing their manifest content or form beyond opening, closing and navigating (e.g. email, reports, sound recordings, motion video, snapshots of web pages). Interactive records present variable content, form, or both, and the rules governing the content and form of the presentation may be either fixed or variable.
Interactive entities may be non-dynamic or dynamic. We have non-dynamic entities when the rules governing the presentation of content and form do not vary, and the content presented each time is selected from a fixed store of data (e.g. interactive web pages, online catalogs or inventories, records enabling performances): they are records. We have dynamic entities when the rules governing the presentation of content and form may vary: they are either information systems or potential records.
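The distinctions drawn so far can be summarized as a small decision procedure. This is only one reading of the text, with hypothetical names (`Kind`, `Entity`, `classify`); in particular, treating an entity whose data store is not fixed as dynamic is an assumption based on the requirement that non-dynamic entities select content from a fixed store of data.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    STATIC_RECORD = "static record"
    NON_DYNAMIC_RECORD = "interactive, non-dynamic record"
    DYNAMIC_ENTITY = "dynamic entity (information system or potential record)"


@dataclass(frozen=True)
class Entity:
    """Illustrative description of a digital entity."""
    presentation_varies: bool  # can manifest content or form change in use?
    rules_fixed: bool          # are the presentation rules invariable?
    data_store_fixed: bool     # is content selected from a fixed store of data?


def classify(entity: Entity) -> Kind:
    """Apply the static / non-dynamic / dynamic distinctions in order."""
    if not entity.presentation_varies:
        return Kind.STATIC_RECORD
    if entity.rules_fixed and entity.data_store_fixed:
        return Kind.NON_DYNAMIC_RECORD
    # Assumption: variable rules or a changing data store both count as dynamic.
    return Kind.DYNAMIC_ENTITY


email = Entity(presentation_varies=False, rules_fixed=True, data_store_fixed=True)
online_catalog = Entity(presentation_varies=True, rules_fixed=True, data_store_fixed=True)
live_register = Entity(presentation_varies=True, rules_fixed=True, data_store_fixed=False)

kinds = [classify(e) for e in (email, online_catalog, live_register)]
```

Whether a dynamic entity is an information system or a potential record is not something this sketch decides; that depends on how the data are relied upon by their users.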
Potential records are found in systems where the variation is due to data that change frequently, because the design permits updating, replacement or alteration (e.g. a student register); or that allow data collection from users or about user interactions or actions (e.g. a faculty self-service portal); or that use the data input by users to determine subsequent presentations (e.g. a land registry). They are also found in systems where variation is due to data received from multiple external sources at different times and not stored within the system (e.g. GIS). Such systems are presently not records systems, but they should be made into records systems if the sets of data they produce and hold fulfill records functions and the users rely on them for action and need to maintain them. InterPARES has carried out several case studies in the arts, sciences and government aiming to ensure that such systems create and maintain records.
Issues about the authenticity of digital records and the ability to verify it through time, regardless of where the records are stored, have been addressed by all four completed phases of InterPARES.1 The fifth phase, which started in April 2021, focuses on the development of Artificial Intelligence tools based on archival concepts for the creation, maintenance, use, selection, intellectual control, and preservation of, and access to, trustworthy digital records.2 Thus, there is much work going on in our terminology database.3 Who knows what a record will be five years from now?
- 1 See for example: http://www.interpares.org/book.... As regards records kept in the cloud environment, see www.interparestrust.org.
- 2 See www.interparestrustai.org.
- 3 See https://interparestrustai.org/....