This page offers recommendations and guidelines for those undertaking similar recovery projects.
Stages of Digital Resource Creation Process
Epistemological and Methodological Considerations
- Collect datasets and ancillary documents and, if possible, speak to members of the original project team
- Decide on the purpose of the project. It could be to (re)distribute the data, (re)create a resource for the historical record, or (re)present the data in a new way or to a new audience
- The purpose directly impacts the methodology that you will use. You could choose from, or combine, any of the following: Redistribution, Repurposing, Restoration or Emulation
- Consider how you wish to present the data, and let the purpose of your recovery project dictate the tool you select, rather than allowing the tool you select to dictate how your data can be represented. Your options could include: a decision not to interpret the data and simply present it as is; a repackaging of the dataset in a different format; some level of interpretation using knowledge representation tools such as maps and timelines; or an enhanced dataset, using encoding or Linked Open Data structures to add value.
Tool Selection Considerations
Standards, Open and FAIR
Regardless of how your data is presented, always ensure that your tools and technologies adhere to the FAIR Principles and Open Access requirements. This is no longer negotiable for research outputs: it is now a requirement of all public research funding. Ensuring that your digital resource is OAI compliant means that other research repositories and data aggregators can ‘see’ and harvest its records without any human intervention, something made possible by structured metadata and the associated protocols.
Further information on FAIR can be found at: https://www.go-fair.org/fair-principles/
Further information on OAI can be found at: https://www.openarchives.org/pmh/
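As a rough illustration of what ‘without any human intervention’ means in practice, the sketch below builds the kind of request URL an aggregator might send to an OAI-PMH endpoint. The base URL and the set name are hypothetical placeholders; `ListRecords`, `metadataPrefix` and `oai_dc` are part of the OAI-PMH protocol itself.

```python
from typing import Optional
from urllib.parse import urlencode

# Hypothetical endpoint: substitute your own repository's OAI-PMH base URL.
BASE_URL = "https://repository.example.org/oai"


def list_records_url(base_url: str,
                     metadata_prefix: str = "oai_dc",
                     set_spec: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords request URL.

    'oai_dc' asks for records in unqualified Dublin Core, the one
    metadata format every OAI-PMH repository is required to support.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # Optional: restrict the harvest to one named set of records.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


print(list_records_url(BASE_URL))
```

A harvester sends a request like this on a schedule and reads the XML response, so the quality of your structured metadata determines how well your resource appears elsewhere.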
Metadata
Metadata is not as complicated as it sounds. Often described as ‘data about data’, it is simply descriptive information about an object - examples of metadata for a book might include the title, the date and place of publication and the author name. A metadata standard is an agreed set of such terms, which allows for different objects to be described in a similar way. There are many different metadata standards but one of the simplest, most flexible and most commonly used is called Dublin Core. It has only 15 core terms, and you don’t need to use them all when describing an object. Every librarian in your organisation will understand what metadata is, and they are more often than not happy to explain how to use it.
More detailed information about Dublin Core can be found at: https://www.dublincore.org/
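To show how little is needed, here is a minimal sketch that describes a book using four of the 15 Dublin Core elements and serialises the description as XML. The book details and the plain `record` wrapper element are invented for illustration; the element names and the namespace URI come from the Dublin Core element set.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}


def dublin_core_record(fields: dict) -> str:
    """Serialise a {term: value} mapping as a simple Dublin Core XML record."""
    record = ET.Element("record")  # illustrative wrapper, not part of the standard
    for term, value in fields.items():
        if term not in DC_ELEMENTS:
            raise ValueError(f"'{term}' is not a Dublin Core element")
        element = ET.SubElement(record, f"{{{DC_NS}}}{term}")
        element.text = value
    return ET.tostring(record, encoding="unicode")


# Invented example values; only some of the 15 elements are used.
example = dublin_core_record({
    "title": "An Example Book",
    "creator": "A. N. Author",
    "date": "1905",
    "publisher": "Example Press",
})
print(example)
```

The check against `DC_ELEMENTS` is a design choice for this sketch: it catches misspelt or invented terms early, before they reach a repository.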
Begin by making a project lifecycle plan
Involve all stakeholders from the beginning. Ensure meaningful engagement and agree a shared vision. Then write it down. Decide who is responsible for the different aspects of the project and ensure buy-in from the beginning. Consider what support the digital resource might require after the project ends. Create a data management plan and identify stewardship requirements. There are many organisations offering free tools and templates for the creation of both project and data management plans.
For examples see the Digital Curation Centre at: https://www.dcc.ac.uk/resources/data-management-plans/guidance-examples
King’s Digital Lab has a GitHub resource containing many useful templates. It can be accessed at: https://github.com/kingsdigitallab/sdlc-for-rse/wiki.
Start small
Always start with a small pilot dataset. Make this only as large as required to cover the main variants in your legacy dataset. The time spent finessing protocols and mappings at this stage will be repaid many times over.
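One possible way to assemble such a pilot is to take a small number of records for each variant found in the legacy data. The sketch below assumes records are held as dictionaries and that a `type` field distinguishes the variants; both are assumptions for illustration, not features of any particular dataset.

```python
from typing import Callable, Hashable, Iterable, List


def pilot_sample(records: Iterable[dict],
                 variant_key: Callable[[dict], Hashable],
                 per_variant: int = 2) -> List[dict]:
    """Return up to `per_variant` records for each distinct variant,
    keeping the records in their original order."""
    counts = {}
    sample = []
    for record in records:
        variant = variant_key(record)
        if counts.get(variant, 0) < per_variant:
            sample.append(record)
            counts[variant] = counts.get(variant, 0) + 1
    return sample


# Hypothetical legacy records with a 'type' field marking the variant.
legacy = [
    {"id": 1, "type": "letter"},
    {"id": 2, "type": "letter"},
    {"id": 3, "type": "letter"},
    {"id": 4, "type": "diary"},
]
pilot = pilot_sample(legacy, lambda r: r["type"], per_variant=2)
print([r["id"] for r in pilot])
```

In a real project the variant key is more likely to combine several properties, such as record type, date format and language.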
Document design decisions
If you’re taking a pre-existing dataset and changing its format or structure, whether by adding or removing content, document the process and the reasons behind it. A simple list of design decisions is invaluable to anyone building on your work.
Create a data dictionary
Create a simple document, analogous to a glossary, that explains all of the terms you use with your dataset. People reviewing your material 20 years hence may not fully appreciate what you meant by ‘author_role_affiliate’, so provide a brief explanation and an example for each term.
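A data dictionary can be as plain as a three-column CSV file. The sketch below writes one and flags any term that lacks an explanation or an example. The column names are one possible layout, and the definition and example given for ‘author_role_affiliate’ are invented placeholders rather than the meaning used in any real dataset.

```python
import csv
import io
from typing import List

FIELDNAMES = ["term", "definition", "example"]

# Placeholder entry: the definition and example are hypothetical.
ENTRIES = [
    {
        "term": "author_role_affiliate",
        "definition": "Organisation the author was attached to, in a given role",
        "example": "Example Literary Society",
    },
]


def incomplete_terms(entries: List[dict]) -> List[str]:
    """List the terms that are missing a definition or an example."""
    return [
        entry["term"]
        for entry in entries
        if not entry.get("definition") or not entry.get("example")
    ]


def data_dictionary_csv(entries: List[dict]) -> str:
    """Return the data dictionary as CSV text, one row per term."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


print(incomplete_terms(ENTRIES))
print(data_dictionary_csv(ENTRIES))
```

CSV is used here because it opens in any spreadsheet or text editor, which suits a document meant to be read decades later.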
Create a project summary document
Use simple terms to provide an explanatory document to accompany your resource. Detail the scope and limitations of your resource, describe the methodologies, tools and standards that you have used. Include a reference to an archived version of the resource. Provide contact details for project members.
Dissemination and Preservation
Avoid broken links
Of the twelve digital resources listed on the munsterwomen.ie website in 2013, only two still have valid URLs. To avoid this happening to your digital resource, create an archived version at the end of your project and, once the live site ceases to be maintained, link to the archived version rather than the live site when referencing your own work. Do this using the Internet Archive’s free Wayback Machine, which can be found at: https://archive.org/web/web.php
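Once a snapshot exists, you can check for it programmatically. The sketch below builds a query for the Wayback Machine’s availability API and reads the closest snapshot out of a response. No network request is made here; the response shown is a made-up sample in the shape the API returns, and `example.org` stands in for your own site.

```python
from typing import Optional
from urllib.parse import urlencode

AVAILABILITY_API = "https://archive.org/wayback/available"


def availability_query(url: str, timestamp: Optional[str] = None) -> str:
    """Build a Wayback Machine availability query for `url`.

    `timestamp` (YYYYMMDD) optionally asks for the snapshot closest
    to that date instead of the most recent one.
    """
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_API}?{urlencode(params)}"


def closest_snapshot(response: dict) -> Optional[str]:
    """Return the archived URL from a parsed JSON response, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


# Made-up sample response, for illustration only.
sample_response = {
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20130101000000/http://example.org/",
            "timestamp": "20130101000000",
            "status": "200",
        }
    }
}

print(availability_query("example.org", timestamp="20130101"))
print(closest_snapshot(sample_response))
```

A response with an empty `archived_snapshots` object means no snapshot exists yet, which is a prompt to create one before the project closes.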
Create a DOI
For all digital outputs, consider assigning a DOI. A Digital Object Identifier (DOI) is a unique, persistent identifier for a digital object, and the DOI system is defined by an ISO standard (ISO 26324). The creation of DOIs is controlled and managed on a not-for-profit basis by a number of registration agencies. Zenodo creates a DOI automatically when material is deposited in its repository, and offers DOI versioning, so you can publish updated files as a new version after a DOI has been assigned.
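When citing a DOI, it is usually best to give it as a full resolver link. The sketch below normalises the common ways a DOI gets written down into an `https://doi.org/` URL. The pattern is a widely used heuristic rather than the formal syntax in the standard, and the DOI shown is a placeholder, not a real deposit.

```python
import re

# Heuristic check: '10.', a registrant code, a slash, then a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

PREFIXES = ("https://doi.org/", "http://doi.org/", "doi:")


def doi_to_url(doi: str) -> str:
    """Turn a bare or prefixed DOI into a resolvable https://doi.org/ link."""
    cleaned = doi.strip()
    for prefix in PREFIXES:
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    if not DOI_PATTERN.match(cleaned):
        raise ValueError(f"Does not look like a DOI: {doi!r}")
    return "https://doi.org/" + cleaned


# Placeholder DOI for illustration only.
print(doi_to_url("doi:10.5281/zenodo.0000000"))
```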
Options for Dissemination and Preservation
Zenodo is an offering of the EU Open Science initiative, operated by CERN and OpenAIRE. Uploaded research is given a DOI and stored by CERN, an organisation with long-standing funding support. It also offers GitHub integration. It accepts all file types, and all research outputs. Content can be made publicly accessible, remain hidden, or be restricted to specific groups. There are no costs associated with depositing material in Zenodo. More information can be found at: https://zenodo.org/
The Internet Archive is a non-profit digital library. Use the Internet Archive’s Wayback Machine to create a digital snapshot of your website. This doesn’t require a user account. You can also create a free account and upload content which you have the right to share. It accepts all file types, and all research outputs. All content is publicly accessible. There are no costs associated with depositing material in the Internet Archive. More information can be found at: https://archive.org/
Wikimedia is a non-profit foundation that provides a number of tools and platforms to make data widely available and freely accessible. It is a community-driven project which allows anyone to upload and edit information, and for that reason it is not often considered an appropriate method of scholarly dissemination. However, its ubiquity and user-friendly platform more than compensate for its occasionally unreliable information, and it should not be overlooked as a useful means of communicating information to the wider public. More information can be found at: https://www.wikimedia.org/
GitHub has traditionally been associated with software projects. However, it is at its core a digital repository which accepts all file types, and all research outputs. Users can create wikis and provide all manner of explanatory documentation. The content can be confined to specified user(s), or made publicly accessible. GitHub integrates with Zenodo. There are no costs associated with creating your own GitHub repository or depositing content. More information can be found at: https://github.com/
An additional note on Irish websites
Legislation does not currently support the National Library of Ireland in collecting born-digital content, and there is no .ie domain crawl. Contact the NLI and ask them to harvest your site into the Irish Domain Web Archive, which they manage. More information can be found at: https://www.nli.ie/en/irish-domain-web-archive.aspx or by contacting webarchives@nli.ie.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.