What is FAIR data?
FAIR data is an approach that is increasingly used in government, academia and the pharmaceutical industry. The FAIR data principles are a set of aspirations for making data more available to others, either within your own organisation or to the world at large. The principles aim to make data:
- Findable: by providing rich metadata to describe data and enable others to discover the information;
- Accessible: by describing where data can be downloaded and any authentication and authorisation required to access it;
- Interoperable: by describing the modelling within data and linking to other data;
- Reusable: by declaring the licensing, describing the provenance, and following community practice defined within specific domains.
A key idea behind FAIR is that data and metadata should be available for both human and machine processing because this opens up more opportunities for data use. Metadata for a dataset should also be kept, even when a dataset is no longer available, for anyone looking to understand the context of historic data.
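As a rough illustration of the ideas above, here is a minimal Python sketch of a metadata record with fields grouped under the four principles. The class name, field names and helper functions are all hypothetical and chosen for this example; the FAIR principles do not prescribe them. The sketch shows one record rendered for machines (JSON) and for people (a short summary), and how the metadata can be kept after the data itself is withdrawn.

```python
import json
from dataclasses import asdict, dataclass, field, replace
from typing import List, Optional


@dataclass
class DatasetMetadata:
    """Hypothetical metadata record; field names are illustrative only."""

    # Findable: an identifier plus a rich description and keywords.
    identifier: str
    title: str
    description: str
    keywords: List[str] = field(default_factory=list)
    # Accessible: where to download the data and what access control applies.
    download_url: Optional[str] = None
    access_rights: str = "public"
    # Interoperable: the model the data follows and links to other data.
    conforms_to: Optional[str] = None
    related_datasets: List[str] = field(default_factory=list)
    # Reusable: licensing and provenance.
    licence: Optional[str] = None
    provenance: Optional[str] = None
    # Set to False once the data itself is no longer available.
    available: bool = True


def to_machine_readable(record: DatasetMetadata) -> str:
    """Serialise the record as JSON for machine processing."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


def to_human_readable(record: DatasetMetadata) -> str:
    """Render a one-line summary for people browsing a catalogue."""
    status = "available" if record.available else "withdrawn (metadata retained)"
    return f"{record.title} [{record.identifier}]: {record.description} Status: {status}."


def withdraw(record: DatasetMetadata) -> DatasetMetadata:
    """Return a copy marking the data as withdrawn while keeping its metadata."""
    return replace(record, download_url=None, available=False)
```

A record created with `DatasetMetadata(identifier=..., title=..., description=...)` can be passed to either rendering function, and `withdraw` leaves the descriptive fields in place so the context of historic data is still discoverable.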
Why should you care about it?
Data collection is everywhere. Government departments, pharmaceutical companies and research organisations may spring to mind as obvious groups using data, but even the smallest business, charity or local public service will have its own data assets. Working out what information is available and where it’s stored can be like finding a needle in a haystack because there’s so much data, held in so many places. Finding relevant data often relies on word-of-mouth conversations with colleagues, which limits what can be found and is not sustainable over time. There’s also potential for the same data to be captured multiple times, which can come at great expense. For example, in this cost-benefit analysis for FAIR research data, it’s estimated that €10.2 billion per year is lost to research funding providers by not making European research data FAIR.
FAIR data benefits
The FAIR data principles increase the time and effort required of data publishers because they recommend providing rich descriptions of information in both human-readable and machine-readable formats. However, this extra effort brings significant benefits. Here are some of the key payoffs of adopting FAIR data as an approach:
- Enabling more people to discover and use data.
- Opening up machine access to data, which supports consumers who want to use it for innovative applications.
- Within the context of Open Government data, providing access to a greater number of datasets. This supports both scrutiny and discourse around government decisions, as well as providing data to inform future decision making.
- Laying the foundation for Reproducible Analytical Pipelines, because the principles recommend that each dataset has a globally unique, persistent identifier, as well as a clearly defined structure and suitability for machine processing.
- Emphasising reusability, saving time for both data publishers and consumers.
- Making metadata more available to give data consumers context for the information.
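To make the discovery and identifier points above more concrete, here is a small Python sketch of an in-memory catalogue. It is an assumption-laden toy rather than a description of any real data portal: the `Catalogue` class and its methods are hypothetical. It shows two ideas from the list: each dataset is registered under one unique identifier, and other people can find datasets by searching the metadata instead of asking colleagues.

```python
class Catalogue:
    """Hypothetical in-memory catalogue keyed by dataset identifier."""

    def __init__(self):
        self._records = {}

    def register(self, identifier, title, keywords):
        """Add a dataset, refusing to reuse an identifier that is already taken."""
        if identifier in self._records:
            raise ValueError(f"identifier already registered: {identifier}")
        self._records[identifier] = {
            "title": title,
            "keywords": {keyword.lower() for keyword in keywords},
        }

    def resolve(self, identifier):
        """Look up a dataset's metadata by its identifier."""
        return self._records[identifier]

    def search(self, term):
        """Return identifiers whose title or keywords mention the term."""
        term = term.lower()
        return sorted(
            identifier
            for identifier, record in self._records.items()
            if term in record["keywords"] or term in record["title"].lower()
        )
```

In a real deployment the identifier would be something that stays stable over time, such as a DOI or a URI the organisation commits to maintaining; the sketch only enforces uniqueness.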
Who’s using FAIR data?
Over the years, many data portals have been created to help make data FAIR. Within the public sector, these include specialist portals such as Scotland's official statistics, which has been driving up the provision of high-quality metadata alongside datasets so that their content can be interpreted more easily. Similarly, pharmaceutical companies have been early adopters of FAIR data, incorporating it internally, which has led to increased knowledge sharing within companies and faster drug discovery.
The UK Government’s 2022-25 Roadmap for Digital and Data includes a mission on “Better data to power decision making”. As part of this mission, there’s a Cross Government Data Marketplace, which exists to promote the reuse of internal government data across departments. It requires that data resources are findable by members of other departments, and early discovery and prototype work on this is already underway.
Common misconceptions
- Data doesn’t have to be openly available to adopt the FAIR data principles. A FAIR data approach can be applied within organisations for internal data or for information that is sensitive in its nature. Such datasets can still benefit from being richly described to enable others with suitable access rights to be able to find, access, interpret, and reuse the data.
- The principles are not a checklist (set of metrics) against which you can compare published data, although there are attempts to define metrics to enable data publishers to measure how FAIR their digital information is (Metrics for FAIRness).
- The principles do not prescribe a set technology or methodology for making data FAIR. There are different ways organisations can do this, such as through Linked Data.
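As one possible illustration of the Linked Data route mentioned above, the Python sketch below builds a JSON-LD description of a dataset using the schema.org `Dataset` and `DataDownload` types. The function name and its parameters are hypothetical, and the example.org addresses in the usage are placeholders. This is only one of several vocabularies an organisation might choose, not a recommended or required one.

```python
import json


def dataset_as_jsonld(identifier, name, description, licence_url, download_url, media_type):
    """Build a schema.org Dataset description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "@id": identifier,
        "identifier": identifier,
        "name": name,
        "description": description,
        # schema.org spells this property "license".
        "license": licence_url,
        "distribution": {
            "@type": "DataDownload",
            "contentUrl": download_url,
            "encodingFormat": media_type,
        },
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    document = dataset_as_jsonld(
        identifier="https://example.org/id/dataset/1",
        name="Example dataset",
        description="A placeholder dataset used to show the structure.",
        licence_url="https://creativecommons.org/licenses/by/4.0/",
        download_url="https://example.org/data/1.csv",
        media_type="text/csv",
    )
    print(json.dumps(document, indent=2))
```

Because the description uses shared, web-addressable terms, other systems can interpret it without needing to know how the publishing organisation stores its metadata internally.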
Further reading
Below are some links to useful resources for finding out more about the FAIR data principles and how to incorporate them into your data management.
- GO FAIR is an organisation that promotes the principles and fosters networks for understanding FAIR within specific domains. This page states each of the principles and its criteria.
- The Turing Way handbook is a guide for researchers and data scientists to ensure that they perform reproducible and ethical analysis. They have an overview article on the FAIR data principles and how to apply them to provide reproducible analyses.
- The FAIR Cookbook is an online guide with easy-to-follow recipes to help you understand how to apply the principles to your own data. This article gives their introduction to the FAIR Data Principles.
- The FAIR Data image used above was created by Sangya Pundir, CC BY-SA 4.0 via Wikimedia Commons. Licence information here.