What is open data and why is it important?
What is open data?
Open data is data that is available for everyone to access, use and share. It is generally published by governments on freely accessible portals and might include information about local areas, or statistics on topics such as the economy, health, and the environment.
Open data is also a movement. Supporters of open data believe that some kinds of data should be publicly accessible for anyone to make use of. It's a viewpoint that has accompanied the explosion of data in today's digital world.
With modern technology collecting more, and better-quality, data on our world than ever before, it makes sense to open this up as much as possible for social, economic, public and institutional gain.
A history of open data in the UK
Open data was popularised by the Obama administration in the late 2000s. At this time, city authorities across the world began to look into publishing, in a single place online, any data they held that could be made open (i.e. that wasn’t personally or commercially sensitive).
The Greater London Authority, for example, created the London Datastore in 2010 as a repository for its data, along with that of other public sector organisations operating in the city. It now has 118 different data publishers, ranging from Transport for London to the London Fire Brigade.
Open data platforms such as this give a wealth of information to any interested parties, from individuals to private businesses, about things like the demographic makeup of boroughs, transport use, and institutional spending.
This helps to provide transparency to the work of authorities in the public sector, opening it up to the media for scrutiny and investigation. It also encourages innovation. Open access to data is extremely important for researchers and innovators looking first to understand, and then to build solutions to, the problems people face.
Why is data quality important in open data?
The Open Data Institute (ODI), co-founded by World Wide Web inventor Sir Tim Berners-Lee in 2012, is just one of the many organisations pushing the open data agenda. According to the ODI, open data is defined as 'data that is available for everyone to access, use and share.' But that's not all.
Open data should also be easy to access, structured, stored in a non-proprietary open format, clearly labelled, and linked to other data for context. Fulfilling all of these criteria represents the ideal for open data according to Tim Berners-Lee's 5-Star Open Data plan, a ranking system designed to improve the quality and usability of data.
Data quality and usability are an extremely important part of the open data conversation. As is becoming ever more widely accepted, more data doesn’t always equal better outcomes. In fact, collecting and storing large amounts of data can cause severe problems, so it’s crucial to follow rigorous governance procedures and data standards.
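To make the ranking idea concrete, here is a minimal Python sketch of how a dataset might be scored against the five criteria listed above. It assumes the stars are cumulative, so that each star counts only if every earlier criterion is also met. The `Dataset` class, its field names and the `star_rating` function are hypothetical, invented for illustration rather than taken from any official tool.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    # Hypothetical fields, one per criterion named in the text above.
    openly_accessible: bool  # easy to access online
    structured: bool         # structured rather than, say, a scanned image
    open_format: bool        # stored in a non-proprietary open format
    clearly_labelled: bool   # items in the data are clearly identified
    linked: bool             # linked to other data for context


def star_rating(ds: Dataset) -> int:
    """Return 0-5 stars, assuming each star requires all earlier ones."""
    criteria = [
        ds.openly_accessible,
        ds.structured,
        ds.open_format,
        ds.clearly_labelled,
        ds.linked,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break  # stop at the first unmet criterion
        stars += 1
    return stars


# Illustrative examples only: a scanned report versus an open-format release.
scanned_report = Dataset(True, False, False, False, False)
open_format_release = Dataset(True, True, True, False, False)

print(star_rating(scanned_report))       # 1
print(star_rating(open_format_release))  # 3
```

Under this assumption, a dataset that is linked to other data but not openly accessible would still score zero, which reflects the idea that accessibility is the foundation the other criteria build on.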
What is public data?
The terms open and public data are often found together, and whilst they might sound similar, they are actually very different.
As we saw above, there are certain standards associated with open data that make it easy to both access and use. This isn’t the case with public data, which simply means any data that is in the public domain. Public data, therefore, includes datasets and documents that can only be accessed with a freedom of information request, as well as data that isn’t in a machine-readable format.
Public data is definitely not always open data.
The open data agenda
Advocates for open data continue to push for more government data to be made accessible, as fewer than one in five government datasets worldwide is open to the public.
Initiatives such as the Open Data Barometer, which tracks these trends, want to see progress in three areas: making government data open by default; better data infrastructure and data management practices; and working with stakeholders to solve more challenges with open data.
We will also continue to see debate about data ownership and privacy increase over the next few years, as larger scale data projects are pursued by governments in an effort to link up and improve public sector services.
Building trust is key to the effective use of data in the public sector, with a 2019 ODI survey finding only 31% of people trusted government to use their data ethically. This is something that public bodies must address for society to reap more of the benefits of data. It's certainly on the agenda in government, with the National Data Strategy, for example, aiming to empower people to control how their data is used, but also 'to strengthen the existing understanding that aggregated data about people — used responsibly and fairly — can have public benefits for all.' Tied to this is the important task of improving transparency over the use of algorithms in big data, with the Data Strategy again promising to work closely with the Centre for Data Ethics and Innovation to develop the right kind of governance for these technologies.