Summary: priority datasets
Table 4. Summary of priority datasets for building an anti-corruption data infrastructure
Dataset | Anticorruption category | Type of dataset
Interest declarations | Individuals and organizations | Register
Lobbying register | Individuals and organizations | Register
Company register | Individuals and organizations | Register
Charity register | Individuals and organizations | Register
Politically exposed people's list | Individuals and organizations | Register
Public officials register | Individuals and organizations | Register
List of government contractors | Individuals and organizations | Register
Corruption-sensitive postings | Individuals and organizations | Register
Council / advisory board members | Individuals and organizations | Register
Contracts register | Individuals and organizations | Register
Political parties finances | Public-related resources | Public disclosures
Budgets | Public-related resources | Public disclosures
Tender and award processes | Public-related resources | Transaction
Licenses, concessions and permits | Public-related resources | Transaction
PPPs | Public-related resources | Register
Spending | Public-related resources | Transaction
Government grants | Public-related resources | Transaction
International aid and financing | Public-related resources | Register
Audit data | Regulation, government procedures and records | Transaction
Voting records | Regulation, government procedures and records | Register
Court data | Regulation, government procedures and records | Transaction
Register of government projects | Regulation, government procedures and records | Register
Meeting records | Regulation, government procedures and records | Register
Records of changes in regulations | Regulation, government procedures and records | Register
Campaign promises | Regulation, government procedures and records | Register
Debarred or sanctioned contractors | Regulation, government procedures and records | Register
Public procurement complaints registers | Regulation, government procedures and records | Register
Land/Property register and cadastre (public land) | Rent Extraction | Register
Tax records | Rent Extraction | Register
Asset declarations | Rent Extraction | Register
For full data description see “Annex 1” or access: https://airtable.com/shrtE30MaSKb1sjko. The Anti-Corruption Open Up Guide
Availability challenges
In many countries, open data is still a very recent policy agenda. Since 2009, a growing number of countries, regions and institutions have launched open data portals, yet few of the datasets crucial to combatting corruption are open by default. The 3rd Edition of the Open Data Barometer measures the availability of open data across 92 countries and covers five key accountability datasets: corporate registers, government spending, land ownership, contracting and budget. On average, less than 10% of these datasets were available as open data (see Table 5).
Table 5. Selected statistics from the 3rd Edition of the Open Data Barometer
Dataset | Percentage as fully open data | Percentage available online | Comments
Corporate registers | 1% | 72% | Least open dataset in the world, with only Australia publishing it as open data, and then only very top-level data for free. Looking at all the datasets available, it is accessible online in a machine-readable format and for free in only a dozen countries. The absence of adequate open data on companies makes tracking beneficial ownership challenging and hampers efforts to tackle corruption.
Government spending | 2% | 4% | Weakest dataset in the study. Even in the limited cases when it is available online, the data is usually not published at the transactional level. Only four countries (the USA, the UK, Japan and Brazil) publish spending data at the transactional level, and of those only two, the UK and Brazil, release this information as open data. This makes it nearly impossible for government, citizens and civil society to tackle corruption.
Land ownership | 5% | 46% | Rarely available online, difficult to find when available and quite frequently behind paywalls.
Contracting | 8% | 82% | Only 28% of the data available online is in machine-readable formats, which reduces practical accountability because it makes analysing the high volume of historical data very difficult.
Budget | 18% | 97% | Comparatively one of the better datasets in terms of availability and openness. In 95% of countries where it was available, it was regularly updated. In several instances the law even requires it to be updated and available, although not necessarily open.
Source: World Wide Web Foundation. Elaborated for the Anti-Corruption Open Up Guide
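The "less than 10%" figure can be checked against the percentages in Table 5. The short sketch below assumes the average is a simple unweighted mean of the five "fully open" percentages; the Barometer itself may aggregate differently, and the function name is purely illustrative.

```python
# Percentage published as fully open data, copied from Table 5.
fully_open = {
    "Corporate registers": 1,
    "Government spending": 2,
    "Land ownership": 5,
    "Contracting": 8,
    "Budget": 18,
}


def mean_open_share(shares):
    """Return the unweighted mean of the percentage values (an assumption)."""
    return sum(shares.values()) / len(shares)


average = mean_open_share(fully_open)
print(f"Average share published as fully open data: {average:.1f}%")
```

Under that assumption the five figures average well below the 10% threshold mentioned in the text.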
Foundations of a solid anti-corruption data infrastructure
Joining up data and standards for anti-corruption
A solid anti-corruption data infrastructure can only be built when the relevant datasets can communicate with each other. The higher the number of connections, the better the chance of using the datasets to spot potential corruption red flags. Based on the priority datasets for building an anti-corruption data infrastructure (see table 4), a series of core data elements have been identified and have also been matched to available open data standards.
A data standard is a framework for how data should be collected and published, including how to describe individuals and organisations, how to register specific events or transactions, and how to organize data to meet minimum quality requirements. Using a standardised approach means that different datasets can talk to each other. Moreover, adherence to open data standards helps ensure that a larger number of users can benefit from the data available.
It is desirable that both governments and civil society review the existing availability of data and agree on a roadmap to disclose it as open data. At the same time, it is important to review how data is structured and assess whether it needs to be restructured to meet open data standards.
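The idea that standardised identifiers let datasets "talk to each other" can be illustrated with a minimal sketch. It joins a hypothetical list of government contractors with a hypothetical register of debarred contractors on a shared company identifier. All records, field names (`company_id`, `contract`, `reason`) and the `XX-REG` identifier scheme are invented for illustration and do not come from any real register or standard.

```python
def normalise_id(raw):
    """Reduce an identifier to upper-case letters and digits so that
    differently formatted copies of the same identifier compare equal."""
    return "".join(ch for ch in raw.upper() if ch.isalnum())


# Hypothetical extract from a list of government contractors.
contractors = [
    {"company_id": "XX-REG-0123", "name": "Acme Roads Ltd", "contract": "C-001"},
    {"company_id": "xx-reg 0456", "name": "Birch Supplies", "contract": "C-002"},
]

# Hypothetical extract from a register of debarred contractors.
debarred = [
    {"company_id": "XX-REG-0456", "reason": "bid rigging"},
]


def flag_debarred(contractors, debarred):
    """Return one red flag per contract held by a debarred company."""
    debarred_by_id = {normalise_id(d["company_id"]): d for d in debarred}
    flags = []
    for c in contractors:
        match = debarred_by_id.get(normalise_id(c["company_id"]))
        if match is not None:
            flags.append(
                {"contract": c["contract"], "name": c["name"], "reason": match["reason"]}
            )
    return flags


red_flags = flag_debarred(contractors, debarred)
for flag in red_flags:
    print(flag)
```

The join only works because both datasets carry the same identifier. Where registers describe companies by free-text name alone, this kind of matching becomes far less reliable, which is the practical case for agreeing on common standards.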
Table 6. Summary of priority data standards for building an anti-corruption data infrastructure
Standard | Description | Organisation
Open Contracting Data Standard | Data guidance for disclosing public procurement data in open formats about contracting processes, from the planning stage to the implementation stage. Extensions for other types of contracting, such as public-private partnerships and concessions, are under development. More information: http://standard.open-contracting.org/ | Open Contracting Partnership (CSO)
Fiscal Data Package | Schema for publishing and consuming fiscal data, especially data generated during the planning and execution of budgets. It supports data on expenditures and revenues. More information: http://specs.frictionlessdata.io/fiscal-data-package/ | Open Knowledge Foundation (CSO)
Popolo | Popolo is an initiative on open government data specifications. Its goal is to "define data interchange formats and data models so that organizations can spend less time transforming and modeling data and more time applying it to the problems they face". It allows standardization of data related to people, organizations, motions and voting, events and speeches, among others. More information: http://www.popoloproject.com/ |
Beneficial Ownership Data Standard | An open schema under development for collecting and publishing beneficial ownership data globally. It will enable users to register, in a standardized way, data about the ultimate beneficiary or owner of a certain good (such as land) or of an organization or entity (such as a company) across different countries. More information: http://openownership.org/get-involved/ | Open Ownership (Global Coalition)
OpenCorporates schema | Schema for publishing and consuming data on companies worldwide, including data on jurisdiction, incorporation date, shareholders and subsidiaries. It recently incorporated beneficial ownership data released by the UK Government. More information: https://github.com/openc/openc-schema | Open Corporates (Private firm)
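To give a sense of what structured contracting data makes possible, the sketch below totals award values per supplier from a handful of records. The field names (`ocid`, `awards`, `suppliers`, `value`, `amount`, `currency`) follow a simplified reading of the open contracting release structure and should be treated as assumptions to check against the published standard. The records, identifiers and amounts are invented.

```python
from collections import defaultdict

# Invented records loosely shaped like open contracting releases.
releases = [
    {
        "ocid": "ocds-xxxxxx-0001",
        "awards": [
            {
                "id": "a1",
                "suppliers": [{"id": "XX-REG-0123", "name": "Acme Roads Ltd"}],
                "value": {"amount": 120000, "currency": "USD"},
            }
        ],
    },
    {
        "ocid": "ocds-xxxxxx-0002",
        "awards": [
            {
                "id": "a2",
                "suppliers": [{"id": "XX-REG-0123", "name": "Acme Roads Ltd"}],
                "value": {"amount": 80000, "currency": "USD"},
            },
            {
                "id": "a3",
                "suppliers": [{"id": "XX-REG-0456", "name": "Birch Supplies"}],
                "value": {"amount": 15000, "currency": "USD"},
            },
        ],
    },
]


def total_awarded_by_supplier(releases):
    """Sum award amounts per (supplier id, currency).

    Simplification: an award shared by several suppliers is counted
    in full for each of them.
    """
    totals = defaultdict(float)
    for release in releases:
        for award in release.get("awards", []):
            value = award.get("value") or {}
            amount = value.get("amount")
            if amount is None:
                continue  # skip awards published without a value
            for supplier in award.get("suppliers", []):
                totals[(supplier["id"], value.get("currency"))] += amount
    return dict(totals)


totals = total_awarded_by_supplier(releases)
for (supplier_id, currency), amount in sorted(totals.items()):
    print(supplier_id, currency, amount)
```

Because every record uses the same structure, the same few lines work across publishers. Joining the supplier identifiers to a company register or a beneficial ownership dataset would be the natural next step.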
Box 6. The G20 Open Data Portals: enablers of Anti-Corruption Data?