Acknowledging the way corruption works, we have identified datasets relating to each of the core elements of a corruption network: a group of individuals and organizations, organized through a series of agreements and schemes (in some cases violating laws and government procedures) to extract certain rent from the public or obtain an undue benefit for private gain (see Table 4).
These datasets form a basic core that countries should strive to make available and interoperable. The approach is necessarily general, since what data is available, and in what format, will vary from country to country and from case to case. Nor is this list definitive: many other datasets can be relevant to specific anti-corruption efforts. Together, however, these datasets form the basis of a solid anti-corruption data infrastructure.
Table 4. Classifying anti-corruption data
Core element of a corruption network
Description of the data related to the core element
Examples of datasets
Individuals and organisations
Refers to any dataset containing records and information on entities (individuals or organisations) that could potentially be involved in a corruption scheme. Datasets under this category should provide information about the nature and characteristics of any entity, as well as its connections with others.
– Lobbying registers
– Company registers
– Interests registers
– Politically exposed people registers
– Advisory boards
– Government contractors
– Public servants directories
– Charity registers
Refers to any dataset containing records and information on the resources which belong to governments or are intended for public purposes and that could be involved in a corruption scheme. Datasets under this category should provide information about the status and transactions related to those resources.
– Budgetary datasets
– Government spending
– Contracts
– Public-private partnerships
– Political financing
– Licenses and permits
– Grants and scholarships
– Auditing datasets
– International aid, funding & technical assistance register
Regulations, rules and government procedures
Refers to any dataset containing records and information on the channels used, avoided or violated to commit an act of corruption by a corruption network. Datasets under this category should provide information about the procedures, events and legal acts potentially linked to corruption schemes.
– Voting records
– Meeting records
– Court records
– Campaign promises
Refers to any dataset containing records and information on the use of public resources that were potentially extracted as a result of a corruption scheme. Datasets under this category should provide information about the income sources of members of a corruption network and the ownership of the assets they hold.
– Assets declarations/registers
– Cadastre (including public land)
– Land and property registers
– Tax databases
– Customs data
The Anti-Corruption Open Up Guide
These datasets take many different forms. Data may be drawn from public registers created to serve broad public functions, or developed with specific transparency and anti-corruption goals in mind. It may be transactional data generated during the daily operations of government and released as close to real time as possible. Or it may be drawn from public disclosures mandated by law or policy.
Governments manage many different registers: from company registers and land ownership registers to lists of registered lobbyists or lists of public servants.
The UK Government Digital Service (GDS) describe a register as “...an authoritative list of information you can trust”. This is an ideal. Every effort should be made to ensure government registers are authoritative. GDS have developed principles for public registers, and an open source software stack that provides open APIs for access to ‘living registers’.
However, government registers are sometimes not kept up to date, or are maintained in non-interoperable and error-prone ways. This can lead third parties to maintain their own open data registers by aggregating government-provided data and checking its quality.
Box 5. Registers: EveryPolitician
EveryPolitician.org is an independently maintained dataset with the goal of providing “data about every national legislature in the world, freely available for you to use”. It uses the Popolo standard to manage data, and its dataset is populated by a mix of ‘screen-scraping’ official resources and crowdsourcing information. If governments provide official registers of political figures, then the EveryPolitician Bot can more easily keep the platform up to date.
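As an illustration of why a shared standard helps, the following is a minimal Python sketch of reading a register laid out in a Popolo-like way, with separate collections of persons, organizations and memberships that refer to each other by identifier. The sample records, identifiers and the helper name memberships_by_person are hypothetical and are not taken from EveryPolitician; real files carry many more fields.

```python
import json

# Hypothetical sample in a Popolo-like layout: top-level collections of
# persons, organizations and memberships that reference each other by id.
SAMPLE = """
{
  "persons": [
    {"id": "p1", "name": "Ana Example"},
    {"id": "p2", "name": "Ben Sample"}
  ],
  "organizations": [
    {"id": "o1", "name": "Example Party", "classification": "party"}
  ],
  "memberships": [
    {"person_id": "p1", "organization_id": "o1", "role": "member"}
  ]
}
"""


def memberships_by_person(data):
    """Return {person name: [organization names]} from a Popolo-like dict."""
    people = {p["id"]: p["name"] for p in data.get("persons", [])}
    orgs = {o["id"]: o["name"] for o in data.get("organizations", [])}
    result = {name: [] for name in people.values()}
    for m in data.get("memberships", []):
        person = people.get(m.get("person_id"))
        org = orgs.get(m.get("organization_id"))
        # Skip memberships that point at unknown persons or organizations.
        if person is not None and org is not None:
            result[person].append(org)
    return result


register = memberships_by_person(json.loads(SAMPLE))
print(register)
```

Because every record carries a stable identifier, the same few lines could be reused for any register published in this layout, which is the practical benefit of agreeing on a standard.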
Every day hundreds of land deals take place; thousands of government tenders are issued, and contracts signed; and millions of payments may be made to and from government.
Hidden within these transactions may be red flags for corruption, or information that, when linked with information from a register, could show illicit benefits received by a government official.
Transaction data can be made available in real time through APIs, or provided periodically in bulk downloads. Timeliness and disaggregation can be important factors in the use of transactional data, but care must also be taken to respect privacy.
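The linking of transactions to register entries described above can be sketched in a few lines of Python. This is an illustrative example under stated assumptions, not a method prescribed by the guide: the records, the field names (official, company_id, supplier_id, approved_by) and the function flag_conflicts are all hypothetical, and the check only works if both datasets use the same identifiers for companies and officials.

```python
# Hypothetical register of interests: which official declared a stake
# in which company.
interests = [
    {"official": "Official A", "company_id": "C-001"},
    {"official": "Official B", "company_id": "C-002"},
]

# Hypothetical contract awards, disaggregated to one row per contract.
contracts = [
    {"contract_id": "K-1", "supplier_id": "C-001",
     "approved_by": "Official A", "amount": 50000},
    {"contract_id": "K-2", "supplier_id": "C-002",
     "approved_by": "Official A", "amount": 12000},
    {"contract_id": "K-3", "supplier_id": "C-003",
     "approved_by": "Official B", "amount": 8000},
]


def flag_conflicts(contracts, interests):
    """Return contracts approved by an official with a declared
    interest in the supplier."""
    declared = {(i["official"], i["company_id"]) for i in interests}
    return [
        c for c in contracts
        if (c["approved_by"], c["supplier_id"]) in declared
    ]


flags = flag_conflicts(contracts, interests)
for c in flags:
    print(c["contract_id"], c["approved_by"], c["supplier_id"])
```

A match of this kind is only a prompt for further review, not evidence of wrongdoing. Note also that the check is impossible if contracts are published only as aggregate totals, which is why disaggregation matters.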
Box 6. Transactional data: Brazil’s transparency portal
Brazil’s Transparency Portal provides detailed data on five key categories of transaction: (1) direct spending by federal government agencies through contracts and tender processes; (2) all financial transfers to states, municipalities and the federal district; (3) financial transfers to social program beneficiaries; (4) administrative spending, including staff salaries, staff travel expenses, per diems and office expenditures; and (5) information on all government official credit card spending. Some transactional information is updated nightly. The portal has over 900,000 unique visitors per month.
Source: http://odimpact.org/case-brazils-open-budget-transparency-portal.html
Transparency policies often create an obligation on public bodies, public figures or private entities to disclose information, for example by publishing a record of meetings between lobbyists and officials, or by publicly posting voting records. Sometimes this information is recorded in registers, but often the obligation is worded so that bodies post their own disclosures on local notice boards, websites or in gazettes.
Frequently such disclosures are made in non-standard formats, such as word-processed documents, making it difficult to join this information up with other datasets. If standard formats were used, and the data were more easily discoverable, the anti-corruption value of these disclosures could be increased.
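To make the cost of non-standard formats concrete, the following Python sketch converts free-text meeting disclosures into rows with fixed columns and ISO 8601 dates, written out as CSV. Everything in it is an assumption made for illustration: the two sample lines, the date formats (read as day first), the column names and the helper functions are hypothetical, and real disclosures would need far more robust handling.

```python
import csv
import io
import re
from datetime import datetime

# Hypothetical disclosures as they might appear in word-processed documents.
RAW = [
    "12/03/2016 - Official A met Acme Ltd",
    "3 April 2016: Official B met Beta Group",
    "Meeting held with various stakeholders",  # cannot be parsed
]

# Assumed date layouts; numeric dates are read as day/month/year.
DATE_FORMATS = ["%d/%m/%Y", "%d %B %Y"]
LINE = re.compile(
    r"^(?P<date>.+?)\s*[-:]\s*(?P<official>.+?) met (?P<org>.+)$"
)
FIELDS = ["date", "official", "organisation"]


def parse_date(text):
    """Return an ISO 8601 date string, or None if no format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def normalise(lines):
    """Split lines into structured rows and lines needing manual review."""
    rows, skipped = [], []
    for line in lines:
        match = LINE.match(line.strip())
        iso = parse_date(match.group("date")) if match else None
        if iso is None:
            skipped.append(line)
            continue
        rows.append({
            "date": iso,
            "official": match.group("official"),
            "organisation": match.group("org"),
        })
    return rows, skipped


def to_csv(rows):
    """Serialise rows to CSV text with a fixed header."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows, skipped = normalise(RAW)
print(to_csv(rows))
```

Each publishing body choosing its own layout forces every data user to write and maintain parsing rules like these, and lines that do not fit are silently lost unless someone reviews them by hand. Publishing in an agreed structured format at the source removes that step.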