Overview: Data against corruption networks
Core data for setting an anti-corruption data infrastructure
Starting from how corruption works, we have identified datasets relating to each of the core elements of a corruption network: a group of individuals and organizations, organized through a series of agreements and schemes (in some cases violating laws and government procedures), that extracts rents from the public or obtains undue benefits for private gain (see table 4).
These datasets form a basic core that countries should strive to make available and interoperable. The approach is necessarily general, because what data is available, and in what format, will vary from country to country and from case to case. Nor is this list definitive: many other datasets can be relevant to specific anti-corruption efforts. Together, however, these datasets form the basis of a solid anti-corruption data infrastructure.
These datasets take many different forms. Data may be drawn from public registers created to serve broad public functions, or developed with specific transparency and anti-corruption goals in mind. It may be transactional data generated during the daily operations of government, and released in as close to real-time as possible. Or it may be drawn from public disclosures mandated by law or policy.
Governments manage many different registers, from company registers and land ownership registers to lists of registered lobbyists and lists of public servants.
The UK Government Digital Service (GDS) describe a register as “...an authoritative list of information you can trust”. This is an ideal. Every effort should be made to ensure government registers are authoritative. GDS have developed principles for public registers, and an open source software stack that provides open APIs for access to ‘living registers’.
However, government registers are sometimes not kept up to date, or are maintained in non-interoperable and error-prone ways. This can lead third parties to maintain their own open data registers by aggregating government-provided data and checking its quality.
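As an illustration of the kind of quality checking described here, the following is a minimal sketch in Python. It assumes a register extract has already been loaded as a list of records with hypothetical `id`, `name` and `last_updated` fields; these field names, the `check_register` function and the one-year staleness threshold are assumptions made for this example, not part of any real register's schema.

```python
from datetime import date


def check_register(rows, today, max_age_days=365):
    """Return a list of (issue, reference) pairs found in a register extract.

    rows: list of dicts with hypothetical keys 'id', 'name', 'last_updated'.
    today: the date the check is run, passed in so results are reproducible.
    """
    issues = []
    seen_ids = set()
    for row in rows:
        record_id = (row.get("id") or "").strip()
        label = record_id or row.get("name", "")

        # An entry without an identifier cannot be linked to other datasets.
        if not record_id:
            issues.append(("missing_id", label))
        # The same identifier appearing twice suggests a duplicate entry.
        elif record_id in seen_ids:
            issues.append(("duplicate_id", record_id))
        else:
            seen_ids.add(record_id)

        # Entries with no update date, or an old one, may be out of date.
        updated = row.get("last_updated")
        if updated is None or (today - updated).days > max_age_days:
            issues.append(("stale_or_undated", label))
    return issues


if __name__ == "__main__":
    sample = [
        {"id": "A1", "name": "Alpha Ltd", "last_updated": date(2024, 1, 10)},
        {"id": "A1", "name": "Alpha Limited", "last_updated": date(2024, 1, 12)},
        {"id": "", "name": "Beta Ltd", "last_updated": date(2020, 5, 1)},
    ]
    for issue in check_register(sample, today=date(2024, 6, 1)):
        print(issue)
```

Checks like these only surface candidate problems. Whether a flagged entry is actually wrong still has to be confirmed against the authoritative source.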
Every day hundreds of land deals take place; thousands of government tenders are issued, and contracts signed; and millions of payments may be made to and from government.
Hidden within these transactions may be red flags for corruption, or information that, when linked with information from a register, could reveal illicit benefits received by a government official.
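The linking step can be sketched in Python as follows. This is an illustrative example only: the `Contract`, `OwnershipRecord` and `Official` record types, their fields, and the `flag_conflicts` function are hypothetical, and matching people by normalized name is a simplifying assumption. In practice, reliable linking depends on shared identifiers across datasets, and a match is a lead for further review rather than evidence of wrongdoing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:            # transactional data (hypothetical fields)
    contract_id: str
    supplier_id: str
    awarding_body: str


@dataclass(frozen=True)
class OwnershipRecord:     # company ownership register (hypothetical fields)
    company_id: str
    owner_name: str


@dataclass(frozen=True)
class Official:            # register of public servants (hypothetical fields)
    name: str
    public_body: str


def normalise(name):
    """Lower-case a name and collapse whitespace so trivial variants match."""
    return " ".join(name.lower().split())


def flag_conflicts(contracts, ownership, officials):
    """Return (contract_id, name) pairs where an owner of the supplier
    is also listed as an official of the body that awarded the contract."""
    officials_by_body = {}
    for official in officials:
        officials_by_body.setdefault(official.public_body, set()).add(
            normalise(official.name)
        )

    owners_by_company = {}
    for record in ownership:
        owners_by_company.setdefault(record.company_id, set()).add(
            normalise(record.owner_name)
        )

    flags = []
    for contract in contracts:
        owners = owners_by_company.get(contract.supplier_id, set())
        insiders = officials_by_body.get(contract.awarding_body, set())
        for name in sorted(owners & insiders):
            flags.append((contract.contract_id, name))
    return flags
```

Keeping each dataset as its own record type mirrors the point made above: the red flag only becomes visible once transaction data and register data can be joined on a common key.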
Transaction data can be made available in real time through APIs, or provided periodically as bulk downloads. Timeliness and disaggregation can be important factors in the use of transactional data, but care must also be taken to respect privacy.
Transparency policies often create an obligation on public bodies, public figures or private entities to disclose information, for example by publishing a record of meetings between lobbyists and officials, or by publicly posting voting records. Sometimes this information is recorded in registers, but often the obligation is worded so that bodies post their own disclosures on local notice boards, on websites or in gazettes.
Such disclosures are frequently made in non-standard formats, such as word-processed documents, which makes it difficult to join this information up with other datasets. If standard formats were used, and the data were more easily discoverable, the anti-corruption value of these disclosures could be increased.