From data gathering, to data use


Unlocking resources for anti-corruption action

Many of the pioneers of data-driven anti-corruption work have not been able to draw upon proactively published open datasets. Instead, they have had to gather the data they need by filing Freedom of Information requests, scraping data from inaccessible websites, and, in some cases, working with leaked datasets. Where data has been available, it has often been of low quality, requiring substantial investments of time and effort before it can be used, and limiting the extent to which tools built in one country can be reused in another.

In this section we detail a number of cases where different stakeholders are working with the datasets described in the last section, either directly from open data or using data they have gathered manually. The more governments move towards proactive publication, the more use cases like these can spread, and the more effort can go into data use rather than data gathering.

Over the coming year we hope to revise this section with additional cases that demonstrate direct use of open data - as governments deliver on their commitments to provide structured open data for anti-corruption.

Case study: The Panama Papers

The Panama Papers are an unprecedented leak of 11.5 million files from the database of the world’s fourth biggest offshore law firm, Mossack Fonseca. The leaked files reveal information on more than 214,000 offshore companies, connected to people in 200 countries and territories. The data includes emails, financial information, and corporate records that in some cases link world leaders and other prominent figures to illicit activity.

The International Consortium of Investigative Journalists worked with the leaked documents and imported structured data extracted from them into a graph database, which it made available to a network of hundreds of investigative journalists. This made it possible to find leads in the dataset and to follow up potential stories. The investigations and stories that came out of analysing this data have led to multiple resignations and prosecutions.

Although the Panama Papers dataset itself was not open data published by a government, the investigations that followed demonstrated the investigative potential of corporate ownership data, and the value of having linked and structured data as opposed to just documents.

Crucially, open data did play an important role in the follow-up Panama Papers news stories. OpenCorporates, which hosts open data on millions of companies and shareholders worldwide, reported a substantial spike in searches from countries where political leaders were implicated in offshore company scandals, revealing citizen interest in finding out more about their politicians' business dealings.

The Anti-Corruption Open Up Guide