summary • coresoi

The objective of this summary is to synthesise the methodological choices made within the CO.R.E. project, with the final aim of validating the statistical procedure used to develop a composite indicator of corruption risk in public procurement in emergencies.

A major challenge in measuring complex and latent phenomena such as corruption and corruption risk is summarising the information available from a set of single indicators (i.e. red flags) into a single metric, such as a Composite Indicator (CI) of corruption risk. CIs are a useful communication tool for conveying summary information in a relatively simple way. They are widely used across public-service sectors as tools in policy analysis and public communication to compare units of analysis (countries, regions, contracting bodies, etc.). Constructing a CI is a difficult task full of pitfalls, from obstacles regarding data availability and the choice of individual indicators to their normalisation, weighting, aggregation and validity checks. To date, at both the academic and the policy level, little effort has been devoted to developing a robust synthetic/composite indicator of corruption risk in public procurement as an aggregated measure of single red flags. Among the existing proposals, that of Fazekas and colleagues (Fazekas et al., 2016) stands out. In this work, the authors develop a composite score of tendering red flags, the Corruption Risk Index (CRI), as a proxy measure of high-level corruption in public procurement, derived from public procurement data from 28 European countries for 2009-2014. A similar study is carried out by Troìa (Troìa, 2020) with data and red flag indicators developed from the Italian National Dataset of Public Contracts (Banca Dati Nazionale dei Contratti Pubblici). Further, in the Single Market Scoreboard initiative, the single indicators (i.e. red flags) are aggregated by summing them to show how different EU countries are performing on key aspects of public procurement.
The composite indicators of corruption risk in public procurement currently developed by researchers and experts are based on a selection of the most common elementary indicators, or red flags, of corruption risk and rely on a rather simple methodological base. On the one hand, common red flags need to be rethought to be effective in assessing corruption risk in public procurement during crises. As is well known, red flag indicators may overestimate corruption risk, because they can signal risk in non-corrupt instances, giving rise to false positives. This main drawback of red flags is amplified during crises. The Covid-19 crisis, for example, showed that public procurement systems react to emergency situations by adopting relaxed regulatory frameworks. In such contexts, relying on common red flags (such as those accounting for the proportion of bids adopted through an exceptional procedure and/or very rapid bidding procedures) is likely to lead to an overestimate of corruption risk, as high values of these elementary indicators may reflect legitimate procedural choices allowed by a relaxed regulatory framework rather than an actual high level of corruption. On the other hand, current composite indicators of corruption risk are obtained as a simple unweighted arithmetic mean or sum of individual risk indicators. Summation reflects the view that different combinations of elementary corruption techniques can equally identify a contract as corrupt. However, since sophisticated actors can achieve corrupt control of a tender by resorting to a single corruption technique, CIs obtained as a simple mean or sum of elementary indicators represent only a lower-bound estimate of corruption risk. This is an important limitation of current CIs, as the proposing authors themselves (Fazekas et al., 2016) note.
Moreover, current proposals rely on linear aggregation methods, which are just one among many possible choices, each with different methodological and substantive justifications and different application potential. For instance, linear aggregation methods are appropriate when all single indicators have the same measurement unit, whereas geometric aggregation is better suited when some degree of non-compensability between individual indicators or dimensions is required. Furthermore, the current literature lacks a validation procedure for the proposed composites that reports a sensitivity analysis of the composite indicator with the aim of verifying its robustness.
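To make the contrast between linear and geometric aggregation concrete, the two schemes can be written as follows. The notation is an assumption introduced here for illustration: I_{qc} is the normalised value of red flag q for unit c, and w_q is its weight.

```latex
% Assumed notation: I_{qc} in (0, 1], weights w_q >= 0 with \sum_q w_q = 1.
\[
  CI^{\mathrm{lin}}_c = \sum_{q=1}^{Q} w_q \, I_{qc},
  \qquad
  CI^{\mathrm{geo}}_c = \prod_{q=1}^{Q} I_{qc}^{\,w_q}.
\]
% Worked example with Q = 2, equal weights, I = (0.9, 0.1):
\[
  CI^{\mathrm{lin}} = \tfrac{1}{2}(0.9) + \tfrac{1}{2}(0.1) = 0.5,
  \qquad
  CI^{\mathrm{geo}} = \sqrt{0.9 \times 0.1} = 0.3.
\]
```

In the linear case the high value fully offsets the low one, whereas the geometric mean is pulled towards the lower value, so compensation between indicators is only partial.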

Validating a statistical procedure to develop a composite indicator of corruption risk in public procurement in emergencies is a complex task involving several stages: i. data selection; ii. selection of elementary indicators (i.e. red flags of corruption risk); iii. normalisation methods; iv. weighting and aggregation schemes; v. multivariate analysis for the study of the data structure; vi. sensitivity analysis.

Data selection and architecture: A solid data infrastructure is essential for handling massive amounts of information and their usual characteristics, such as the 3 Vs of big data, i.e. Volume, Velocity and Variety. To this end, the CO.R.E. project exploits both open-source software and cloud infrastructure to handle big data in emergency scenarios, where a Q standing for Quickness is an additional essential component. We develop a data pipeline that extracts data from the official ANAC open data portal (both for cigs, i.e. traditional tenders, and smartcigs, i.e. simplified tenders) and then enriches them with L190/2012 data and with proprietary, confidential and highly informative data coming from reliable sources. Finally, we calculate the indicators, whose results are stored in a separate space in order to distribute them to the frontend, i.e. the CO.R.E. dashboard.
Choice of elementary indicators: Companies and contracting authorities at risk of corruption across crises are identified through a set of red flags that are only partly derived from the current literature and are computed through a novel procedure. The procedure exploits the time discontinuity introduced by a crisis outbreak, which makes it possible to distinguish two time spans, a pre-crisis and a post-crisis period. Whenever possible, the approach compares company and/or contracting authority behaviour after the crisis outbreak with the corresponding historical behaviour and assesses the associated risk through statistical testing. The proposed procedure is extensible to other crisis contexts, replicable in other national contexts, and adjustable to account for different market trends across the various crises.
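The pre-crisis versus post-crisis comparison can be illustrated with a minimal sketch. It assumes, purely for illustration, that the behaviour of interest is a count of "flagged" contracts (for example, awards through an exceptional procedure) and that the test is a one-sided Fisher exact test. The function names `fisher_one_sided` and `red_flag`, the counts and the 0.05 threshold are all hypothetical and do not reproduce the tests used in CO.R.E.

```python
from math import comb


def fisher_one_sided(post_flagged, post_total, pre_flagged, pre_total):
    """P-value for H1: the flagged share is higher post-crisis than pre-crisis.

    Uses the hypergeometric tail: the probability of observing at least
    `post_flagged` flagged contracts post-crisis, given the table margins.
    """
    flagged = post_flagged + pre_flagged
    total = post_total + pre_total
    upper = min(flagged, post_total)
    tail = sum(
        comb(flagged, k) * comb(total - flagged, post_total - k)
        for k in range(post_flagged, upper + 1)
    )
    return tail / comb(total, post_total)


def red_flag(post_flagged, post_total, pre_flagged, pre_total, alpha=0.05):
    """Return 1 if the post-crisis increase is significant at level alpha (assumed)."""
    p_value = fisher_one_sided(post_flagged, post_total, pre_flagged, pre_total)
    return int(p_value < alpha)


# Hypothetical contracting authority: 2 of 20 flagged before, 8 of 20 after.
p = fisher_one_sided(post_flagged=8, post_total=20, pre_flagged=2, pre_total=20)
print(f"p-value: {p:.4f}, red flag: {red_flag(8, 20, 2, 20)}")
```

Testing against the unit's own history, rather than against a fixed threshold, is what lets a legitimate and stable use of exceptional procedures pass without raising the flag.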
Normalisation methods: Normalisation aims to bring all individual indicators to the same scale and to the same (usually positive) polarity. To this end, several methods can be employed, such as standardisation, categorisation, rescaling, ranking, and indexing.
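Three of the methods listed above (rescaling, standardisation and ranking), together with a polarity reversal, can be sketched as follows. The helper names and the sample values are illustrative assumptions, not part of the coresoi package.

```python
from statistics import mean, pstdev


def rescale(values):
    """Min-max rescaling to [0, 1]; a constant indicator maps to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardise(values):
    """Z-scores: subtract the mean and divide by the population std. dev."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def rank(values):
    """Ranks starting at 1 for the lowest value; ties share the average rank."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def reverse_polarity(rescaled):
    """Flip a [0, 1] indicator so that higher always means higher risk."""
    return [1.0 - v for v in rescaled]


# Hypothetical raw values of one red flag across four units.
raw = [2.0, 4.0, 4.0, 10.0]
print(rescale(raw))
print(standardise(raw))
print(rank(raw))
print(reverse_polarity(rescale(raw)))
```

Rescaling and ranking discard information about the spread of the raw values to different degrees, which is one reason the choice of method is later subjected to sensitivity analysis.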
Weighting and aggregation schemes: Once single indicators are normalised, they need to be combined to obtain the CI. To this end, a weighting system and an aggregation scheme must be chosen. The former sets the relative importance of each individual indicator, while the latter identifies the technique (compensatory, partially compensatory or non-compensatory) for summarising the individual indicator values into a single number. We briefly review the most relevant alternatives, and combinations thereof, available in the literature.
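A minimal sketch of how a weighting system combines with the three families of aggregation scheme is given below. The weighted arithmetic mean stands in for the compensatory case and the weighted geometric mean for the partially compensatory case; taking the maximum is only one possible non-compensatory rule, chosen here as an assumption. Function names, weights and scores are hypothetical.

```python
from math import prod


def linear(scores, weights):
    """Compensatory: weighted arithmetic mean (weights assumed to sum to 1)."""
    return sum(w * s for s, w in zip(scores, weights))


def geometric(scores, weights):
    """Partially compensatory: weighted geometric mean of positive scores."""
    return prod(s ** w for s, w in zip(scores, weights))


def worst_case(scores):
    """Non-compensatory (assumed rule): the highest single red flag decides."""
    return max(scores)


# Hypothetical unit: one red flag is very high, the other two are low.
scores = [0.9, 0.1, 0.1]
equal_weights = [1 / 3, 1 / 3, 1 / 3]

print(round(linear(scores, equal_weights), 3))
print(round(geometric(scores, equal_weights), 3))
print(worst_case(scores))
```

With these values the three schemes order the unit differently: the two means dilute the single high flag, while the maximum preserves it, which matters when a single corruption technique is enough to control a tender.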
Multivariate analysis for the study of the data structure: The CI of corruption risk results from transforming the available red flags into a single measure. However, the phenomenon may plausibly have several dimensions (each represented by a group of red flags) that characterise it from different perspectives. Several statistical tools available in the literature for assessing the dimensionality of a set of elementary indicators are therefore presented.
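As an illustration of the kind of tool meant here, the sketch below computes a Pearson correlation matrix and Cronbach's alpha for a small set of indicators. These two tools are chosen as assumptions for the example and are not necessarily those reviewed in the project. The data are invented, and alpha is best computed on indicators that are already normalised.

```python
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation between two indicators observed on the same units."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def correlation_matrix(indicators):
    """Matrix of pairwise correlations; each indicator is a list over units."""
    return [[pearson(x, y) for y in indicators] for x in indicators]


def cronbach_alpha(indicators):
    """Internal consistency of a group of indicators (1 = fully consistent)."""
    k = len(indicators)
    totals = [sum(unit) for unit in zip(*indicators)]
    item_var = sum(variance(col) for col in indicators)
    return k / (k - 1) * (1 - item_var / variance(totals))


# Hypothetical red flags observed on five units.
flag_a = [1, 2, 3, 4, 5]
flag_b = [2, 4, 6, 8, 10]   # moves together with flag_a
flag_c = [5, 3, 4, 1, 2]    # moves against flag_a

for row in correlation_matrix([flag_a, flag_b, flag_c]):
    print([round(r, 2) for r in row])
print(round(cronbach_alpha([flag_a, flag_b]), 3))
```

Blocks of strongly correlated red flags suggest a shared dimension, while a flag that correlates weakly or negatively with the rest may belong to a separate one.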
Sensitivity analysis: The choice of a normalisation method, a weighting system and an aggregation scheme affects the final value of the CI and, therefore, the final ranking of units with respect to corruption risk in public procurement in emergencies. For this reason, a sensitivity analysis needs to be undertaken to assess the robustness of the CI with respect to the mechanism for including or excluding single indicators, the normalisation method, and the choice of weighting and aggregation scheme.
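One simple way to carry out such a check is to recompute the ranking of units under every combination of methodological choices and measure how far each ranking moves from a baseline. The sketch below does this for two normalisation methods and two aggregation schemes. The scenario grid, the baseline, the function names and the data are all assumptions made for illustration.

```python
from statistics import mean


def min_max(col):
    """Rescale a column of indicator values to [0, 1]."""
    lo, hi = min(col), max(col)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in col]


def rank_share(col):
    """Replace each value by its rank (1 = lowest) divided by the unit count."""
    ordered = sorted(col)
    return [(ordered.index(v) + 1) / len(col) for v in col]


NORMALISERS = {"min-max": min_max, "rank": rank_share}
AGGREGATORS = {"mean": mean, "max": max}


def ranking(scores):
    """Rank units from 1 (highest risk); ties keep the input order."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for position, unit in enumerate(order, start=1):
        ranks[unit] = position
    return ranks


def composite_ranking(rows, normalise, aggregate):
    """Normalise each indicator column, aggregate per unit, then rank units."""
    columns = [normalise(list(col)) for col in zip(*rows)]
    scores = [aggregate(unit) for unit in zip(*columns)]
    return ranking(scores)


def sensitivity(rows, baseline=("min-max", "mean")):
    """Map each scenario to (ranks, mean absolute rank shift vs. baseline)."""
    base = composite_ranking(rows, NORMALISERS[baseline[0]], AGGREGATORS[baseline[1]])
    report = {}
    for n_name, n_fn in NORMALISERS.items():
        for a_name, a_fn in AGGREGATORS.items():
            ranks = composite_ranking(rows, n_fn, a_fn)
            shift = mean(abs(r - b) for r, b in zip(ranks, base))
            report[(n_name, a_name)] = (ranks, shift)
    return report


# Hypothetical data: four units (rows) by three red flags (columns).
data = [
    [0.9, 0.1, 0.2],
    [0.5, 0.5, 0.5],
    [0.2, 0.3, 0.1],
    [0.4, 0.8, 0.6],
]
for scenario, (ranks, shift) in sensitivity(data).items():
    print(scenario, ranks, shift)
```

The same loop can be extended with further scenario dimensions, for instance by dropping one indicator column at a time to test the inclusion or exclusion of single indicators.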