Consider the following example, taken from "Improving government efficiency through mechanism experiments" by Jens Ludwig and Sendhil Mullainathan at VoxEU.org:
Suppose that the US Department of Justice is interested in learning more about whether to devote scarce resources to supporting ‘broken windows’ policing, which is based on the notion that signs of minor disorder signal to potential offenders that no one cares about crime in the local area, thereby reducing the deterrent threat from punishment and increasing the chances that more serious crimes are committed. Most researchers would argue that the best approach is to carry out a policy evaluation of broken windows policing. Recruit a representative sample of cities, and randomly select some neighbourhoods but not others (or perhaps some cities but not others) to receive broken windows policing. Then compare subsequent crime rates in treatment versus control areas. This policy evaluation would be informative but not cheap. The unit of random assignment in this case is the neighbourhood or city – the level at which the policing intervention of direct interest operates. The number of neighbourhoods or cities that would need to be ‘treated’ to have adequate statistical power is large, and the cost per treated area is high. So rather than having to test the actual policy directly, you may be able to, at lower cost, test the causal mechanism that underlies the policy.
Now consider an alternative experiment. Imagine buying a number of cheap used automobiles. Break the windows of half the cars, and then randomly select a set of urban neighbourhoods in which to park cars with different levels of physical damage. Measure what happens to more serious crimes across different neighbourhoods. While less ethically objectionable variants of such an experiment are possible (such as randomising areas to have signs of disorder cleaned up, rather than introduced), our example is basically the research design used in a social psychology experiment in the 1960s that led to broken windows theory and then widespread adoption in New York City in the 1990s. This ‘mechanism experiment’ doesn’t test the policy of direct interest to the Department of Justice, but rather tests the causal mechanism that underlies the broken windows policy.
How can mechanism experiments help economise on research funding? The broken windows theory rests on a simple logic model in which the key policy lever, P (broken windows policing), influences the outcome of primary policy interest, Y (serious criminal offences), through the mediator (M) of local disorder, or P→M→Y. Suppose that DoJ thinks it already knows something about policing – specifically, suppose that DoJ thinks it already understands the relationship between broken windows policing and signs of disorder (P→M). Police professionals might need to have learned that relationship to guide all sorts of policing decisions, because citizens dislike disorder for its own sake regardless of whether it accelerates more serious crimes. In that case the new information that DoJ gets from carrying out a policy evaluation of actual broken windows policing is just about the M→Y link, but that information is mixed together with the noise about the specific P→M link that would arise in any given experiment. On the other hand the mechanism experiment maximises the research funding available to identify the part of the causal chain (M→Y) that policymakers do not already understand. Put differently, mechanism experiments can economise on research funding by taking better advantage of what policymakers think they already know.
This broken windows example is not an isolated case. Depending on what policymakers think they understand, in other applications mechanism experiments might increase the efficiency of research spending by, for example, enabling researchers to randomise at relatively less aggregated (lower-level) units of observation.
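To make the P→M→Y argument concrete, here is a minimal simulation sketch in Python. It is not taken from Ludwig and Mullainathan's article. The linear model, the effect sizes `a` (P→M) and `b` (M→Y), the noise levels, the number of areas, and all function names are hypothetical assumptions chosen purely for illustration. The sketch compares how tightly each design pins down the M→Y effect when the P→M effect is assumed to be known on average but noisy in any given area.

```python
import random
import statistics


def diff_in_means(outcomes, treated):
    """Mean outcome in treated areas minus mean outcome in control areas."""
    t = [y for y, d in zip(outcomes, treated) if d]
    c = [y for y, d in zip(outcomes, treated) if not d]
    return statistics.mean(t) - statistics.mean(c)


def random_assignment(rng, n_areas):
    """Assign half of the areas to treatment at random."""
    treated = [i < n_areas // 2 for i in range(n_areas)]
    rng.shuffle(treated)
    return treated


def policy_evaluation(rng, n_areas, a, b, sd_m, sd_y):
    """Randomise the policy P, then back out the M->Y slope b.

    Assumption: the average P->M effect `a` is already known, but each
    area's disorder responds to policing with noise (sd_m).
    """
    treated = random_assignment(rng, n_areas)
    outcomes = []
    for d in treated:
        m = a * d + rng.gauss(0, sd_m)  # disorder responds noisily to P
        y = b * m + rng.gauss(0, sd_y)  # serious crime responds to disorder
        outcomes.append(y)
    return diff_in_means(outcomes, treated) / a


def mechanism_experiment(rng, n_areas, a, b, sd_y):
    """Randomise disorder M directly, shifting it by exactly `a`."""
    treated = random_assignment(rng, n_areas)
    outcomes = [b * (a * d) + rng.gauss(0, sd_y) for d in treated]
    return diff_in_means(outcomes, treated) / a


def spread(estimator, reps, seed, **params):
    """Standard deviation of the estimates of b across repeated experiments."""
    rng = random.Random(seed)
    return statistics.stdev([estimator(rng, **params) for _ in range(reps)])


# Hypothetical numbers: the treatment lowers disorder by 1 unit (a = -1),
# and each unit of disorder adds 2 units of serious crime (b = 2).
common = dict(n_areas=100, a=-1.0, b=2.0, sd_y=1.0)
policy_sd = spread(policy_evaluation, reps=500, seed=0, sd_m=1.0, **common)
mechanism_sd = spread(mechanism_experiment, reps=500, seed=0, **common)
print(f"spread of b estimates, policy evaluation:    {policy_sd:.3f}")
print(f"spread of b estimates, mechanism experiment: {mechanism_sd:.3f}")
```

Under these assumed numbers, the mechanism experiment's estimates of `b` should cluster more tightly for the same number of areas, because the area-to-area noise in the P→M link never enters the outcome. The sketch holds the number of areas fixed; if each unit of a mechanism experiment is also cheaper, as the authors argue, a fixed budget would buy more units and narrow the spread further.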
Ludwig and Mullainathan go on to say,
We are not claiming that mechanism experiments are ‘better’ than policy evaluations. In situations where, for example, the list of candidate mechanisms through which some policy might affect outcomes is long and these mechanisms might interact, the only useful way to get policy-relevant information might be to carry out a policy evaluation. Probably more common is the situation in which mechanism experiments and policy evaluations are complements, in which encouraging evidence from a mechanism experiment might need to be followed up by a policy evaluation in order to, for example, reduce the risk of unintended consequences. But at the very least carrying out a series of mechanism experiments first can help improve decisions about when it makes sense to invest research funding in a full-blown policy evaluation.