A commonplace private occurrence can sometimes reveal a pattern in a set of apparently unrelated public and private events. Two Princeton graduate students have been trying to replicate an unpublished but much-cited and widely discussed study with implications for public policy in the United States. After a good deal of work on their own, the students approached the author of the study, who is something of a young star, and whose web site lists the paper as forthcoming in the American Economic Review, and politely asked for access to the data. Recall that 'it is the policy of the AER to publish papers only if the data used in the analysis are clearly and precisely documented and are readily available to any researcher for purposes of replication.' The reply: 'I do not give out data-sets because I am asked for my data literally hundreds of times each year, and the burden of supporting requests is excessive. I like to think of my policy as an encouragement to others to compile original data of their own. If everyone in the community compiles original data, everyone can hope to share the benefits. If not, the equilibrium will be that no one compiles data because of the free-rider problem.' This is certainly notable for its forthright honesty, and seems not to be a particularly unusual example.
On two recent occasions, colleagues have requested access to data that were collected (some years previously) using public funds, and where the conditions of funding, like the policy of the AER, made clear the responsibility for public access. In both cases, access was denied because the data were not yet sufficiently 'clean.' In one case, and on the very next day, the excluded researcher was asked to referee a paper in which the 'unclean' data were used by the proprietor of the data, but it is rare that the opportunity for revenge comes so quickly. One private foundation that funds research on health policy withholds a fraction of grants to researchers until data are shared; the funds are frequently left unclaimed. In another case, a major foundation agreed to support data collection on condition that the data be temporarily withheld from all but minority researchers.
Encouraged by changing sources of funding and the policy issues of the day, and enabled by their strong statistical and modeling skills, economists have been working on a range of (once) non-standard topics, such as education, crime, and health. As they do so, their policies for sharing their data are unlikely to remain a matter for idiosyncratic private decision, or for largely toothless monitoring by journals and professional associations. Richard Shelby, Senior Republican Senator from Alabama, attached (without hearings or debate) a rider to a 4,000-page appropriations bill that was duly signed into law. By this rider, all federally funded data, including possibly research notes, email, and correspondence, produced by researchers in universities (but not by corporations) are subject to the Freedom of Information Act (FOIA), and so can be requested by any concerned citizen. The immediate impetus for the senator's action appears to have been the Environmental Protection Agency's (EPA) citing an academic study of the effects of soot emissions when imposing new standards (opposed, among others, by the Alabama Power and Light Company). The data underlying the study have been released neither to the public nor even to the EPA.
The US Chamber of Commerce has hailed the Shelby provision, claiming that 'In the regulatory reform area, there may never be a more important issue. . . This would be the first time the business community has ever been provided with the basis for the bureaucracy imposing $700 billion of annual regulation costs upon us.' Academic researchers, including many economists, have protested and worked to repeal the Shelby provision, an effort led by Princeton's local (Democratic) Congressman. There has been discussion of the case of a medical researcher whose work uncovered that six-year-old children were more familiar with Joe Camel and the associated cigarette brands than with Mickey Mouse. The R. J. Reynolds Tobacco Company took the researcher and his university to court in an ultimately successful attempt to obtain the names and addresses of the children, so that they could be 're-interviewed.' Under protocols currently under discussion, such cases would (presumably) not occur under FOIA, because identifying information would be removed by bureaucrats prior to release. But even otherwise sympathetic researchers worry about lawyers and publicists casting methodological decisions as scientific misconduct. It is one thing to be pilloried by a referee for using OLS when GLS would have been better; it is quite another to be sued by a well-funded corporation which has requested your university to terminate your employment, and which has hired your referee as an expert witness.
That government should become involved in these issues is perhaps not surprising given the size of the stakes, the failure of academics (including economists) to police themselves effectively, and the seemingly increasing irrelevance of peer review procedures. Bruce Alberts, President of the National Academy of Sciences, argues that researchers should be able to hold back their data until the work has appeared in a peer-reviewed journal, and should only then be required to release them for replication. But given the quality of much that is published in both fields, it is hard to believe that peer review in either economics or public health can bear the burden of certification. When the results of working papers posted to the web instantaneously become part of the policy debate, traditional procedures hardly seem adequate. (Witness the firestorm from both right and left over the finding by John Donohue and Steve Levitt at the University of Chicago that much of the decline in American crime rates can be traced to selective abortion of would-be criminals, a sort of pre-emptive capital punishment.) Indeed, at least one economist has argued in favor of the Shelby provision. Robert Hahn of the American Enterprise Institute cites the AER's policy quoted above as an example of good scientific practice, argues that business interests are entitled to access to data that affect them, and worries only that unscrupulous scientists (Princeton graduate students?) might 'FOIA' other people's databases and use them for their own research.