Research data and the public good
Researchers in economics and the social sciences often complain that they have limited or no access to high-quality public-sector data, such as those collected by National Statistical Institutes to build governmental statistics and write the reports that underpin the implementation of public policies. To be sure, the situation is improving in several countries, at least in Europe, and I am myself involved in an FP7 project (DwB) that aims to harmonise and simplify data access across countries. Some National Statistical Institutes also have a strong tradition of allowing researchers to access their data, in particular the British ONS and the French INSEE.
Yet until recently, researchers themselves rarely made their own data, metadata and/or source code available to others, and journals tended to ask only for findings, not the data on which they were based. Isn't data release a condition of scientific integrity, not to mention a way to serve the public good? It is increasingly widely agreed that open information and open access are conditions of good governance, and research should participate in this trend.
Things are changing, though slowly. CESSDA archives in Europe, and their equivalents in other countries (ICPSR in the US in particular), offer platforms for researchers to share their data. Journals have also started participating in this trend, though slowly and for different reasons. The American Economic Review, the number one journal in economics, started in 2003 to encourage its authors to make available their data, programmes and the sets of instructions used to analyse them; the policy became mandatory a few years later. (Their data policy is available here; a report on compliance was written last November.) Other top economics journals have followed suit, most prominently Econometrica, Economic Journal, Federal Reserve Bank of St. Louis Review, Journal of Applied Econometrics, Journal of Business and Economic Statistics, and Journal of Money, Credit and Banking. They generally offer a space on their websites where the data should go, but sometimes allow authors to post their data on a website or repository of their choice, provided the data are widely available. I know the policies of journals in other social sciences less well, but Sociological Methods & Research, for example, also requires authors to make their data available.
(It goes without saying that any personal data must be anonymised, and that data can be replaced by precise instructions on how to access them if they are the property of someone else, e.g. a National Statistical Institute, or if they are highly sensitive or confidential.)
Although such a data policy is not yet mainstream, the fact that top journals are leading the way suggests that it is likely to become more widespread in the years to come. It must be said that journals are moving in this direction mainly for purposes of replication (and therefore scientific integrity) rather than public usefulness; still, this can contribute to achieving openness as a form of public good.
National Statistical Institutes and other providers of data useful for social science research may consider encouraging this trend by asking researchers to make their data, metadata, and source code available through an openly accessible website, institutional repository or data archive, provided anonymisation is ensured and all necessary measures are taken to protect confidentiality. That would accompany and support a trend that is already in place, shifting the focus a little from replication per se to the wider issue of the public usefulness of data and research.
Filed under: Research, Social science methodology
Tags: Data access, Governmental data, Open access to scientific publications, Quantitative methods, Research ethics, Social science data, Statistical modeling

