The five stages to data sharing: Denial

Applying the Kübler-Ross model [1] to researchers and data sharing, based on various attitudes and comments we have encountered over the years. Don’t take the presentation seriously, but take the content seriously. Part one in a series of…uh, five.

1. Denial

Symptomatic statements: “No way! My data on public attitudes towards the weather is incredibly sensitive and potentially disclosive”; “Why? Why would anyone want to use my data anyway?”

Denial is usually only a temporary defence for the researcher. This feeling is generally replaced with a heightened awareness of the data and research assistants that will be left behind at the end of a project, and of the inevitable data loss on a memory stick somewhere in the future, or “Shit! A hard drive shouldn’t be making a noise like that”, or “Did I copy it off my computer before IT came and replaced it?”, or “ERROR: The format of the file cannot be read.” Denial can be a conscious or unconscious refusal to accept facts, information, or the reality of the situation. Denial is a defence mechanism, and some people can become locked in this stage.

Data sharing is here, and it’s not going away. This means managing data to meet disciplinary norms and legal, technical, and funding requirements for sharing, and to ensure data can be preserved for the long term.

Denial isn’t just a hoary old quote about a river in Egypt, nor is it a viable option for researchers when it comes to data sharing. While reusing jokes and tired old quotes is – rightly – looked upon with disapproval, reusing data or making data available for reuse is supported to the point where refusal to share with peers is no longer acceptable unless a compelling case against sharing is made. Furthermore, the case against sharing is expected to be presented prospectively rather than retrospectively. Of course, such cases do exist: research in highly sensitive commercial or political fields, research with exceptionally vulnerable participants, or cases where complicated intellectual property rights make it difficult to license data for reuse. But these are exceptions, not the norm. The expectation is to share data with the widest possible community.

Many public bodies and funders of academic research have already adopted some form of data sharing requirement or strongly encourage data reuse. The OECD[2], the White House[3] and various U.S. Federal agencies and funders[4], the European Commission[5], and academic research funders in the UK[6], Germany[7], and Australia[8] have adopted statements and/or requirements that data from the research they fund should be available to others in usable formats, with contextual information that makes the data comprehensible. In addition, academic journals are adopting policies requiring that the data underlying articles be made available to potential users, either as a condition of publication or within a set period following publication[9]. Universities themselves are also adopting data management policies[10]. The head in the sand is not an excuse.

The data sharing phenomenon is partly politically motivated, predicated on the idea that data is a non-rival and (up to a point – when open to the fullest possible extent) non-excludable public good. There is also a related efficiency argument that taxpayer investment in research should not fund additional data collection when suitable data is already available[11], so that the value of existing data is fully exploited – an attractive idea in an age of significant pressures on public spending. Partly it is a normative argument: good science should be transparent and replicable[12]. Data sharing stimulates discussion about the quality of data, the reliability of findings, and research methodologies; it assists in teaching[13], seeds further research[14], and, in the worst cases, polices unintentional errors[15] or fraudulent research findings[16].

The days when the use and value of data expired with the arrival of that acceptance letter from the Journal of Tenure Securing Research are over. Now data can be used for different types of research. It can be used for replicating previous studies[17] or re-purposed and integrated with other data[18]. Data can yield insights into phenomena long after its original collection, and can feed future research questions and innovative collaborations[19].

Why would people use it? Well, like old quotes or comedy sketches, a good publication is expected to be cited. Creating a good, well-documented, accessible data set will also lead to reuse and citation, giving the data and the researcher’s name an active life span beyond the original research project. And not to be too morbid about it, maybe beyond the researcher’s active life span too, before they go to the great archive in the sky. For as a professor once said, “In the long run we are all dead”[20], but there’s no reason why your data should also pass on, be no more, cease to be, expire, go to meet its maker, be a stiff, be bereft of life, rest in peace, push up daisies, kick the bucket, shuffle off its mortal coil, run down the curtain and join the choir invisible[21].

About CESSDA Training

CESSDA Training offers and coordinates training activities for CESSDA, the Consortium of European Social Science Data Archives. Hosted by the GESIS - Leibniz Institute for Social Sciences, our center promotes awareness throughout the research lifecycle of good research data management practice and emphasizes the importance of long-term data curation.
