Applying the Kübler-Ross model to researchers and data sharing, based on various attitudes and comments we have encountered over the years. Don’t take the presentation seriously, but take the content seriously. Part two in a series of…uh, five.
Symptomatic statements: “I’m a researcher, not an archivist!”; “this is bureaucratic crap getting in the way of real research!”
Once in the second stage, the individual recognizes that denial cannot continue. Because of anger, the person is very difficult to deal with due to misplaced feelings of rage and envy. Anger can manifest itself in different ways. People can be angry with themselves, or with others (funders, universities, archives…etc.), and especially those who are close to them (funders, universities, archives…etc.). It is important to remain detached and non-judgemental when dealing with a person experiencing anger from data sharing requirements.
A lot of what researchers already do is active research data management and a prerequisite to archiving. It’s just not seen as “research data management”, but simply “research”. Part of the reason resentment exists (and I don’t want to overstate the level of resentment but, yes, a little is out there) is a failure of archives, funders, and research institutions to communicate this message, as well as a failure to emphasise the benefits of research data management and sharing over the requirements.
For example, we all need context to make sense of things, and we add that context often without acknowledging we are doing so: what is it, where is it located, who created it, why was it created, how was it created, and when was it created? Well, that’s metadata. We tend to provide our own answers to these questions if we are creating something. We construct a story or narrative. In this case, a narrative about data often contains variable names, variable labels, questionnaires, interview schedules, all of which contribute to the story of data. However, will others be able to understand that story, will they be able to get the sense of context? Will they understand original actions and intent? Would even the future self of the original researcher be able to understand? That’s before we get to the problem of computers trying to understand context.
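To make those six questions concrete, here is a minimal sketch of a context record in Python. The class name `ContextRecord`, its field names, and every value in the example are hypothetical placeholders invented for illustration; this is not a metadata standard, just the six questions written down so that an unanswered one becomes visible.

```python
from dataclasses import dataclass, asdict


@dataclass
class ContextRecord:
    """One field per question; all names here are illustrative assumptions."""
    what: str   # what is it?
    where: str  # where is it located?
    who: str    # who created it?
    why: str    # why was it created?
    how: str    # how was it created?
    when: str   # when was it created?


# Placeholder values only, not a real dataset.
record = ContextRecord(
    what="Survey responses, one row per respondent",
    where="project_folder/survey.csv",
    who="Original researcher",
    why="",  # left blank on purpose: the question nobody wrote down
    how="Online questionnaire",
    when="2013-09-12",
)

# List the questions that a future reader would have to guess at.
missing = [name for name, value in asdict(record).items() if not value.strip()]
print(missing)
```

Run as written, the check flags `why` as the unanswered question, which is exactly the kind of gap that is easy to fill on the day the data is created and hard to fill later.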
That’s why archives invest time and money in standardised metadata. Metadata in its simplest definition is “data about data”: the description of what it is you’re collecting or analysing that completes the story of the data. Now, as mentioned, we can all apply our own narratives to the data, but problems emerge when trying to compare narratives. Is what I call green apple the same as what you call green apple? However, when we apply a narrative standard to this data about data we are making that story far more attractive. Attractive how? Well, we make it discoverable, searchable, predictable, navigable and more widely understandable. Essentially, we clearly and transparently define our green apples and cherries so we know what green apple is and what cherry is without having to ask the assistant. As an example, think about today’s date. I’m writing this in Germany on 12-09-2013, but in North America 12-09-2013 is 88 days away. To avoid the confusion we could adopt a metadata standard for dates. ISO 8601 uses a consistent format: full year-month-day, so today is 2013-09-12. Therefore, if I apply ISO 8601 it is clear today is 12 September 2013.
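The date confusion is easy to reproduce. The sketch below uses only Python’s standard `datetime` module; the variable names are my own, and the two format strings are assumptions standing in for the day-first and month-first conventions described above.

```python
from datetime import date, datetime

ambiguous = "12-09-2013"

# The same string read under two regional conventions.
day_first = datetime.strptime(ambiguous, "%d-%m-%Y").date()    # 12 September 2013
month_first = datetime.strptime(ambiguous, "%m-%d-%Y").date()  # 9 December 2013

# How far apart the two readings are, in days.
gap_days = (month_first - day_first).days

# ISO 8601 calendar date: full year, then month, then day.
iso_text = day_first.isoformat()
round_trip = date.fromisoformat(iso_text)  # only one way to read it back

print(day_first, month_first, gap_days)
print(iso_text, round_trip == day_first)
```

Both calls to `strptime` succeed without complaint, which is the problem: nothing in the string itself says which reading was intended. The ISO 8601 text parses back to the same date without any format hint.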
However, rather than search around for all these standard formats, data archives are committed to promoting standard metadata schemes that tie together various fields of information into an overall standard. The social science community is adopting the Data Documentation Initiative (DDI) standard: a core set of structured, interchangeable metadata that streamlines information around the research life-cycle, from conception, collection, and analysis to archiving. To modify a famous old BBC disclaimer, “other metadata schemes are available”, but DDI is designed with a focus on research, especially social science research. Other metadata schemes are designed to serve different needs.
We’re not asking you to be the archivist, but we all quickly forget information unless it is recorded when, or very close to when, something happens. If metadata isn’t captured at the data creation stage, it becomes so much harder to provide context later on, and for all the work we do on standards and interoperability it remains the case that the best person to capture metadata is the researcher. Why? Because the researcher knows their data better than anyone. When? When they create the data, because they know that data better at this moment than at any time in the future. Even if we disregard other people using the data, would it still be understandable or accessible to the original researcher in five or ten years’ time if no research data management had been applied? Just think of a box of unlabelled VHS tapes in the basement or attic: would you know what’s on them even if you originally recorded them back in the 1990s? Do you even have the means to watch them and find out? It’s the same with research data.
It may seem like a chore, but managing research data for long-term preservation and reuse is an integral part of research practice. In the same way household chores aren’t fun (well, not to most of us anyway) data sharing requirements aren’t there to make life more complicated and miserable. It just feels that way at the time. But would you want to live in a house where no chores were done? Having been an undergraduate student in a house share, I know my answer to this question.
 Adapted from http://en.wikipedia.org/wiki/K%C3%BCbler-Ross_model under a Creative Commons Attribution-ShareAlike 3.0 Unported License.