Last week, I spent two late summer days in Vienna at Open Access Tage 2012, the yearly Open Access conference for Germany, Austria, and Switzerland. Hosted by the University of Vienna, the event brought together a diverse group of researchers, librarians, archivists, and repository managers, as well as representatives from research funding agencies and publishing houses, to discuss the state of Open Access and related trends in scholarly communication. With a significant number of presentations and lectures dedicated to research data, the Vienna conference offered ample opportunity to immerse oneself in this topic, and accordingly, that is what I did.
In retrospect, and in an attempt to systematize and sum up my (subjective and selective) impressions from the conference, I would say that three strongly interconnected topics continued to crop up throughout the “research data track”:
- enhanced publications or “information in context,” linking textual publications with data and other relevant information;
- data as both a means and subject of quality assurance and transparency;
- ownership of data.
A number of questions are associated with these topics – many of them revolving in one way or another around the concepts of “transparency” and/or “openness” – that we, the research (data) community consisting of archives/repositories (used synonymously in the following), researchers, publishers, and funders alike, will have to discuss and answer on our way towards establishing a culture of truly open research.
The question of data ownership came up particularly during discussions of journal data policies following the presentations in the research data session. While the copyright situation for research data is complex and varies from country to country, figuring out who owns a given dataset is ultimately a manageable task. What happens, however, when subscription journals with data policies make the submission of data alongside a research article mandatory? Is there a danger of the data, like the article, becoming closed-access, leading to a trade-off between transparency and openness? No precedents were reported in the discussions, but it seems important to be wary of this potential issue and to raise awareness among researchers – as with publishing agreements – not to sign away their rights to the data. There can be no transparency without openness, and limiting access to data in order to achieve transparency is a contradiction in terms.
The topic of data as a means of quality assurance, and the role of data availability in the effort to increase the transparency of research, emerged particularly in presentations by Sven Vlaeminck (ZBW) and Andrea Smioski (WISDOM). Vlaeminck’s presentation hinted at another apparent and somewhat alarming conflict between transparency and openness: research carried out in the EDaWaX – European Data Watch Extended – project suggests that Open Access journals in economics are much less likely to have a data policy than subscription journals. They therefore not only run the risk of becoming the target of a debate about the quality of Open Access publications, but also forgo the “citation advantage” that, according to Piwowar/Fridsma 2007, publications associated with research data enjoy. What might be worse, however, is that researchers faced with this situation have to choose between openness and transparency, creating a conflict between two concepts that (in an ideal world, one might add) should be nearly synonymous.
Using qualitative research in the social sciences as an example, Smioski in turn highlighted that both the transparency of research and the quality of data strongly depend on high-quality, standardized documentation of the data and the research process. Her argument came full circle in Heinz Pampel’s (GFZ) presentation during the APARSEN workshop, which took up a different vantage point and looked at research data as the subject of quality assurance procedures. Pampel in particular shed light on the challenges associated with implementing a peer review system for research data from the perspective of researchers and journals, while also reflecting on the role of repositories in this context (for more details, also see the APARSEN Report on Peer-Review of Research Data in Scholarly Communication). After an initial, criteria-based selection process, repositories take care of data curation, thus preserving and to some extent enhancing the quality of data by providing documentation and value-added services. True to the dictum “garbage in, garbage out,” however, repositories’ power to turn “garbage” into high-quality data is limited. Researchers should therefore involve archives in questions of data management as early in their research as possible to ensure that they create high-quality, re-usable data.
Taken together, these presentations drew a very differentiated picture of the challenges to, as well as first steps towards, realizing the “digital utopia” envisioned by neurobiologist Björn Brembs in his opening keynote: “a single semantic, decentralized database of literature, data and software.” I caught glimpses of this utopia becoming reality several times, for example in the session dedicated to OpenAIRE and OpenAIRE+. Building on the existing OpenAIRE infrastructure, OpenAIRE+ will further enhance access to research funded under FP7 by linking literature to project information and datasets. While the concept of “enhanced publications” is often used to describe the goal of such efforts, Matthias Lösch’s presentation evoked another – possibly more appropriate – term, namely “information in context.” The latter appeals to me because it moves the focus away from publications as the one and only vehicle of scholarly communication and reputation (of course, I am exaggerating here), allowing other forms of research output (such as datasets, blog posts, visualizations, etc.) to move into the spotlight. This is nicely illustrated by the OpenAIRE+ prototype for the social sciences, implemented by the DANS institute in the Netherlands, which allows you to explore what could be called a “contextual map” of research output, in which (textual) publications are only one among many possible central nodes.
My take-home message? There are many, I suppose – but among them is the renewed awareness of how important it is to keep working on making (or keeping, depending on how you look at it) openness and transparency synonymous. And the conviction that, although we are not quite there yet, we are getting closer to realizing Brembs’ digital utopia.