We spent the week in Ljubljana, Slovenia at the biennial conference of the European Survey Research Association. Our trip began with an idea for a session: one bringing academics and research data management people closer together, based on a perception that researchers and archives sometimes fail to communicate effectively. On the assumption that survey researchers are less likely to come to one of our conferences, we decided to go to one of theirs.
Organizing a session under the title “Research Data Management for Re-use: Bringing Researchers and Archivists Closer”, we were fortunate to be able to fill two sessions with quality papers on a range of RDM activities, from support and training to tools and outputs. What follows are my summaries of those presentations. This means any misrepresentations and mistakes, slanderous accusations and outright lies that follow are mine and not attributable to the presenters.
Session one began with Ingo Barkow (DIPF) presenting Rogatus, a framework for survey design tools. Rogatus aims to provide seamless movement of data from researcher to archive and on to reuse. Built around two portals, one for survey administration and one for research data management, it builds on previous DIPF projects but notably attempts to address their failings, specifically by adopting existing documentation standards for interoperability and incorporating existing tools. To illustrate: a DDI-based questionnaire can be created using Qbee (Questionnaire Builder), translated (OLT, predecessor of Translation Builder), delivered to interviewers (PIAAC CAPI System, predecessor of Case Builder Administrator), used to produce an overview of paradata (also the PIAAC CAPI System), and finally the data can be disseminated with a portal solution (NEPS Portal, predecessor of Rogatus Portal). Currently funded to the end of 2014, Rogatus will add further functionality: DDI 3.2 compliance, Blaise and NIPO import/export, an additional CAWI server (web-based delivery), a web template for mobile clients, and a mobile client for sampling.
The session then moved from tools to policies and training. Irena Vipavc Brvar from the Slovenian Social Science Data Archive (ADP) spoke on data management planning, giving an overview of what it is and where it is increasingly required. Her presentation raised what became an emergent theme of the session: that researcher credit for sharing data is an important incentive for implementing effective data management. She revealed that in Slovenia, archiving research data is treated as equivalent to a publication in terms of research output and therefore counts towards a researcher’s promotion record. Yet journals emerge as an issue. Although more journals are adopting data policies that require the data underpinning publications to be available for reuse, this is more a natural sciences phenomenon than a social science one. Vipavc Brvar quoted a figure from Savage & Vickers (2009) that, despite such requirements, only one in ten datasets underpinning a publication was actually made available to researchers.
Bielefeld University’s Stefan Friedhoff from the “Information and Data Infrastructure” (INF) project spoke about the project’s aim of providing well-documented, harmonized data across quantitative, qualitative, and mixed-methods research. His presentation identified three types of problem compromising this: methodology, granularity, and acceptance.
Methodologically, qualitative social research is underdeveloped in research data management compared to its quantitative counterpart. The basic problems are a non-standardized approach, a qualitative research lifecycle that is non-linear, and a field containing different concepts and types of “data”. INF’s solution is to develop controlled vocabularies based on project documentation and to gather information on previous documentation strategies to improve standards (DDI-based).

Granularity concerns the problem of when enough is enough: how detailed must documentation be before it is sufficient to allow someone else to reuse the data? INF addresses this by developing its own selection of relevant elements with which to document data, elements based on researchers’ documentation practices. They combine archive requirements with researcher workflows by adapting to current documentation routines.

Finally, as always, there is acceptance. The project found many researchers don’t understand the advantages of standardized documentation, offering excuses on a variant of “I don’t research in a standard way”. Likewise, changing researchers’ workflows is resisted because they have “always worked this way”. And there is an absence of motivation to take data management seriously because “I am a researcher, not an archivist”. Here the project adopts a two-way strategy: decrease the effort required and increase the benefit obtained. Effort is decreased by supporting Virtual Research Environments that are semi-automated and process-driven, combining tools in one place and continuously producing documentation. Benefit is increased by supplying data citation devices and standards that promote the visibility of researchers’ work. Solutions to all three problems are underpinned by feedback loops in which RDM specialists provide initial support tools and improve them based on researcher feedback. We data management advisors should provide useful and usable solutions.
Completing the first session was GESIS’s Alexia Katsanidou, who drew on standards and the importance of publications by comparing the development of documentation standards over a 30-year period, using the European Values Survey as a case study. The paper opened by comparing CESSDA member archives’ data catalogue fields to identify common metadata standards, then comparing these to the methodology sections of published papers. Its epistemological basis was King’s (1995) assertion that social research must be replicable: could publications be replicated using an article’s methodology section and a data catalogue entry? Comparing EVS waves shows that the amount and sophistication of EVS documentation has grown. The publication picture is less clear: the first wave was essentially primary research rather than re-use (publications came from the EVS PIs), and the latest wave is too recent to show many publication outputs (the data only became available within the past two years). But the third and fourth waves show a difference. In publications derived from the 1990 wave, the re-use community widened, but no standards for methods sections and documentation existed. For publications based on the 1999 wave, stable replication and documentation standards emerge. Moreover, publications based on EVS waves show a trend towards increased use of data citation and treating the EVS dataset itself as a publication.
Part two to follow after a virtual coffee/tea break…