CESSDA Training doesn’t live here anymore…

Please note that we moved our blog to a new page. You’ll now find us at https://cessdatraining.wordpress.com/.


A DASISH workshop on trust and certification

This post is co-written with Claudia Engelhardt, Göttingen State and University Library. It is also published on the DHd-Blog.

On 16-17 October 2014, a workshop on the topic of trusted digital repositories took place in The Hague. The workshop was organised by DASISH, a project striving to increase the overall quality of services for data management, curation and dissemination offered in the five Social Science and Humanities (SSH) Research Infrastructures that have been on the ESFRI Roadmap: CLARIN, CESSDA, DARIAH, ESS, and SHARE.

The workshop’s focus was on the tools and standards for audit and certification comprised in the European Framework for Audit and Certification of Digital Repositories: the Data Seal of Approval (DSA), DIN 31644/nestor Seal, and ISO 16363. Workshop presentations (available on the DASISH webpage) and discussions dealt with getting to know the standards, the conditions for their implementation as well as their current use in the Social Sciences and Humanities.

The European Framework for Audit and Certification was established in 2010 with the objective of better coordinating and structuring the emerging landscape of audit and certification procedures. It defines three levels of certification:

  • basic certification: equivalent to the Data Seal of Approval;
  • extended certification: granted to archives / repositories which have obtained the DSA and successfully undergone an external peer review based on DIN 31644 or ISO 16363;
  • formal certification: granted to archives / repositories which have obtained the DSA and successfully completed a full external audit based on DIN 31644 or ISO 16363.
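
For readers who prefer to see the three levels as a decision rule, here is a minimal sketch in Python. It is only an illustration of the list above, not part of the framework itself: the function name, the `review` parameter and its values `"peer_review"` and `"full_audit"` are hypothetical labels chosen for this example, and both kinds of review are assumed to be based on DIN 31644 or ISO 16363.

```python
from enum import Enum
from typing import Optional


class Level(Enum):
    BASIC = "basic certification"
    EXTENDED = "extended certification"
    FORMAL = "formal certification"


def certification_level(has_dsa: bool, review: Optional[str] = None) -> Optional[Level]:
    """Map a repository's status to a level of the framework.

    `review` is a hypothetical field for this sketch: None, "peer_review"
    or "full_audit", each assumed to be based on DIN 31644 or ISO 16363.
    """
    # Every level presupposes the Data Seal of Approval (DSA).
    if not has_dsa:
        return None
    # DSA plus a full external audit gives formal certification.
    if review == "full_audit":
        return Level.FORMAL
    # DSA plus an external peer review gives extended certification.
    if review == "peer_review":
        return Level.EXTENDED
    # The DSA on its own is equivalent to basic certification.
    return Level.BASIC
```

The sketch makes one property of the framework explicit: without the DSA, no level applies, whatever other review has taken place.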

In the first part of the workshop, representatives of three European SSH infrastructures – Vigdis Kvalheim from CESSDA, Pavel Straňák from CLARIN and Henk Harmsen from DARIAH-EU – talked about the importance and use of certification standards in their respective infrastructures. In all of them, certification of member data archives and repositories plays a role, with the Data Seal of Approval being the instrument most commonly employed. In CLARIN, one of the requirements for becoming an infrastructure centre or a service providing centre is to undergo certification through the DSA or the MOIMS-RAC approach. CESSDA AS is also working towards integrating the DSA into the set of obligations of service providers. In DARIAH-EU, certification is one of five short-term goals. One concrete aim in this context is that the repositories that form the backbone of the DARIAH infrastructure will have obtained the DSA by 2016.

The ensuing discussion focused on drivers behind the decision to undergo audit and certification:

  • There was consensus in the group that audit and certification are becoming increasingly important to satisfy funder requirements. This is specifically the case for publicly funded institutions, which, in order to receive funding, are expected to prove that they are capable of offering high-quality preservation / curation services in accordance with international standards. From this perspective, acquiring certification is equivalent to creating a competitive advantage.
  • However, “self-assurance” was an equally important aspect pointed out by representatives from archives that have already undergone an audit / certification procedure, or are planning to do so in the near future. Audits were regarded as an important instrument for determining whether the preservation / curation procedures and workflows of the archive are adequate. Accordingly, audit procedures were used to support the detection of gaps and potential risks.
  • At the same time, there seemed to be consensus that in the SSH community demands from users are currently not a considerable driving force behind the decision to undergo external audit / certification. This could change in the future, especially if the different seals or “badges” are recognized as an indicator of high-quality services by users.

The workshop continued with presentations of the DSA (Paul Trilsbeek) and the nestor Seal (Dr. Christian Keitel) audit standards as well as several case studies from the different ESFRI projects (specifically, LINDAT, DANS, and GESIS). Finally, Barbara Sierman presented the current state of ISO 16363.

The subsequent discussion dealt with the question of whether every data service has to be certified, ways of lowering the threshold for entering audit and certification, and alternative ways of creating trust – specifically with an eye to smaller data archives or repositories with very limited resources.

  • There was consensus that not every data service needs to be certified. But the decision on whether certification should be pursued or not should not only depend on the available resources of a data service, but also on its nature. As an example, participants referred to the front office-back office model employed in the Netherlands. In this approach, the responsibility for the long-term preservation and availability of research outputs lies with the “back office” organisations (centres with a national scope such as DANS or 3TU.Datacentrum), whereas the “front office” institutions (located at higher education institutions, research institutes etc.) concentrate on communication with and support of data producers and users on a local level. In line with this division of responsibilities, certification is deemed necessary only for the “back office” organisations.
  • In terms of the effort required, the DSA was seen as a suitable “entrance point” to certification even for smaller institutions. The data archives that have already obtained the DSA include one-person archives, which shows that the DSA audit procedure is doable even with limited human and financial resources. It was also noted that to a certain extent the necessary time and resources are a question of scale. While bigger archives have more resources, their size also makes the process of documenting procedures and of creating required policies more time-consuming.
  • The group also discussed measures for creating trust that can be undertaken independently of certification. It was deemed very important to enable users to do their own “trust checks” at object level and thereby evaluate for themselves whether a digital object is authentic or not. To make this possible, archives have to engage in transparent communication with their designated community. Another important point in this context is the careful consideration of the significant properties of the information objects to be preserved and their adequacy to the needs of the user community.

Overall, the discussions showed that although the preservation landscape in the SSH domain is moving towards more standardisation and greater homogeneity with regard to audit and certification, it is neither necessary nor desirable to tar all archives with the same brush. Thus tiered or multi-level approaches such as the Dutch front office-back office model or the European Framework for Audit and Certification make it possible to achieve standardisation without losing sight of scale and archive- or discipline-specific requirements.

A report about the workshop from a participant’s point of view is available on the blog “Bits & Pieces. Digital Preservation at Edinburgh University”: report of day 1, report of day 2.


Destination: CESSDA Training

As you may have noticed here and there, some developments have been going on behind the scenes at the Archive and Data Management Training Center. Since the official launch of CESSDA AS in Bergen on December 5th, 2013, we have been busy discussing and determining our future role in this new pan-European research data infrastructure.

We are very excited to announce that we are now officially the training provider and coordinator for all CESSDA trainings and will offer our services under the name of “CESSDA Training”.

The Archive and Data Management Training Center is dead! Long live CESSDA Training!

CESSDA Training will continue to be hosted by GESIS – Leibniz Institute for the Social Sciences. We will continue to offer our introductory digital preservation (see here and here) and research data management (RDM) workshops (see here and here) as well as consulting activity in these fields.

But on top of this, there are exciting plans for new activities and services that will be implemented in early autumn 2014. CESSDA Training will focus on three types of work:

  1. Digital preservation training for all CESSDA members promoting integrating activities for younger archives;
  2. Data management training for all academic audiences across CESSDA countries and beyond, and
  3. Promoting training activities across the CESSDA partners, to make sure no training activity passes unnoticed.

In the coming weeks and months, watch out for the following changes to our web presence and communication channels:

  • In the future, we will tweet as @CESSDAtraining. If you already follow us under @archivetraining, you will continue to receive our tweets under the new name.
  • This blog will be incorporated into the CESSDA Blog. We will post a proper announcement and a link here, once the new blog has been established.
  • The content of our webpage will move to the CESSDA page.

There are exciting times ahead – we hope you’ll join us for the ride!

Image: Twisted by Beyond Neon (cc-by)



The long journey of the research penny

National governments and ministries of research and education have an interest in seeing their money travel far, and not only as far as the unexplored edges of science. The real target is eternity. To achieve eternity, research funding bodies across the globe show an increased awareness of the importance of sustainable research, including data management, data sharing and open access. Among the benefits of these practices are that data are preserved “for ever”, and that research is conducted in a transparent way, meets high quality standards, and therefore allows future discoveries to be built on it. That is as close to scientific eternity as it gets. Thus, the main motivation is to ensure the availability of scientifically collected data for secondary use.

Lunar Roving Vehicle (LRV) during the Apollo 15 mission, July, 1971 (CC0)


The increasing amount of data collected by researchers across disciplines, together with the development of research methodology, boosts the potential for interdisciplinary work. Old data can be used in many new and different ways, helping to increase the value taxpayers get for their money, an argument that, for many governments, speaks to increasing the impact of research funding.

National research funders have responded to this trend by putting together data policies. The International Federation of Data Organizations (IFDO), consisting of data archives and infrastructure institutes in the social sciences, surveyed its own members in 2013 with the purpose of collecting information on current national data policies. The survey focused mainly on the data policies of key social science research funders in the respondents’ country of operation. Information was collected through a web survey to which 43 individuals from 32 countries responded, of which 18 are European and 10 ‘non-Western’ countries.

The results show that there is a distinct movement towards formalizing the developments described above in clearly formulated data policies. These policies vary across countries in strength, precision, level of obligation, and support for implementation. In most countries there is a general data sharing requirement in place, while some go a step further and oblige researchers to follow open data and data preservation standards.

These developments are certainly moving in the right direction. However, what also became clear from the IFDO web survey is that these data policies do not always come with a detailed explanation of their expectations or how they should be fulfilled. They remain vague, leaving it to the researchers to interpret the policy and act accordingly. Even in cases where data sharing is stated as obligatory, it is not always coupled with data management support throughout the research cycle. Only very few countries seem to offer the full service, namely the USA, the UK, Canada, Australia, Finland and Switzerland. To ensure the success of such policies, research infrastructures must be in place to take the load off the researchers’ shoulders.

Most European countries that do not oblige their researchers to adhere to their data policies offer established data infrastructure institutions and a long tradition in data sharing and secondary data use. The ministries of research and education have invested in data sharing to bring data sharing infrastructures to a high standard – however, without developing the policy side of things. On the operative level, they do all it takes to ensure high quality data management and secondary use, but on the policy level they are not there yet. Thus, even though the infrastructure is there, the policy side is missing. This is a discrepancy, which does not allow for optimization as it does not ensure the survival of all research data. Only those researchers already convinced of the value of data sharing use the infrastructures.

To give taxpayers’ money the longest possible journey in research, we need to have both things in place: supporting and enforcing institutions and high-level data policies. As the IFDO summary report concludes:

“The future success of efforts in this area relies on the ability of policy makers and funders to move from high-policy statements to policy enforcements and monitoring and from short-term funding to long-term funding and institutional models that build trust and confidence” [1].

More information at www.ifdo.org


[1] Kvalheim, V. & Kvamme, T., 2014. Policies for Sharing Research Data in Social Sciences and Humanities. A survey about research funders’ data policies, p. 36. Available at: http://ifdo.org/wordpress/wp-content/uploads/2014/04/ifdo_survey_report.pdf


Walking in our shoes! Training the trainers in digital preservation

This post is co-written with our former colleague Laurence Horton (London School of Economics and Political Science). It is based on a workshop presented at Archiving 2014, Berlin and IASSIST 2014, Toronto.

The CESSDA Archive and Data Management Training Center was established in mid-2011 to promote awareness of digital preservation roles and responsibilities and produce more digital preservation experts. One aspect of its activity is a “First steps towards Digital Preservation” course designed for small groups of individuals either beginning to undertake or charged with digital preservation responsibilities.

The course assumes participants have no prior knowledge of digital preservation. Conceptually, it uses the “three-legged stool” approach developed for Cornell University’s digital preservation management workshop and tutorials, where the three legs of a stool act as metaphors for technological infrastructure, organisational infrastructure, and requisite resources. All three are integral, related components. The stool would collapse if one leg were missing or defective. Likewise, digital preservation strategies fail (or, to stick with the metaphor, you’d be sat on the floor) if one of your infrastructure or resource legs is missing or faulty. However, within the course there is an emphasis on the organisational component over, but not to the detriment of, the technological and resource components. There are two reasons why. First, the organisational dimension tends to be neglected or treated as an afterthought once a digital preservation system or service is already established. Second, although each of the legs of the stool is indispensable, the organisational focus on “policies, procedures, practices, people” provides a framework for digital preservation: defining goals and operating conditions, specifying limitations, and determining procedures for day-to-day routines and non-routine emergency situations.

In designing the course we were aware of the need for flexibility in how it could be taught. Six modules were constructed. Each one is based on a presentation to acquaint participants with the topic, its key messages, and related issues. These are followed (except the introduction) by an exercise designed for reinforcement, self-assessment, or further discussion. For instructors, each module comes with a synopsis of its content, a set of learning objectives, full notes for the presentation slides, and bibliographies for further reading.

Around two hours are required to cover the content of a module. Consequently, the course can either be taught intensively over two days or spread out over a longer period. Modules can be used as standalone resources, although greater value can be extracted if subsequent modules build on an introduction to, or basic knowledge of, the Open Archival Information System (OAIS) reference model.

  1. What is digital preservation?
    Introduces digital preservation on a conceptual level, including an outline of key digital preservation terminology. The BBC Domesday Project is used as a case study to illustrate the challenges involved in digital preservation.
  2. Open Archival Information System (OAIS) reference model
    The concepts and terminology of the OAIS model are introduced with emphasis on the functional and information model and the concept of Preservation Description Information.
  3. Designated communities
    Defining who you are preserving objects for, establishing their needs and identifying strategies to meet them. Participants are prompted to think about who would use objects from their work and how at 5, 20, and 100 years in the future.
  4. Policies
    Looking at organisational documents necessary to guide digital preservation, how they relate to each other, and their importance as a communication tool. This includes acquisition or collection, preservation, dissemination, and continuity policies. The exercise invites participants to undertake and discuss an acquisitions policy self-assessment.
  5. Licensing
    Presenting the concept of Intellectual Property Rights and illustrating how they affect preservation and re-use. The session looks at licensing as a tool to protect intellectual property in preservation and re-use, restrictions on access and re-use, and issues of enforceability and attribution stacking. A hypothetical scenario asks participants to review a data submission in the context of possible licensing issues.
  6. Trusted digital repositories
    Focusing on the concept of “trust” both within and outside an organisation and its importance to digital preservation, this session goes on to introduce recognised standards of trustworthiness archives and repositories can use to build trust with designated communities and peer organisations.

All course presentations, notes, and exercises are available for free under a CC-BY (3.0) licence from GESIS’s online learning platform (registration required). Readers are encouraged to use and adapt contents (with attribution) to either find out how digital preservation is their job or go forward helping ensure there are more digital preservation experts.


RDM? Ja bitte! German Rectors’ Conference actively supports Research Data Management

A recommendation published by the German Rectors’ Conference (Hochschulrektorenkonferenz, HRK) asserts that German Higher Education Institutions (HEIs) should take an active role in systematically addressing the challenge of research data management [1]. An association of currently 268 HEIs serving 94% of the German student body,

“[t]he HRK is the political and public voice of universities and provides a forum for the process of forming joint policies and practices. The HRK addresses all manner of topics related to universities: research, teaching and learning, continuing professional education for academics, knowledge and technology transfer, international cooperation, and administrative self-management” [2].

This was not unexpected as over time German HEIs have demonstrated a growing awareness of the importance of RDM, and many universities have already acted upon it. Leading institutions such as, for example, Bielefeld University, Humboldt-University Berlin, Christian Albrechts University Kiel and University of Mannheim have begun to build up services to promote and support the systematic management of research data in their institutions [3][4].

The HRK document bases its recommendation on current trends in research and research data production (e.g. big data, increasing heterogeneity of data, new methods and research processes). A central place is held by the acknowledgement that the implementation of efficient RDM practices is an important competitive factor for universities (see p. 3). The HRK therefore recommends that university leadership “establish a foundation for scientific (research) work by taking responsibility to create an environment which enables the researchers in their institution and German science in general to manage digital data efficiently, easily, and on a secure legal basis” (p. 3, our translation). To achieve this end the HRK makes the following high-level recommendations for universities:

  1. Create a common principle and guideline for the management of research data in a process involving all relevant stakeholders (see p. 4).
  2. Foster cooperation between HEIs, non-university research institutions and discipline-specific infrastructures to facilitate cooperative research and data management across institutions (ibid.).
  3. Promote information literacy (of which RDM skills are considered a subset) among scientists (ibid.).
  4. Create an organizational and technological infrastructure in each institution to support RDM throughout the entire lifecycle of research data (see p. 5).

While the HRK recommendation rightly asserts that the responsibility to actively manage data lies with the individual researchers, it leaves no doubt that it is HEIs, and more particularly university leadership, that are responsible for creating an environment in which this is not only possible, but also fostered. It remains to be seen how this high-level recommendation, which is to be complemented by concrete suggestions for the development of RDM in German HEIs (see p. 5), will be translated into action. But it is an important signal – to researchers, who face increasingly strict requirements to make their data re-usable and available for replication, and to make their data management practices transparent; and to German research funders, who have only just begun to formulate stricter data policies and requirements for data management plans. Institutional support from the universities will be very welcome.


[1] HRK Hochschulrektorenkonferenz, 2014. Empfehlung der 16. Mitgliederversammlung der HRK am 13. Mai 2014 in Frankfurt am Main. Management von Forschungsdaten – eine zentrale strategische Herausforderung für Hochschulleitungen. Available at: http://www.hrk.de/uploads/tx_szconvention/HRK_Empfehlung_Forschungsdaten_13052014_01.pdf. Accessed: 20 May 2014

[2] HRK, n.d. About the HRK. Available at: http://www.hrk.de/hrk-at-a-glance/. Accessed: 20 May 2014

[3] Burger, M. et al., 2013. Forschungsdatenmanagement an Hochschulen: Internationaler Überblick und Aspekte eines Konzepts für die Humboldt-Universität zu Berlin. Available at: http://nbn-resolving.de/urn:nbn:de:kobv:11-100210226. Accessed: 20 May 2014

[4] Kindling, M., Schirmbacher, P. & Simukovic, E., 2013. Forschungsdatenmanagement an Hochschulen: Das Beispiel der Humboldt-Universität zu Berlin. LIBREAS. Library Ideas, (23), pp.43–63. Available at: http://libreas.eu/ausgabe23/07kindling/. Accessed: 20 May 2014
