Barcelona was the venue for the first EUDAT conference, and we were there. EUDAT, for those who don’t know, is an attempt to bring together data infrastructure providers and practitioners to address the challenges of interoperability and of cross-disciplinary, cross-border collaborative research, and to provide solutions to them.
What we found in Barcelona was, as promised, a wide-ranging audience. What was disappointing from our perspective, however, was how underrepresented the social sciences were. Aside from ourselves, only DASISH (which is partly social science) was there as a social science infrastructure, and that was mostly in a technically oriented capacity.
Don’t worry, we did not feel lonely. It was insightful to see what other disciplines are doing in collecting and sharing data, and there was plenty of good food and wine to facilitate conversation and socializing. However, there was a feeling of estrangement between what we think it means to share data and what others think. For natural scientists the idea is to share with all (open data) or, better still, to crowdsource data collection or analysis (for example, the SETI project). We could never do that. If you are wondering why, the answer is that we mostly deal with human subjects who, unlike the stars in the sky, tend to be a bit sensitive about seeing information about their lives on display for all.
One other disjunction concerned a project in which Google pairs up with partners who pay for part of the research, with the results then brought to a wider audience. While the project was well received, we felt a sense of discomfort: firstly with the ethics of selling data from public funding to subsidize corporate profit, but also because nobody was talking about the implications of data protection laws or intellectual property rights, or urging caution over the future sensitivity of data, for example the ability to cross-link data in ever more elaborate and complex ways.
The thought occurred to us that this was long-term preservation, but only in terms of short-term usage. The only area where there appeared to be long-term thinking was storage capacity, and then only in terms of bytes, not a critical issue like accessibility. Maybe it’s a reflection of the behavior of disciplines that release data as soon as it is collected, as opposed to us social scientists who, for good reasons and bad, tend to sit on data a while before sharing, and then often through an archive or some similar dissemination service. However, the social sciences have done a lot of work on sharing data in ways that make sense: providing context and meaning through structured metadata (DDI) and working towards persistent identifiers for data citation (DOIs). These are all things that not only assist preservation but also help ensure data can be reused responsibly in the future. Yet nobody was there to represent this work, apart from us.
So, in conclusion, we need to make our voice heard in these grand movements towards collaborative data infrastructures, because we have a lot to say and it’s worth saying.