Changing horizons from data to capta: plurality in humanities research infrastructures


Logical positivism, constructive empiricism, explanatory models, verificationism and falsifiability are just some of the terms that define methodological discussions in the humanistic sciences. A core argument in these discussions concerns the description of phenomena. Research in fields such as linguistics and socio-economic history has often adopted an inherently positivist approach, which attempts to explain concepts from empirically verifiable data and searches for (large) patterns and generalized models. Other fields, such as political history, cultural studies and theology, follow a more hermeneutic approach in which the subject, the scholar, interprets a phenomenon in its context. Events are to be considered against a changing horizon (Gadamer), and explanations differ by subject and point of view. Without delving too deeply into the theoretical recesses of these debates, it is clear that these methodological approaches, although not mutually exclusive, stand in stark contrast to each other.

Many studies in literature, and typically those within the COST-WWIH IS0901 Action, can be considered part of the hermeneutic tradition. Examples are the debates on the nature of literary reception documents, which focus on the context of feminine networks, and distribution studies of national literatures placed in a geo-temporal setting. Mapping a network of authors and literary works across changing international borders, with shifting definitions of nationality, gender and ethnic background, results almost automatically in a large number of differing views among scholars. Often, these assessments are incompatible at best and contradictory at worst.

How do these methodological discussions influence the technical design of research data infrastructures? Information technology in general favours the positivist approach and does not cope well with varying interpretations. It excels at discovering patterns and laws in the ‘big data’ sets of both linguists and socio-economic historians, and naturally assumes that an independent ‘true’ state of a (network of) object(s) can be determined and stored in a database. Consequently, technologists in the digital humanities aim for (meta-)data standardization, the construction of ontologies and the establishment of vocabularies. However, this singular positivist modelling approach pushes research towards a pattern-based causal determinism that does not accommodate the often sparse, conflicting and heterogeneous character of data in the humanistic sciences.

Illustrated with examples from literary reception records, this paper focuses on plurality in research data infrastructure. Through the inclusion of ‘computable hermeneutics’, it bridges the methodological divide in humanities research. The paper first describes the requirements Huygens ING has defined for a central, generic research data repository that allows scholars to remodel subsets of data in order to capture their precise definitions, interpretations and ideas. Peter Checkland introduced the term ‘capta’ for such subsets, a term that Johanna Drucker later extended to the digital humanities. The paper subsequently compares these requirements with the generic RDF standard for the semantic web and the CLARIN standards CMDI/ISOcat. It will show that by embracing hermeneutic poly-interpretation, a research infrastructure can be developed that steps beyond the detection of patterns towards an environment that assists scholars in giving meaning and value to detected phenomena.
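
To make the contrast with a single ‘true’ database state concrete, the following is a minimal sketch in Python of a store that keeps conflicting statements side by side and lets each scholar derive their own view. It is an illustration under stated assumptions, not the Huygens ING design: the class names (Assertion, PluralStore), the method names (record, capta, disagreements) and all identifiers and values in the example are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    """One scholar's statement about one property of one record (hypothetical model)."""
    subject: str   # identifier of the described record, e.g. an author
    prop: str      # the property being interpreted, e.g. "nationality"
    value: str     # the value this scholar assigns
    scholar: str   # who holds this interpretation


class PluralStore:
    """Keeps every assertion instead of overwriting it with a single 'true' value."""

    def __init__(self):
        self._assertions = []

    def record(self, subject, prop, value, scholar):
        self._assertions.append(Assertion(subject, prop, value, scholar))

    def capta(self, scholar):
        """Return the subset of the data as interpreted by one scholar."""
        view = defaultdict(dict)
        for a in self._assertions:
            if a.scholar == scholar:
                view[a.subject][a.prop] = a.value
        return {subject: dict(props) for subject, props in view.items()}

    def disagreements(self):
        """Return the (subject, property) pairs that carry more than one value."""
        values = defaultdict(set)
        for a in self._assertions:
            values[(a.subject, a.prop)].add(a.value)
        return {key: vals for key, vals in values.items() if len(vals) > 1}


# Hypothetical example: two scholars classify the same author differently.
store = PluralStore()
store.record("author-001", "nationality", "Dutch", "scholar_a")
store.record("author-001", "nationality", "Flemish", "scholar_b")
store.record("author-001", "gender", "female", "scholar_a")

print(store.capta("scholar_a"))
print(store.disagreements())
```

In this sketch a disagreement is not an error to be resolved before storage; it remains queryable information, which is one possible reading of what a plural infrastructure would need to support.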

