Last week Matteo Romanello and I visited Marco Büchler and his colleagues in the eTraces project, based in Leipzig. We were introduced to the fascinating work the eTraces team is doing at the interface between computer science and the humanities, and we also got the opportunity to present our own work. The slides of my presentation can be found on my bibliography page or on Scribd.
Here is some information about the eTraces project:
Whether in science or in everyday life, our language contains numerous traces of our cultural legacy in the form of winged words and quotations. Scientists are now creating new software tools to make this cultural legacy available in digital libraries. With the help of these programs, the origin and dissemination of text passages, quotes and common phrases can be reconstructed quickly and easily.
The main focus of “eTRACES” (the name of the project) is on the temporal traces and interconnections of text passages in German-language novels written between 1500 and 1900, as well as in social science texts written since 1909. The project partners are the Chair for Natural Language Processing at the University of Leipzig (ASV), the Göttingen Centre for Digital Humanities (GCDH), and GESIS – Leibniz Institute for the Social Sciences in Bonn.
The Federal Ministry of Education and Research (BMBF) is providing funding of about €1.2 million over a period of three years. “The cooperation of computer science experts with the specialists from the humanities and the social sciences bears great potential for the advancement of all three disciplines,” explains State Secretary Cornelia Quennet-Thielen from the BMBF. She further emphasized that eTRACES is an exemplary implementation of the “Recommendations on Research Infrastructure for the Humanities and Social Sciences” recently issued by the Science Council.
Harnessing the latest techniques in text mining, the project is to develop and test new methods for distinguishing the intentional re-use of a text passage from its use as a common text block or boilerplate. Further attention is given to analyzing and visualizing the geographical, temporal and semantic cross-linking of citations.
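To make the distinction between re-use and boilerplate more concrete, here is a minimal sketch in Python. It is not the eTRACES method. It assumes a simple heuristic: a word n-gram shared by a few documents is a candidate for intentional re-use, while one shared by a large share of the corpus is treated as boilerplate. The function names, the n-gram length and the `boilerplate_share` threshold are all hypothetical choices for illustration.

```python
from collections import defaultdict


def shingles(text, n=3):
    """Return the set of word n-grams (shingles) in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def classify_shared_shingles(corpus, n=3, boilerplate_share=0.5):
    """Label every shingle found in two or more documents.

    corpus: dict mapping a document id to its text.
    Assumption: a shingle occurring in more than `boilerplate_share`
    of all documents is boilerplate; otherwise it is candidate re-use.
    """
    doc_ids_by_shingle = defaultdict(set)
    for doc_id, text in corpus.items():
        for shingle in shingles(text, n):
            doc_ids_by_shingle[shingle].add(doc_id)

    labels = {}
    for shingle, doc_ids in doc_ids_by_shingle.items():
        if len(doc_ids) < 2:
            continue  # not shared between documents, so no re-use at all
        share = len(doc_ids) / len(corpus)
        label = "boilerplate" if share > boilerplate_share else "re-use"
        labels[" ".join(shingle)] = label
    return labels
```

A real system would also have to cope with spelling variation and paraphrase, which exact n-gram matching cannot capture.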
In the literary studies application (partner Göttingen), the central question is which practices of text re-use shaped the history of German novels. The initial object of research is the Luther Bible. The central question posed by the partner GESIS concerns the textual differentiation between qualitative and quantitative social research. The computer science application (ASV Leipzig) uses the information on text re-use to build a search engine that also considers how frequently a text or text fragment is cited when determining its relevance.
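The search engine idea can be illustrated with a small Python sketch. Again, this is only an assumption about how citation frequency might enter a relevance score, not the ASV implementation: the score is the share of query terms found in a passage, boosted by the logarithm of a citation count. The names `relevance` and `search` and the `weight` parameter are hypothetical.

```python
import math


def relevance(query, text, citation_count, weight=1.0):
    """Share of query terms matched, boosted by a log-scaled citation count."""
    query_terms = set(query.lower().split())
    text_terms = set(text.lower().split())
    if not query_terms:
        return 0.0
    overlap = len(query_terms & text_terms) / len(query_terms)
    # log1p dampens very large counts and leaves uncited passages unboosted
    return overlap * (1.0 + weight * math.log1p(citation_count))


def search(query, passages, weight=1.0):
    """Rank (text, citation_count) pairs by score, dropping non-matches."""
    scored = [(relevance(query, text, count, weight), text)
              for text, count in passages]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [text for score, text in ranked if score > 0]
```

Because the citation boost multiplies the term overlap, a frequently cited passage that does not match the query is still excluded from the results.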