Monday, November 9 was my first visit to the University of Leipzig Augustusplatz campus. The Digital Humanities chair and AKSW are on the 6th and 9th floors of the sharp, new Paulinum building, erected on the site of the medieval Paulinerkirche, demolished in 1968.
I spent some time catching up with Maxim Romanov, a digital humanist and Middle East historian whom I’ve known for quite a few years. Maxim’s great methodological insight is to use text mining techniques on the vast digital corpus of texts from the Islamic scholarly tradition that was produced by many laboring hands in keyboarding sweatshops in the Gulf during the 1990s and the 2000s. Maxim is the first digital historian to treat these texts as the medieval databases that they are. His dissertation (Michigan 2013) was an analysis of al-Dhahabi’s 14th-century biographical dictionary Tarikh al-Islam. He developed data-mining techniques to parse its 30,000 entries, and his analysis paid particular attention to the ways that centers of scholarly influence shifted over time. He’s posted some fine visualizations of that material on his excellent website.
After completing his PhD, Maxim worked with Gregory Crane at Tufts on the Perseus project, an advanced and multi-faceted classical library. Maxim’s work expanded in many directions, as his quite stunning directory of projects makes clear. Digital scholarship is more advanced in classics than in any other field of the humanities, largely because scholars facing the problem of scarcity must do more to squeeze every last drop of significance from their sources. Already in the 1980s and 1990s, classicists were developing comprehensive digital corpora unlike anything available in other fields. They therefore had the foundation of a digital technology stack ready when digital methods became widely adopted in the last decade, and projects such as Perseus and Pelagios were able to make rapid progress.
In the field of Islamic Studies, medievalists such as Maxim have recognized the structural similarities between their field and “mainstream” classics. This recognition has gone both ways, as classicists working with Greek and Latin sources have recognized the dividends that scholars of Arabic, Syriac, and other Middle Eastern languages can offer to their own field. So mapping projects such as Recogito have begun to include Arabic and even Chinese sources, and the patrons of digital humanities see the value of language-based portals like Syriaca, which will support the development of robust corpus layers on which future research can build. And so it is no surprise that Maxim’s rare expertise in Islamic digital humanities has attracted the attention and support of Professor Crane’s chair at Leipzig.
The Arabic (and Persian) corpora that Maxim is using are not of the same quality or accuracy as those of the classicists; the most significant difference is the utter lack of metadata in the Islamic texts. It will take many years of editing and correction, by many people, before they reach that standard. But Maxim is working on tools to assist in that process, and he is doing excellent work with the texts he already has. During my visit, he showed me a tool that works rather like Turnitin by checking for passages in one text that occur in other texts. He has run some 50,000 works through this tool, and with the results he can show quite precisely how Islamic scholars used source material in their own compositions. This data has great significance for the mapping of scholarly networks, which is one of Maxim’s great interests. Previously, only explicit connections between scholars (as student and teacher, for example) were detectable. With this new analysis, we can better understand intellectual and not merely personal connections between these thinkers.
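For readers curious how passage matching of this kind can work in principle, here is a minimal Python sketch. It is not Maxim's tool, whose internals I did not see; it assumes one common approach, in which each text is broken into overlapping runs of words ("shingles") and an index records which works contain each run. The function names, the five-word window, and the toy English corpus are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def shingles(text, n=5):
    """Return the set of overlapping word n-grams (shingles) in a text."""
    words = text.split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_passages(corpus, n=5):
    """Map each pair of work ids to the shingles the two works share.

    corpus is a dict of {work_id: text}. Building an inverted index from
    shingle to work ids avoids comparing every pair of works directly.
    """
    index = defaultdict(set)
    for work_id, text in corpus.items():
        for shingle in shingles(text, n):
            index[shingle].add(work_id)

    reuse = defaultdict(set)
    for shingle, work_ids in index.items():
        # Any shingle found in two or more works links each pair of them.
        for pair in combinations(sorted(work_ids), 2):
            reuse[pair].add(shingle)
    return dict(reuse)


# Hypothetical toy corpus; real input would be full Arabic or Persian works.
corpus = {
    "A": "the scholar travelled from Baghdad to Damascus to study hadith",
    "B": "it is reported that the scholar travelled from Baghdad to Damascus in his youth",
    "C": "he wrote many books on grammar and law during his life",
}

for (first, second), passages in shared_passages(corpus).items():
    print(first, second, len(passages))
```

On this toy corpus only works A and B are linked, through their shared wording about the journey from Baghdad to Damascus. A real pipeline would presumably also need to normalize the text first (for example, removing diacritics and smoothing over spelling variants) and to tolerate small differences in wording, which exact matching like this does not.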