David Kretz and Jeff Tharsen convert Messkataloge to machine-readable “plaintext”

The Center for Interdisciplinary Research on German Literature and Culture is supporting a recent effort to open a new chapter in German book history (Buchgeschichte) using the resources of UChicago's Digital Humanities community. German national bibliographies exist for the 16th, 17th, and 18th centuries. However, only those for the 17th century can lay claim to near-comprehensiveness, and none exists for the 19th century. What does exist are the Messkataloge, the printed book catalogs created for the Frankfurt and Leipzig book fairs from 1594 to 1860. Until recently, these did not exist in machine-readable digital form. Harnessing today's most advanced Optical Character Recognition (OCR) system, Jeff Tharsen, Associate Technology Director for Digital Studies, and David Kretz, Ph.D. candidate in German Studies and Social Thought, converted all 65,658 pages of the Messkataloge to machine-readable “plaintext”, achieving over 97% accuracy and producing the first digital database of the Messkataloge. This work opens up exciting new possibilities: scholars can now search the data for insights into, for example, where books were published during the formative periods of German publishing, the rise and fall of particular genres and of religious writings, the flows of translations, or the declining dominance of Latin, with a precision hitherto unmatched, particularly for the 19th century.