Historical research often involves working with highly diverse and complex source materials, ranging from handwritten manuscripts to large, heterogeneous document collections. Machine learning methods are increasingly shaping how historians work with digitised sources, particularly through Automatic Text Recognition (ATR). In this talk, Jonas Widmer and Dana Meyer will introduce The FLOW, a modular, microservice-based framework designed to support machine learning–driven data management and processing in the Digital Humanities.
The talk will outline how The FLOW breaks complex ATR workflows into their individual stages (pre-processing, model training, inference, and evaluation) and provides each as an independent, reusable component that can be combined flexibly and used without programming experience. Using state-of-the-art transformer-based models, the project aims to make advanced text recognition workflows more transparent, reproducible, and scalable across diverse historical datasets.
Jonas and Dana will walk through a typical FLOW workflow, showing how datasets are managed on the Hugging Face platform and processed step by step. The focus will be on how such workflows can support everyday research practice when working with large and heterogeneous historical corpora.
Speaker Biographies
Jonas Widmer is a Research Software Engineer specialising in Digital Humanities at the University of Bern. In this role, he assists in planning and developing projects focused on Natural Language Processing. His primary interest lies in Handwritten Text Recognition (HTR), where he engages with historical projects and their diverse sources.
Dana Meyer is a Master’s student in Intelligent Interactive Systems at Bielefeld University and works as a research assistant on The FLOW project in the university's Digital History group.
Bodleian Bytes
Bodleian Bytes is a series of online talks hosted by the Centre for Digital Scholarship at the Bodleian Libraries. The series engages with innovative national and international research in digital scholarship. It is a virtual space for discussions surrounding different tools and methodologies whilst also providing inspiration for future digital research.